DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA

Large language model (LLM) agents still struggle with long-term memory question answering, where answer-supporting evidence is often scattered across long conversational histories and buried in substantial irrelevant content. Existing memory systems typically process memory before future queries are...

Source

http://arxiv.org/abs/2605.22411v1