All-Mem: Agentic Lifelong Memory via Dynamic Topology Evolution

Authors: Can Lv, Heng Chang, Yuchen Guo et al.

2026

TL;DR

All-Mem applies agentic topology consolidation over a curated visible memory surface, reaching a 4o-J score of 54.63 on LoCoMo, +5.72 over Mem0.



THE PROBLEM

Lifelong agents accumulate noisy, outdated memories that pollute retrieval

As histories grow without bound, redundant, outdated, or noisy memories increasingly crowd a fixed retrieval budget and yield diffuse or misleading context.

This breaks long-horizon conversational agents that rely on flat similarity search, causing incorrect reasoning when superseded or entangled memories dominate limited context windows.

HOW IT WORKS

All-Mem — Online/Offline Decoupling with Agentic Topology Consolidation

All-Mem combines Online/Offline Decoupling, Agentic Topology Consolidation, and Topology-Aware Retrieval over a topology-structured memory bank with a curated visible surface.

You can think of All-Mem like fast RAM for the visible surface plus a slower but richer disk archive, with an LLM-based janitor reorganizing files offline.

This design lets All-Mem keep online latency low while enabling non-destructive SPLIT, MERGE, and UPDATE operations that a plain context window or append-only store cannot support.
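The two-tier design above can be sketched as a toy in-memory store. This is a minimal illustration, not the paper's implementation: the class names (`MemoryUnit`, `MemoryBank`), the `supersedes` link type, and the surface size are all assumptions chosen for clarity. The key property it demonstrates is that UPDATE is non-destructive: the old unit is archived and linked, never deleted.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryUnit:
    uid: int
    text: str
    version: int = 1
    archived: bool = False
    links: dict = field(default_factory=dict)  # typed links, e.g. {"supersedes": [uid]}

class MemoryBank:
    """Toy two-tier store: a small visible surface plus a full archive."""
    def __init__(self, surface_size: int = 3):
        self.units = {}                # uid -> MemoryUnit (nothing is ever deleted)
        self.surface = []              # curated, low-latency visible layer
        self.surface_size = surface_size
        self._next = 0

    def write(self, text: str) -> int:
        """Online path: cheap append plus surface linking, no edits."""
        uid = self._next
        self._next += 1
        self.units[uid] = MemoryUnit(uid, text)
        self.surface.append(uid)
        self.surface = self.surface[-self.surface_size:]  # keep the surface small
        return uid

    def update(self, old_uid: int, new_text: str) -> int:
        """Offline path: non-destructive UPDATE. Archive the old unit,
        write a successor, and keep a typed link back for traceability."""
        old = self.units[old_uid]
        old.archived = True
        if old_uid in self.surface:
            self.surface.remove(old_uid)   # curate the visible surface
        new_uid = self.write(new_text)
        new = self.units[new_uid]
        new.version = old.version + 1
        new.links.setdefault("supersedes", []).append(old_uid)
        return new_uid

bank = MemoryBank()
u0 = bank.write("user lives in Paris")
u1 = bank.update(u0, "user moved to Berlin")
assert bank.units[u0].archived and u0 in bank.units[u1].links["supersedes"]
```

The outdated fact leaves the visible surface but stays recoverable through its `supersedes` link, which is what a plain context window or append-only store cannot offer.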

DIAGRAM

Topology-Aware Retrieval Pipeline at Query Time

This diagram shows how All-Mem performs Stage 1 anchoring, Stage 2 budgeted typed-link expansion, and Stage 3 final selection during topology-aware retrieval.
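The three retrieval stages can be sketched as a single function. This is a hedged toy under assumed data shapes (units as embedding vectors, `links` as a uid-to-typed-neighbors dict, a dot-product scorer); the real scorer, anchor count, and budgets are the paper's, not shown here.

```python
import heapq

def retrieve(query_vec, units, surface, links, hop_budget=2, k=5):
    """Toy three-stage retrieval: anchor on the visible surface,
    expand along typed links within a hop budget, then pick top-k."""
    def score(uid):
        return sum(a * b for a, b in zip(query_vec, units[uid]))  # dot product

    # Stage 1: anchoring -- best-scoring units on the visible surface only
    anchors = heapq.nlargest(2, surface, key=score)

    # Stage 2: hop-bounded typed-link expansion from the anchors
    frontier, seen = set(anchors), set(anchors)
    for _ in range(hop_budget):
        frontier = {nbr for uid in frontier
                    for nbrs in links.get(uid, {}).values()
                    for nbr in nbrs} - seen
        seen |= frontier

    # Stage 3: budgeted final selection over everything reached
    return heapq.nlargest(k, seen, key=score)

units = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
surface = [0, 1]                      # unit 2 is archived, off-surface
links = {0: {"supersedes": [2]}}      # but reachable via a typed link
hits = retrieve([1.0, 0.0], units, surface, links)
assert 2 in hits                      # archived evidence is recovered
```

The point of the sketch: similarity search alone would never surface unit 2, but one hop along a typed link recovers it while the hop budget keeps expansion bounded.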

DIAGRAM

Online/Offline Decoupling and Agentic Topology Consolidation

This diagram shows how All-Mem separates low-latency online writing from periodic offline Agentic Topology Consolidation using SPLIT, MERGE, and UPDATE.

PROCESS

How All-Mem Handles a Long-Horizon Interaction Session

  1. Online Phase

    In the Online Phase, All-Mem performs lightweight Unit Writing, Surface Linking to the visible surface, and buffering of unit IDs for later consolidation.

  2. Agentic Topology Consolidation

    During Agentic Topology Consolidation, All-Mem runs parallel diagnosis over buffered contexts, applies confidence gating, and routes targets into SPLIT, MERGE, and UPDATE queues.

  3. Topology Editing

    In Topology Editing, All-Mem executes SPLIT, MERGE, and UPDATE in a fixed order, archives superseded units, rewires typed links, and preserves versioned traceability to immutable evidence.

  4. Topology-Aware Retrieval

    During Topology-Aware Retrieval, All-Mem anchors on the visible surface, performs hop-bounded typed-link expansion, and selects a budgeted evidence set for the agent.
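Steps 2 and 3 above (diagnose, gate on confidence, queue, execute in fixed order) can be sketched as a small planner. Everything here is illustrative: the `diagnose` callable, the 0.8 threshold, and the unit IDs are assumptions; only the SPLIT-then-MERGE-then-UPDATE ordering and the confidence gate come from the description above.

```python
from collections import defaultdict

def consolidate(buffered, diagnose, threshold=0.8):
    """Toy consolidation pass: diagnose each buffered unit, gate on
    confidence, route into per-operation queues, run in fixed order."""
    queues = defaultdict(list)
    for uid in buffered:
        op, conf = diagnose(uid)        # e.g. ("MERGE", 0.92)
        if op and conf >= threshold:    # confidence gating drops shaky edits
            queues[op].append(uid)
    plan = []
    for op in ("SPLIT", "MERGE", "UPDATE"):  # fixed execution order
        plan.extend((op, uid) for uid in queues[op])
    return plan

diagnoses = {"m1": ("MERGE", 0.92), "m2": ("SPLIT", 0.95), "m3": ("UPDATE", 0.40)}
plan = consolidate(list(diagnoses), lambda uid: diagnoses[uid])
assert plan == [("SPLIT", "m2"), ("MERGE", "m1")]  # m3 gated out at 0.40
```

The low-confidence diagnosis for `m3` is simply dropped rather than applied, which keeps offline editing conservative.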

KEY CONTRIBUTIONS

Key Contributions

  • Agentic Topology Consolidation

    All-Mem introduces Agentic Topology Consolidation that uses SPLIT, MERGE, and UPDATE to non-destructively reorganize the topology-structured memory bank while preserving immutable evidence.

  • Non-Destructive Topology Editing

    All-Mem defines SPLIT, MERGE, and UPDATE operators with versioned traceability, ensuring every archived unit remains reachable within a small hop budget from the visible surface.

  • Topology-Aware Retrieval

    All-Mem proposes Topology-Aware Retrieval that anchors on a visible surface and uses hop-bounded expansion, achieving 94.68 R@5 on LongMemEval-S under explicit retrieval budgets.
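The reachability guarantee in the second contribution can be checked with a plain breadth-first search from the visible surface. This is a generic BFS sketch, not the paper's code; the link structure and hop budgets are assumed for illustration.

```python
from collections import deque

def reachable_within(surface, links, hop_budget):
    """BFS from the visible surface: the set of units reachable
    in at most hop_budget hops along typed links."""
    seen = set(surface)
    frontier = deque((uid, 0) for uid in surface)
    while frontier:
        uid, depth = frontier.popleft()
        if depth == hop_budget:
            continue  # hop budget exhausted on this path
        for nbrs in links.get(uid, {}).values():
            for nbr in nbrs:
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, depth + 1))
    return seen

# A two-deep chain of archived versions behind one surface unit:
links = {0: {"supersedes": [1]}, 1: {"derived_from": [2]}}
assert reachable_within([0], links, 1) == {0, 1}
assert reachable_within([0], links, 2) == {0, 1, 2}
```

A check like this is how one would verify that non-destructive editing never strands an archived unit outside the hop budget.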

RESULTS

By the Numbers

  • 4o-J: 54.63 (+5.72 over Mem0 on LoCoMo)
  • F1: 52.18 (+9.10 over Mem0 on LoCoMo)
  • R@5: 46.63 (+7.89 over Mem0 on LoCoMo)
  • 4o-J: 60.20 (+4.40 over Mem0 on LongMemEval-S)

On LoCoMo and LongMemEval-S, which test long-horizon conversational memory and retrieval, All-Mem consistently achieves higher 4o-J, F1, and R@5 than Mem0 and other baselines. These gains show that All-Mem’s non-destructive consolidation and topology-aware retrieval improve evidence selection under fixed context budgets.


BENCHMARK

Main Results on LoCoMo (4o-J)

4o-J on LoCoMo for All-Mem and representative memory baselines.

KEY INSIGHT

The Counterintuitive Finding

All-Mem injects only 918 tokens per query at answer time on LoCoMo, roughly half of Mem0's 1764, yet achieves higher 4o-J and F1.

This is surprising because many assume feeding more context always helps, but All-Mem shows that curated, consolidated memories beat larger, noisier prompts.

WHY IT MATTERS

What this unlocks for the field

All-Mem unlocks lifelong agents that maintain clean, recoverable memories over hundreds of turns without sacrificing latency or losing provenance.

Builders can now design agents that safely reorganize memory offline, keep online prompts small, and still recover archived evidence through explicit versioned links.


Related papers

Long-Term Memory

Adaptive Memory Admission Control for LLM Agents

Guilin Zhang, Wei Jiang et al.

· 2026

A-MAC scores candidate memories using Utility, Confidence, Novelty, Recency, and Type Prior combined by a learned linear admission policy with Algorithm 1 A-MAC Memory Admission. On the LoCoMo benchmark, A-MAC achieves F1 0.583 and 2644 ms latency, improving F1 by 0.042 and reducing latency by 1187 ms compared to A-mem.

Agent Memory · Long-Term Memory

Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

Yi Yu, Liuyi Yao et al.

arXiv · 2026

Agentic Memory (AgeMem) exposes memory management tools, a three-stage progressive RL strategy, and step-wise GRPO directly inside the agent policy to jointly control long-term and short-term memory. On Qwen3-4B-Instruct, AgeMem attains 54.31% average performance across ALFWorld, SciWorld, PDDL, BabyAI, and HotpotQA, exceeding the best baseline A-Mem at 45.74%.

Long-Term Memory

AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment

Jianfei Xiao, Xiang Yu et al.

· 2026

AlpsBench combines Personalized Information Extraction, Personalized Information Update, Personalized Information Retrieval, and Personalized Information Utilization over 2,500 WildChat dialogues with human-verified structured memories. AlpsBench shows, for example, that Gemini-3 Flash scores 51.67 on Task 1 Extraction while DeepSeek Reasoner reaches 0.9569 retrieval recall with 100 distractors on AlpsBench.
