All-Mem: Agentic Lifelong Memory via Dynamic Topology Evolution

Authors: Can Lv, Heng Chang, Yuchen Guo et al.

2026

TL;DR

All-Mem applies agentic topology consolidation over a curated visible memory surface, reaching a 4o-J score of 54.63 on LoCoMo, +5.72 over Mem0.



THE PROBLEM

Lifelong agents accumulate noisy, outdated memories that pollute retrieval

As histories grow without bound, redundant, outdated, or noisy memories increasingly crowd a fixed retrieval budget and yield diffuse or misleading context.

This breaks long-horizon conversational agents that rely on flat similarity search, causing incorrect reasoning when superseded or entangled memories dominate limited context windows.

HOW IT WORKS

All-Mem — Online/Offline Decoupling with Agentic Topology Consolidation

All-Mem combines Online/Offline Decoupling, Agentic Topology Consolidation, and Topology-Aware Retrieval over a topology-structured memory bank with a curated visible surface.

You can think of All-Mem like fast RAM for the visible surface plus a slower but richer disk archive, with an LLM-based janitor reorganizing files offline.

This design lets All-Mem keep online latency low while enabling non-destructive SPLIT, MERGE, and UPDATE operations that a plain context window or append-only store cannot support.
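The two-tier design above can be sketched as a toy in-memory store. This is a minimal illustration, not the paper's implementation: the class names (`MemoryUnit`, `MemoryBank`), the `supersedes` link type, and the surface size are all assumptions chosen for clarity. The key property it demonstrates is that UPDATE is non-destructive: the old unit is archived and linked, never deleted.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryUnit:
    uid: int
    text: str
    version: int = 1
    archived: bool = False
    links: dict = field(default_factory=dict)  # typed links, e.g. {"supersedes": [uid]}

class MemoryBank:
    """Toy two-tier store: a small visible surface plus a full archive."""
    def __init__(self, surface_size: int = 3):
        self.units = {}                # uid -> MemoryUnit (nothing is ever deleted)
        self.surface = []              # curated, low-latency visible layer
        self.surface_size = surface_size
        self._next = 0

    def write(self, text: str) -> int:
        """Online path: cheap append plus surface linking, no edits."""
        uid = self._next
        self._next += 1
        self.units[uid] = MemoryUnit(uid, text)
        self.surface.append(uid)
        self.surface = self.surface[-self.surface_size:]  # keep the surface small
        return uid

    def update(self, old_uid: int, new_text: str) -> int:
        """Offline path: non-destructive UPDATE. Archive the old unit,
        write a successor, and keep a typed link back for traceability."""
        old = self.units[old_uid]
        old.archived = True
        if old_uid in self.surface:
            self.surface.remove(old_uid)   # curate the visible surface
        new_uid = self.write(new_text)
        new = self.units[new_uid]
        new.version = old.version + 1
        new.links.setdefault("supersedes", []).append(old_uid)
        return new_uid

bank = MemoryBank()
u0 = bank.write("user lives in Paris")
u1 = bank.update(u0, "user moved to Berlin")
assert bank.units[u0].archived and u0 in bank.units[u1].links["supersedes"]
```

The outdated fact leaves the visible surface but stays recoverable through its `supersedes` link, which is what a plain context window or append-only store cannot offer.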

DIAGRAM

Topology-Aware Retrieval Pipeline at Query Time

This diagram shows how All-Mem performs Stage 1 anchoring, Stage 2 budgeted typed-link expansion, and Stage 3 final selection during topology-aware retrieval.
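The three retrieval stages can be sketched as a single function. This is a hedged toy under assumed data shapes (units as embedding vectors, `links` as a uid-to-typed-neighbors dict, a dot-product scorer); the real scorer, anchor count, and budgets are the paper's, not shown here.

```python
import heapq

def retrieve(query_vec, units, surface, links, hop_budget=2, k=5):
    """Toy three-stage retrieval: anchor on the visible surface,
    expand along typed links within a hop budget, then pick top-k."""
    def score(uid):
        return sum(a * b for a, b in zip(query_vec, units[uid]))  # dot product

    # Stage 1: anchoring -- best-scoring units on the visible surface only
    anchors = heapq.nlargest(2, surface, key=score)

    # Stage 2: hop-bounded typed-link expansion from the anchors
    frontier, seen = set(anchors), set(anchors)
    for _ in range(hop_budget):
        frontier = {nbr for uid in frontier
                    for nbrs in links.get(uid, {}).values()
                    for nbr in nbrs} - seen
        seen |= frontier

    # Stage 3: budgeted final selection over everything reached
    return heapq.nlargest(k, seen, key=score)

units = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
surface = [0, 1]                      # unit 2 is archived, off-surface
links = {0: {"supersedes": [2]}}      # but reachable via a typed link
hits = retrieve([1.0, 0.0], units, surface, links)
assert 2 in hits                      # archived evidence is recovered
```

The point of the sketch: similarity search alone would never surface unit 2, but one hop along a typed link recovers it while the hop budget keeps expansion bounded.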

DIAGRAM

Online/Offline Decoupling and Agentic Topology Consolidation

This diagram shows how All-Mem separates low-latency online writing from periodic offline Agentic Topology Consolidation using SPLIT, MERGE, and UPDATE.

PROCESS

How All-Mem Handles a Long-Horizon Interaction Session

  1. Online Phase

    In the Online Phase, All-Mem performs lightweight Unit Writing, Surface Linking to the visible surface, and buffering of unit IDs for later consolidation.

  2. Agentic Topology Consolidation

    During Agentic Topology Consolidation, All-Mem runs parallel diagnosis over buffered contexts, applies confidence gating, and routes targets into SPLIT, MERGE, and UPDATE queues.

  3. Topology Editing

    In Topology Editing, All-Mem executes SPLIT, MERGE, and UPDATE in a fixed order, archives superseded units, rewires typed links, and preserves versioned traceability to immutable evidence.

  4. Topology-Aware Retrieval

    During Topology-Aware Retrieval, All-Mem anchors on the visible surface, performs hop-bounded typed-link expansion, and selects a budgeted evidence set for the agent.
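Steps 2 and 3 above (diagnose, gate on confidence, queue, execute in fixed order) can be sketched as a small planner. Everything here is illustrative: the `diagnose` callable, the 0.8 threshold, and the unit IDs are assumptions; only the SPLIT-then-MERGE-then-UPDATE ordering and the confidence gate come from the description above.

```python
from collections import defaultdict

def consolidate(buffered, diagnose, threshold=0.8):
    """Toy consolidation pass: diagnose each buffered unit, gate on
    confidence, route into per-operation queues, run in fixed order."""
    queues = defaultdict(list)
    for uid in buffered:
        op, conf = diagnose(uid)        # e.g. ("MERGE", 0.92)
        if op and conf >= threshold:    # confidence gating drops shaky edits
            queues[op].append(uid)
    plan = []
    for op in ("SPLIT", "MERGE", "UPDATE"):  # fixed execution order
        plan.extend((op, uid) for uid in queues[op])
    return plan

diagnoses = {"m1": ("MERGE", 0.92), "m2": ("SPLIT", 0.95), "m3": ("UPDATE", 0.40)}
plan = consolidate(list(diagnoses), lambda uid: diagnoses[uid])
assert plan == [("SPLIT", "m2"), ("MERGE", "m1")]  # m3 gated out at 0.40
```

The low-confidence diagnosis for `m3` is simply dropped rather than applied, which keeps offline editing conservative.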

KEY CONTRIBUTIONS

Key Contributions

  • Agentic Topology Consolidation

    All-Mem introduces Agentic Topology Consolidation that uses SPLIT, MERGE, and UPDATE to non-destructively reorganize the topology-structured memory bank while preserving immutable evidence.

  • Non-Destructive Topology Editing

    All-Mem defines SPLIT, MERGE, and UPDATE operators with versioned traceability, ensuring every archived unit remains reachable within a small hop budget from the visible surface.

  • Topology-Aware Retrieval

    All-Mem proposes Topology-Aware Retrieval that anchors on a visible surface and uses hop-bounded expansion, achieving 94.68 R@5 on LongMemEval-S under explicit retrieval budgets.
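The reachability guarantee in the second contribution can be checked with a plain breadth-first search from the visible surface. This is a generic BFS sketch, not the paper's code; the link structure and hop budgets are assumed for illustration.

```python
from collections import deque

def reachable_within(surface, links, hop_budget):
    """BFS from the visible surface: the set of units reachable
    in at most hop_budget hops along typed links."""
    seen = set(surface)
    frontier = deque((uid, 0) for uid in surface)
    while frontier:
        uid, depth = frontier.popleft()
        if depth == hop_budget:
            continue  # hop budget exhausted on this path
        for nbrs in links.get(uid, {}).values():
            for nbr in nbrs:
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, depth + 1))
    return seen

# A two-deep chain of archived versions behind one surface unit:
links = {0: {"supersedes": [1]}, 1: {"derived_from": [2]}}
assert reachable_within([0], links, 1) == {0, 1}
assert reachable_within([0], links, 2) == {0, 1, 2}
```

A check like this is how one would verify that non-destructive editing never strands an archived unit outside the hop budget.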

RESULTS

By the Numbers

  • 4o-J: 54.63 (+5.72 over Mem0 on LoCoMo)
  • F1: 52.18 (+9.10 over Mem0 on LoCoMo)
  • R@5: 46.63 (+7.89 over Mem0 on LoCoMo)
  • 4o-J: 60.20 (+4.40 over Mem0 on LongMemEval-S)

On LoCoMo and LongMemEval-S, which test long-horizon conversational memory and retrieval, All-Mem consistently achieves higher 4o-J, F1, and R@5 than Mem0 and other baselines. These gains show that All-Mem’s non-destructive consolidation and topology-aware retrieval improve evidence selection under fixed context budgets.


BENCHMARK

Main Results on LoCoMo (4o-J)

4o-J on LoCoMo for All-Mem and representative memory baselines.

KEY INSIGHT

The Counterintuitive Finding

All-Mem injects only 918 tokens per query at answer time on LoCoMo, roughly half of Mem0's 1764, yet achieves higher 4o-J and F1.

This is surprising because many assume feeding more context always helps, but All-Mem shows that curated, consolidated memories beat larger, noisier prompts.

WHY IT MATTERS

What this unlocks for the field

All-Mem unlocks lifelong agents that maintain clean, recoverable memories over hundreds of turns without sacrificing latency or losing provenance.

Builders can now design agents that safely reorganize memory offline, keep online prompts small, and still recover archived evidence through explicit versioned links.


Related papers

Long-Term Memory

Adaptive Memory Admission Control for LLM Agents

Guilin Zhang, Wei Jiang et al.

· 2026

A-MAC scores candidate memories using Utility, Confidence, Novelty, Recency, and Type Prior combined by a learned linear admission policy with Algorithm 1 A-MAC Memory Admission. On the LoCoMo benchmark, A-MAC achieves F1 0.583 and 2644 ms latency, improving F1 by 0.042 and reducing latency by 1187 ms compared to A-mem.

Agent Memory · Long-Term Memory

Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

Yi Yu, Liuyi Yao et al.

arXiv · 2026

Agentic Memory (AgeMem) exposes memory management tools, a three-stage progressive RL strategy, and step-wise GRPO directly inside the agent policy to jointly control long-term and short-term memory. On Qwen3-4B-Instruct, AgeMem attains 54.31% average performance across ALFWorld, SciWorld, PDDL, BabyAI, and HotpotQA, exceeding the best baseline A-Mem at 45.74%.

Long-Term Memory

AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment

Jianfei Xiao, Xiang Yu et al.

· 2026

AlpsBench combines Personalized Information Extraction, Personalized Information Update, Personalized Information Retrieval, and Personalized Information Utilization over 2,500 WildChat dialogues with human-verified structured memories. AlpsBench shows, for example, that Gemini-3 Flash scores 51.67 on Task 1 Extraction while DeepSeek Reasoner reaches 0.9569 retrieval recall with 100 distractors on AlpsBench.
