Memory in the Age of AI Agents

Authors: Yuyang Hu, Shichun Liu, Yanwei Yue et al.

2025

TL;DR

Memory in the Age of AI Agents unifies token-level, parametric, and latent memory into a Forms–Functions–Dynamics taxonomy, organizing dozens of agent memory systems into a single conceptual landscape.



THE PROBLEM

Agent memory is fragmented, and long- or short-term labels are not enough

Memory in the Age of AI Agents observes that traditional long-/short-term taxonomies cannot capture the diversity and dynamics of modern agent memory systems.

Memory in the Age of AI Agents shows that works labeled "agent memory" differ drastically in motivations, implementations, and evaluation protocols, making it hard for practitioners to design and compare memory systems.

HOW IT WORKS

Forms–Functions–Dynamics taxonomy for agent memory

Memory in the Age of AI Agents introduces a core mechanism built on three dynamics (Memory Formation, Memory Evolution, and Memory Retrieval) and three forms (Token-level Memory, Parametric Memory, and Latent Memory).

Memory in the Age of AI Agents frames agent memory like a cognitive operating system: Memory Formation logs experiences, Memory Evolution consolidates and forgets, and Memory Retrieval acts like a card catalog for long-lived agents.

This design lets Memory in the Age of AI Agents explain capabilities that a plain context window cannot, such as cross-task experiential memory, multi-episode factual memory, and structured working-memory management.
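As a rough illustration, the Forms–Functions axes can be encoded as a pair of labels attached to each memory entry. The class names below mirror the survey's vocabulary, but the code itself is an illustrative sketch, not an API from the paper.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical encoding of the survey's two axes; names follow the
# taxonomy, but these classes are an illustration, not the paper's API.
class Form(Enum):
    TOKEN_LEVEL = "token-level"   # text or records in context / external store
    PARAMETRIC = "parametric"     # knowledge baked into model weights
    LATENT = "latent"             # hidden states, KV caches, soft prompts

class Function(Enum):
    FACTUAL = "factual"           # multi-episode facts about users and world
    EXPERIENTIAL = "experiential" # cross-task skills and lessons
    WORKING = "working"           # structured state for the current task

@dataclass
class MemoryEntry:
    content: str
    form: Form
    function: Function

entry = MemoryEntry("User prefers concise answers", Form.TOKEN_LEVEL, Function.FACTUAL)
print(entry.form.value, entry.function.value)  # → token-level factual
```

Tagging entries on both axes is what lets a system reason about, say, token-level experiential memory separately from parametric factual memory.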

DIAGRAM

Unified forms–functions taxonomy of agent memory

This diagram shows how Memory in the Age of AI Agents maps memory forms to functional roles like factual, experiential, and working memory.

DIAGRAM

Evaluation and resource landscape for agent memory

This diagram shows how Memory in the Age of AI Agents organizes benchmarks and frameworks used to study agent memory.

PROCESS

How Memory in the Age of AI Agents Handles an Agent Memory Lifecycle

  1. Memory Formation

     Memory in the Age of AI Agents uses Memory Formation to transform interaction artifacts into memory candidates via semantic summarization, knowledge distillation, and structured construction.

  2. Memory Evolution

     Memory in the Age of AI Agents applies Memory Evolution to consolidate, update, and forget entries, maintaining a coherent and efficient memory state over time.

  3. Memory Retrieval

     Memory in the Age of AI Agents defines Memory Retrieval to construct queries, choose retrieval strategies, and perform post-retrieval processing for agent policies.

  4. Positions and Frontiers

     Memory in the Age of AI Agents then explores frontiers like Automated Memory Management, Reinforcement Learning Meets Agent Memory, Multimodal Memory, and Shared Memory in Multi-Agent Systems.
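The formation–evolution–retrieval steps above can be sketched as a minimal pipeline. Every function name, the first-sentence "summary", and the word-overlap scoring are illustrative assumptions for this sketch, not the survey's method.

```python
from collections import Counter

store: list[dict] = []

def form_memory(interaction: str) -> None:
    """Formation: turn a raw interaction into a candidate entry
    (here, a trivial 'summary' that keeps only the first sentence)."""
    summary = interaction.split(".")[0]
    store.append({"text": summary, "uses": 0})

def evolve_memory(max_entries: int = 100) -> None:
    """Evolution: keep the most-used entries, forget the rest
    when the store exceeds its budget."""
    store.sort(key=lambda e: e["uses"], reverse=True)
    del store[max_entries:]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieval: score entries by word overlap with the query,
    then post-process by bumping usage counts on the hits."""
    q = Counter(query.lower().split())
    scored = sorted(
        store,
        key=lambda e: sum((q & Counter(e["text"].lower().split())).values()),
        reverse=True,
    )
    hits = scored[:k]
    for e in hits:
        e["uses"] += 1
    return [e["text"] for e in hits]

form_memory("User asked about Rust lifetimes. Long discussion followed.")
form_memory("User prefers short answers. Confirmed twice.")
evolve_memory()
print(retrieve("rust lifetimes question", k=1))  # → ['User asked about Rust lifetimes']
```

Real systems replace each toy step with the techniques the survey catalogs, e.g. LLM summarization for formation, consolidation policies for evolution, and embedding search for retrieval.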

KEY CONTRIBUTIONS

Key Contributions

  • Forms–Functions–Dynamics Taxonomy

    Memory in the Age of AI Agents proposes a Forms–Functions–Dynamics taxonomy that links Token-level Memory, Parametric Memory, and Latent Memory with the operators Memory Formation, Memory Evolution, and Memory Retrieval in one framework.

  • Interplay of Memory Forms and Functions

    Memory in the Age of AI Agents analyzes how Factual Memory, Experiential Memory, and Working Memory align with token-level, parametric, and latent forms across diverse agent tasks.

  • Positions and Frontiers

    Memory in the Age of AI Agents identifies frontiers such as Automated Memory Management, Reinforcement Learning Meets Agent Memory, Multimodal Memory, Shared Memory in Multi-Agent Systems, and Trustworthy Memory as key future directions.

RESULTS

By the Numbers

Benchmarks covered: 20+ benchmarks spanning long dialogue, lifelong agents, code, and research tasks, a broader range than earlier surveys.

Frameworks listed: 10+ open-source memory frameworks summarized, going beyond prior LLM memory work.

Memory forms: 3 core forms (Token-level, Parametric, Latent) defined under one taxonomy.

Memory functions: 3 core functions (Factual, Experiential, Working) unified across agents.

Memory in the Age of AI Agents is a survey without new benchmark scores, so the main quantitative context is the breadth of benchmarks and frameworks it consolidates. These counts show that Memory in the Age of AI Agents systematically covers the landscape rather than focusing on a single memory mechanism.


BENCHMARK

Memory taxonomy coverage across forms and functions

Relative emphasis of memory forms and functions discussed in Memory in the Age of AI Agents.

KEY INSIGHT

The Counterintuitive Finding

Memory in the Age of AI Agents argues that traditional long or short term labels are insufficient, even though these categories dominate prior memory work.

This is counterintuitive because many practitioners assume long-context models or RAG solve memory, but Memory in the Age of AI Agents shows agents still need explicit forms, functions, and dynamics.

WHY IT MATTERS

What this unlocks for the field

Memory in the Age of AI Agents gives builders a vocabulary to design Token-level, Parametric, and Latent memory aligned with Factual, Experiential, and Working roles.

With this structure, practitioners can now systematically choose memory forms, operators, and evaluation benchmarks instead of improvising ad hoc memory buffers and RAG stacks.


Related papers

RAG

A Dynamic Retrieval-Augmented Generation System with Selective Memory and Remembrance

Okan Bursa

· 2026

Adaptive RAG Memory (ARM) augments a standard retriever–generator stack with a Dynamic Embedding Layer and Remembrance Engine that track usage statistics and apply selective remembrance and decay to embeddings. On a lightweight retrieval benchmark, ARM achieves NDCG@5 ≈ 0.9401 and Recall@5 = 1.000 with 22M parameters, matching larger baselines like gte-small while providing the best efficiency among ultra-efficient models.
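The "selective remembrance and decay" idea can be illustrated with a toy weighting rule in which exponential time decay is partially offset by retrieval hits. The formula and parameter names below are assumptions for illustration, not ARM's actual mechanism.

```python
# Illustrative sketch only: usage-tracked decay in the spirit of ARM's
# "selective remembrance and decay"; the formula is an assumption.
def decayed_weight(base: float, hits: int, age_steps: int,
                   half_life: float = 50.0, boost: float = 0.1) -> float:
    """Exponential decay with age, partially offset by retrieval hits."""
    decay = 0.5 ** (age_steps / half_life)
    return base * decay * (1.0 + boost * hits)

fresh = decayed_weight(1.0, hits=0, age_steps=0)        # 1.0
stale = decayed_weight(1.0, hits=0, age_steps=100)      # 0.25
remembered = decayed_weight(1.0, hits=10, age_steps=100)  # 0.5
print(fresh, stale, remembered)
```

Under such a rule, frequently retrieved embeddings retain weight while unused ones fade, which is the intuition behind usage-statistic-driven memory maintenance.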

RAG · Long-Term Memory

HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues

Yijie Zhong, Yunfan Gao, Haofen Wang

· 2026

HingeMem combines Boundary Guided Long-Term Memory, Dialogue Boundary Extraction, Memory Construction, Query Adaptive Retrieval, Hyperedge Rerank, and Adaptive Stop to segment dialogues into element-indexed hyperedges and plan query-specific retrieval. On LOCOMO, HingeMem achieves 63.9 overall F1 and 75.1 LLM-as-a-Judge score, surpassing the best baseline Zep (56.9 F1) by 7.0 F1 without using category-specific QA formats.
