Category

Memory Architecture

Architectures and systems for organizing, storing, and accessing AI memory.

7 papers

Memory Architecture · Survey

Multi-Agent Memory from a Computer Architecture Perspective: Visions and Challenges Ahead

Zhongming Yu, Naicheng Yu et al.

arXiv · 2026

Multi-Agent Memory Architecture organizes an **Agent IO Layer**, **Agent Cache Layer**, and **Agent Memory Layer**, plus **Agent Cache Sharing** and **Agent Memory Access** protocols, into a unified architectural framing for multi-agent systems. As a position paper, it reports no benchmark results or numeric comparisons against baselines.

RAG · Memory Architecture · Long-Term Memory

From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

Bernal Jiménez Gutiérrez, Yiheng Shu et al.

ICML · 2025

HippoRAG 2 combines **Offline Indexing**, a schema-less **Knowledge Graph**, **Dense-Sparse Integration**, **Deeper Contextualization**, and **Recognition Memory** into a neuro-inspired non-parametric memory system for LLMs. On the joint RAG benchmark suite, HippoRAG 2 achieves 59.8 average F1 versus 57.0 for NV-Embed-v2, including 71.0 F1 on 2Wiki compared to 61.5 for NV-Embed-v2.
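As a rough illustration of the dense-sparse integration idea, the sketch below (not the authors' code; the toy graph, scores, and damping factor are all illustrative assumptions) propagates dense retrieval scores over a schema-less knowledge graph so that graph neighbours of strong dense hits also get surfaced:

```python
# Hedged sketch: dense retrieval scores spread over a knowledge graph,
# loosely in the spirit of HippoRAG 2's personalized-PageRank-style
# retrieval. The graph, scores, and damping factor are toy assumptions.

def propagate_scores(dense_scores, graph, damping=0.5, iters=2):
    """Each node keeps `damping` of its dense score and spreads the
    rest of its current mass evenly to its graph neighbours."""
    scores = dict(dense_scores)
    for _ in range(iters):
        nxt = {n: damping * dense_scores.get(n, 0.0) for n in scores}
        for node, neighbours in graph.items():
            if not neighbours:
                continue
            share = (1 - damping) * scores[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] = nxt.get(nb, 0.0) + share
        scores = nxt
    return scores

# Toy schema-less KG over passage entities.
graph = {"einstein": ["physics"],
         "physics": ["einstein", "relativity"],
         "relativity": ["physics"]}
dense = {"einstein": 0.9, "physics": 0.1, "relativity": 0.0}
ranked = sorted(propagate_scores(dense, graph).items(), key=lambda kv: -kv[1])
```

Note how `relativity`, which the dense retriever scored at zero, receives a nonzero score purely through its graph connection, which is the multi-hop effect the 2Wiki numbers above reflect.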

Agent Memory · Memory Architecture

General Agentic Memory Via Deep Research

B.Y. Yan, Chaofan Li et al.

arXiv · 2025

General Agentic Memory (GAM) pairs a **Memorizer** with a **Researcher**, backed by a **page-store** and a lightweight **memory**, to retain full trajectories while constructing compact guidance for deep research. On RULER 128K retrieval, GAM achieves 97.70% accuracy versus 94.25% for RAG with GPT-4o-mini, while also reaching 64.07 F1 on HotpotQA-56K.
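The memorize/research split can be sketched as follows (class and method names like `PageStore` are hypothetical stand-ins, not GAM's API; real GAM uses LLM-generated summaries rather than truncation):

```python
# Hedged sketch of the GAM idea: full trajectories are kept verbatim
# in a page store, while only lightweight summaries live in memory;
# research matches summaries first, then fetches the full pages.

class PageStore:
    def __init__(self):
        self.pages = {}    # page_id -> full trajectory text (kept whole)
        self.memory = {}   # page_id -> lightweight summary

    def memorize(self, page_id, trajectory, summarize=lambda t: t[:40]):
        self.pages[page_id] = trajectory           # never discard detail
        self.memory[page_id] = summarize(trajectory)

    def research(self, query):
        """Match the query against summaries, then return full pages."""
        hits = [pid for pid, s in self.memory.items()
                if query.lower() in s.lower()]
        return [self.pages[pid] for pid in hits]

store = PageStore()
store.memorize("t1", "Searched arXiv for memory papers; found HippoRAG 2.")
store.memorize("t2", "Compiled benchmark results for RULER 128K.")
```

The design point is that summarization is lossy guidance only: because the page-store keeps everything, a deep-research step can always recover the exact trajectory.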

Agent Memory · Long-Term Memory · Memory Architecture

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

Prateek Chhikara, Dev Khant et al.

arXiv · 2025

Mem0 incrementally processes conversations using the **extraction phase**, **update phase**, **asynchronous summary generation module**, **tool call mechanism**, and a **vector database** to build scalable long-term memory. On the LOCOMO benchmark, Mem0 attains a J score of 67.13 on single-hop questions versus 63.79 for OpenAI and cuts p95 latency from 17.117s to 1.440s compared to the full-context baseline.
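The extract-then-update write path can be illustrated with a minimal sketch (the string-splitting extractor and dict-based store below are stand-ins; real Mem0 uses an LLM extractor and a vector database):

```python
# Hedged sketch of Mem0-style incremental memory writes: an extraction
# phase pulls candidate facts from a turn, then an update phase decides
# ADD / UPDATE / NOOP against the existing memory store.

def extract_facts(turn):
    # Stand-in extractor: treat "key: value" fragments as facts.
    return dict(p.split(": ") for p in turn.split("; ") if ": " in p)

def update_memory(memory, facts):
    """Update phase: ADD new keys, UPDATE changed values, skip duplicates."""
    ops = []
    for key, value in facts.items():
        if key not in memory:
            memory[key] = value
            ops.append(("ADD", key))
        elif memory[key] != value:
            memory[key] = value
            ops.append(("UPDATE", key))
        else:
            ops.append(("NOOP", key))
    return ops

memory = {}
update_memory(memory, extract_facts("name: Ada; city: London"))
ops = update_memory(memory, extract_facts("city: Paris"))
```

Because each turn touches only the few extracted facts rather than the whole history, this style of write path is what lets the system avoid full-context reprocessing and its attendant latency.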

Survey · Memory Architecture

Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures

Parsa Omidi, Xingshuai Huang et al.

arXiv · 2025

This survey organizes memory-augmented Transformers along three axes: **functional objectives**, **memory types**, and **integration techniques**, grounded in biological memory systems (sensory, working, and long-term). It synthesizes dozens of architectures to highlight emerging mechanisms, such as hierarchical buffering and surprise-gated updates, that move beyond static KV caches.

Memory Architecture · Agent Memory

MemOS: A Memory OS for AI System

Zhiyu Li, Chenyang Xi et al.

arXiv · 2025

MemOS organizes memory via **MemReader**, **MemScheduler**, **MemLifecycle**, **MemOperator**, and **MemGovernance**, all operating over MemCube units that unify plaintext, activation, and parameter memories under OS-style control. On PreFEval, PersonaMem, LongMemEval, and LoCoMo, MemOS-1031 ranks first across all metrics compared to MIRIX, Mem0, Zep, Memobase, MemU, and Supermemory.
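A toy rendering of the OS-style framing (field and class names below are assumptions for illustration, not the MemOS API) treats each memory unit as a schedulable record, with one scheduler serving plaintext, activation, and parameter memories uniformly:

```python
# Hedged sketch: a minimal "MemCube" record plus a priority scheduler,
# loosely inspired by MemOS's OS-style control over heterogeneous
# memory types. Names and fields are illustrative assumptions.
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class MemCube:
    priority: int                           # lower value = more urgent
    kind: str = field(compare=False)        # "plaintext" | "activation" | "parameter"
    payload: str = field(compare=False)

class MemScheduler:
    """Serve the most urgent memory unit first, regardless of its kind."""
    def __init__(self):
        self._queue = []

    def submit(self, cube):
        heapq.heappush(self._queue, cube)

    def next_cube(self):
        return heapq.heappop(self._queue)

sched = MemScheduler()
sched.submit(MemCube(2, "plaintext", "user prefers dark mode"))
sched.submit(MemCube(1, "activation", "cached KV block #17"))
cube = sched.next_cube()
```

The point of the unification is visible even at this scale: the scheduler's queue discipline is independent of whether a unit holds text, activations, or parameters.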

Memory Architecture

Titans: Learning to Memorize at Test Time

Ali Behrouz, Peilin Zhong, Vahab Mirrokni

arXiv · 2025

Titans combines a **Neural Memory Module**, **Core** short-term attention, and **Persistent Memory** into three variants (Memory as a Context, Memory as a Gate, Memory as a Layer) that learn to memorize at test time. On LAMBADA, Titans (MAC) reaches 39.62% accuracy at 760M parameters, versus 37.06% for DeltaNet and 39.72% for Samba, while also improving long-context needle-in-a-haystack (NIAH) performance.
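The test-time memorization idea can be sketched numerically with a scalar stand-in for the neural memory module (the scalar setup, learning rate, and step count are illustrative assumptions; Titans uses a deep network updated by a surprise-gated rule):

```python
# Hedged sketch: a linear memory is updated by gradient descent on an
# associative key -> value loss at inference time, in the spirit of
# Titans' "learning to memorize at test time".

def memorize(weight, key, value, lr=0.1, steps=50):
    """Gradient steps on the squared prediction error (the 'surprise')."""
    for _ in range(steps):
        surprise = weight * key - value      # prediction error
        weight -= lr * 2 * surprise * key    # gradient of (w*k - v)^2
    return weight

w = 0.0
w = memorize(w, key=1.0, value=3.0)   # store one association at test time
recall = w * 1.0                      # query the memory with the same key
```

After the update loop the memory reproduces the stored value when probed with its key; large surprise (prediction error) drives large updates, which is the intuition behind surprise-based memorization.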