Adaptive Memory Admission Control for LLM Agents

Authors: Guilin Zhang, Wei Jiang, Xiejiashan Wang et al.

2026

TL;DR

A-MAC uses five interpretable admission signals with a learned linear policy to reach F1 0.583 on LoCoMo while cutting latency 31% vs A-mem.

THE PROBLEM

LLM agents lack controllable long-term memory admission

LLM-based agents either accumulate large volumes of conversational content, including hallucinated or obsolete facts, or depend on opaque, fully LLM-driven memory policies.

This makes long-term memory a weakly controlled component, so agents like MemGPT and A-mem risk memory bloat, hallucination propagation, and degraded retrieval latency over time.

HOW IT WORKS

Adaptive Memory Admission Control (A-MAC)

A-MAC treats memory admission as a decision problem, combining Utility, Confidence, Novelty, Recency, and Type Prior into a learned linear score, applied by Algorithm 1 (A-MAC Memory Admission).

Think of A-MAC like an OS page scheduler plus a librarian: it checks usefulness, trustworthiness, redundancy, freshness, and category before shelving any memory.

This explicit admission control lets A-MAC keep long-term memory compact, reliable, and auditable in ways a plain context window or fully LLM-native policy cannot.
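In code, the admission rule above amounts to a weighted sum of the five signals compared against a threshold. A minimal sketch, where feature values, weights, and the threshold are illustrative placeholders rather than the paper's learned parameters:

```python
# Illustrative linear admission score over the five A-MAC signals.
# Weight and threshold values below are placeholders, not learned values.

FEATURES = ["utility", "confidence", "novelty", "recency", "type_prior"]

def admission_score(m: dict, weights: dict) -> float:
    """S(m) = sum_i w_i * f_i(m); each feature assumed normalized to [0, 1]."""
    return sum(weights[f] * m[f] for f in FEATURES)

def should_admit(m: dict, weights: dict, threshold: float) -> bool:
    """Admit the candidate memory when its score clears the threshold."""
    return admission_score(m, weights) >= threshold

# Hypothetical candidate memory with made-up feature values.
candidate = {"utility": 0.8, "confidence": 0.9, "novelty": 0.6,
             "recency": 1.0, "type_prior": 0.7}
weights = {f: 0.2 for f in FEATURES}  # uniform placeholder weights

print(should_admit(candidate, weights, threshold=0.5))  # score 0.8 -> True
```

Because the score is a plain weighted sum, every admit/reject decision can be decomposed back into per-signal contributions, which is what makes the policy auditable.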

DIAGRAM

End-to-end admission flow for a single memory candidate

This diagram shows how A-MAC processes a candidate memory from extraction through feature computation to admit, update, or reject decisions.

DIAGRAM

LoCoMo evaluation and ablation pipeline

This diagram shows how A-MAC is trained and evaluated on LoCoMo with cross-validated weight learning and feature ablations.

PROCESS

How A-MAC Handles a Multi-Turn Conversation Session

  1. Memory admission as a decision problem

    A-MAC takes conversation history H and existing memory store M, extracts atomic candidate memories m, and frames admission as scoring with S(m).

  2. Interpretable memory value signals

    A-MAC computes Utility U with a single LLM call and derives Confidence C, Novelty N, Recency R, and Type Prior T using rule based features.

  3. Policy learning and admission rule

A-MAC learns weights ω and threshold θ via 5-fold cross-validation and grid search to maximize F1 on labeled admission decisions.

  4. Algorithm 1: A-MAC Memory Admission

    A-MAC applies Algorithm 1 to aggregate features, compare S(m) to θ, detect conflicts in M, and either admit, merge, or reject each candidate.
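The steps above can be sketched as one admission routine. This is a toy rendering of Algorithm 1's control flow; the first-word conflict check is a deliberately naive stand-in for the paper's conflict detection over the memory store M:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Toy memory store of plain-text facts."""
    items: list = field(default_factory=list)

    def find_conflict(self, candidate: str):
        # Placeholder conflict check: real detection would compare semantics,
        # e.g. contradictory facts about the same entity. Here we just match
        # on the leading token.
        for i, existing in enumerate(self.items):
            if existing.split()[0] == candidate.split()[0]:
                return i
        return None

def admit(store: MemoryStore, candidate: str, score: float, threshold: float) -> str:
    """Admission rule: reject below threshold; merge on conflict; else admit."""
    if score < threshold:
        return "reject"
    idx = store.find_conflict(candidate)
    if idx is not None:
        store.items[idx] = candidate  # merge = keep the newer fact
        return "merge"
    store.items.append(candidate)
    return "admit"

store = MemoryStore()
print(admit(store, "Alice likes tea", 0.9, 0.5))     # admit
print(admit(store, "Bob plays chess", 0.3, 0.5))     # reject
print(admit(store, "Alice likes coffee", 0.8, 0.5))  # merge (updates the Alice fact)
```

Note the ordering: the threshold gate runs before conflict detection, so low-value candidates are rejected without ever touching the store.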

KEY CONTRIBUTIONS

Key Contributions

  • Adaptive Memory Admission Control (A-MAC)

A-MAC formalizes memory admission as a structured decision problem with a linear score over Utility, Confidence, Novelty, Recency, and Type Prior, implemented in Algorithm 1 (A-MAC Memory Admission).

  • Interpretable five-factor value signals

    A-MAC decomposes memory value into Utility U, Confidence C, Novelty N, Recency R, and Type Prior T, enabling explicit control over reliability, redundancy, and persistence.

  • Efficient hybrid design and empirical gains

    A-MAC reaches F1 0.583 with 2644 ms latency on LoCoMo, improving F1 by 0.042 and cutting latency by 31% compared to A-mem while identifying Type Prior as the dominant feature.
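The weight learning behind these numbers can be sketched as a grid search maximizing F1 over labeled admission decisions. The paper uses 5-fold cross-validation; this illustration runs on a single synthetic split with a small placeholder grid, so all values here are assumptions for demonstration:

```python
import itertools
import numpy as np

def f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """F1 from binary labels and predictions."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    p, r = tp / (tp + fp), tp / (tp + fn)
    return 2 * p * r / (p + r)

def grid_search(X, y, weight_grid, threshold_grid):
    """Pick (w, theta) maximizing F1 on one labeled split.
    (A-MAC selects these via 5-fold cross-validation.)"""
    best = (None, None, -1.0)
    for w in itertools.product(weight_grid, repeat=X.shape[1]):
        scores = X @ np.array(w)
        for theta in threshold_grid:
            score = f1(y, (scores >= theta).astype(int))
            if score > best[2]:
                best = (w, theta, score)
    return best

rng = np.random.default_rng(0)
X = rng.random((200, 5))  # five admission features per candidate (synthetic)
y = (X @ np.array([0.4, 0.1, 0.1, 0.1, 0.3]) > 0.5).astype(int)  # synthetic labels
w, theta, best_f1 = grid_search(X, y, weight_grid=[0.0, 0.25, 0.5],
                                threshold_grid=[0.3, 0.5, 0.7])
print(w, theta, round(best_f1, 3))
```

With five features the grid grows as |grid|^5, which is why the feature count being small and interpretable matters for this style of policy learning.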

RESULTS

By the Numbers

F1: 0.583 (+0.042 over A-mem)

Precision: 0.417 (+0.046 over A-mem)

Recall: 0.972 (-0.028 vs A-mem)

Latency: 2644 ms (-1187 ms vs A-mem)

On the LoCoMo benchmark with 225 test samples, A-MAC improves F1 from 0.541 to 0.583 over A-mem while reducing latency from 3831 ms to 2644 ms. This shows A-MAC can be both more accurate and more efficient than fully LLM-native memory systems for long-term agent memory admission.
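As a quick consistency check, F1 is the harmonic mean of precision and recall, and the reported numbers line up to within rounding (both inputs are themselves rounded to three decimals):

```python
# A-MAC's reported precision and recall on LoCoMo.
precision, recall = 0.417, 0.972

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"{f1:.3f}")  # ~0.584, matching the reported F1 of 0.583 to rounding
```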

BENCHMARK

Performance comparison on LoCoMo test set

F1 on LoCoMo for A-MAC and baseline memory admission policies.

BENCHMARK

Ablation study showing performance impact of removing each feature

F1 on LoCoMo when dropping each A-MAC feature from the admission score.

KEY INSIGHT

The Counterintuitive Finding

Removing Type Prior drops A-MAC’s F1 from 0.583 to 0.476, a 0.107 decrease that is larger than removing any other feature.

This is surprising because many builders assume semantic Utility or Novelty dominates, but A-MAC shows that simple content-category priors can be the strongest admission signal.

WHY IT MATTERS

What this unlocks for the field

A-MAC unlocks controllable, interpretable long-term memory admission where every stored fact is scored along Utility, Confidence, Novelty, Recency, and Type Prior.

Builders can now tune and audit memory policies per domain, trading precision, recall, and latency explicitly instead of relying on opaque LLM heuristics or unbounded memory growth.
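A hedged sketch of that explicit trade-off: sweeping the admission threshold θ over held-out labeled decisions shows recall falling (and, typically, precision rising) as θ tightens. Scores and labels below are synthetic, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
scores = rng.random(500)  # synthetic admission scores S(m)
# Noisy synthetic ground truth: high-scoring candidates tend to be worth keeping.
labels = (scores + rng.normal(0.0, 0.2, 500) > 0.6).astype(int)

results = {}
for theta in (0.3, 0.5, 0.7):
    pred = (scores >= theta).astype(int)
    tp = int(np.sum((pred == 1) & (labels == 1)))
    precision = tp / max(int(pred.sum()), 1)
    recall = tp / max(int(labels.sum()), 1)
    results[theta] = (precision, recall)
    print(f"theta={theta}: precision={precision:.2f} recall={recall:.2f}")
```

A domain that must never store hallucinated facts would pick a high θ (precision-heavy); a domain where forgetting is costly would pick a low θ (recall-heavy), accepting more stored noise.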
