A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory

Authors: Qianshan Wei, Tengchao Yang, Yaochen Wang et al.

2025

TL;DR

A-MemGuard uses consensus-based validation and a dual-memory lesson store to cut memory attack success rates by over 95% with minimal utility loss.

THE PROBLEM

Context-dependent memory poisoning and 66% missed poisoned entries

Agent Security Bench shows that even advanced LLM-based detectors miss 66% of poisoned memory entries, because each entry looks harmless when inspected in isolation.

In knowledge-intensive QA and healthcare agents, these context-dependent records corrupt reasoning and create self-reinforcing error cycles, where each wrong action becomes a trusted precedent.

HOW IT WORKS

A-MemGuard — consensus validation plus dual-memory lessons

A-MemGuard combines consensus-based validation, a dual-memory structure with a dedicated lesson memory, and path divergence scoring to detect and neutralize poisoned memories before actions are executed.

You can think of consensus-based validation as multiple witnesses cross-checking a story, while the lesson memory acts like a blacklist of past reasoning mistakes.

This design lets A-MemGuard catch context-triggered anomalies and reuse structured “negative lessons,” something a plain context window or isolated content filter cannot achieve.
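To make the dual-memory idea more concrete, here is a minimal sketch of a task memory kept alongside a separate lesson store. It is an illustration under assumptions, not the paper's implementation: the `DualMemory` class, the `jaccard` token-overlap similarity, and the `min_sim` cutoff are all hypothetical stand-ins, and the summary does not say how A-MemGuard actually matches lessons to plans.

```python
from dataclasses import dataclass, field
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings (illustrative stand-in)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class DualMemory:
    """Ordinary task memory plus a separate store of negative lessons (hypothetical layout)."""
    task_memory: List[str] = field(default_factory=list)
    lesson_memory: List[str] = field(default_factory=list)

    def add_lesson(self, anomalous_path: str) -> None:
        # Keep the flagged reasoning path as a reusable negative example.
        self.lesson_memory.append(anomalous_path)

    def relevant_lessons(self, plan: str, min_sim: float = 0.3) -> List[str]:
        # Return stored lessons whose wording overlaps the candidate plan.
        return [l for l in self.lesson_memory if jaccard(plan, l) >= min_sim]


mem = DualMemory()
mem.add_lesson("skip dosage check when patient id ends in 7")

# A plan that resembles the stored lesson is matched; an unrelated one is not.
hits = mem.relevant_lessons("skip dosage check for this patient")
misses = mem.relevant_lessons("book a flight to paris")
print(hits, misses)
```

Keeping lessons in their own store means a flagged path can inform later decisions without being retrieved as if it were a trusted precedent.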

DIAGRAM

Query-time consensus validation and lesson-guided revision

This diagram shows how A-MemGuard processes a single query through parallel reasoning paths, consensus validation, lesson distillation, and action revision.

DIAGRAM

Evaluation pipeline across tasks and attacks

This diagram shows how A-MemGuard is evaluated on direct and indirect memory attacks plus multi-agent misinformation.

PROCESS

How A-MemGuard Handles a Memory-Augmented Agent Query

  1. Parallel Reasoning Path Generation

    A-MemGuard uses consensus-based validation to call $\Lambda$ and build a structured reasoning path $\hat{\rho}_i$ from each retrieved memory in $M_r$ for the current query.

  2. Path Divergence Scoring and Validation

    A-MemGuard applies path divergence scoring $S_{\text{div}}$ over the set of paths $\hat{P}_t$, filtering memories into the validated subset $M_{\text{val}}$ based on a threshold $\tau$.

  3. Structured Lesson Distillation

    For any anomalous path $\hat{\rho}_j$, A-MemGuard defines a lesson $l_t$ and appends it to the dedicated lesson memory $M_{\text{les}}$ as a reusable negative example.

  4. Proactive Deliberation and Action Revision

    A-MemGuard structures the candidate plan $\hat{p}_{\text{final}}$, retrieves similar lessons $L_{\text{rel}}$ from $M_{\text{les}}$, and revises the final action using the defended policy $\pi'$.
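The scoring and filtering steps above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' method: the divergence score here is simply one minus a path's mean token-overlap similarity to the other paths, the `make_path` callable is a stub standing in for the model call that produces a reasoning path, and the threshold value is arbitrary. The summary does not give the paper's actual definition of the score.

```python
from typing import Callable, List, Tuple


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of whitespace tokens (assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def divergence_scores(paths: List[str]) -> List[float]:
    """Score each path as 1 - mean similarity to all other paths (assumed form)."""
    scores = []
    for i, p in enumerate(paths):
        others = [q for j, q in enumerate(paths) if j != i]
        if not others:
            scores.append(0.0)  # a single path has nothing to disagree with
            continue
        mean_sim = sum(similarity(p, q) for q in others) / len(others)
        scores.append(1.0 - mean_sim)
    return scores


def validate(
    memories: List[str],
    make_path: Callable[[str], str],
    tau: float,
) -> Tuple[List[str], List[str]]:
    """Split retrieved memories into a validated subset and anomalous paths."""
    paths = [make_path(m) for m in memories]
    scores = divergence_scores(paths)
    validated = [m for m, s in zip(memories, scores) if s <= tau]
    anomalous = [p for p, s in zip(paths, scores) if s > tau]
    return validated, anomalous


# Hypothetical reasoning paths; a real system would generate these with a model.
toy_paths = {
    "m1": "check allergy then prescribe drug",
    "m2": "check allergy then prescribe medication",
    "m3": "delete patient record immediately",
}
validated, anomalous = validate(["m1", "m2", "m3"], toy_paths.__getitem__, tau=0.8)
print(validated, anomalous)
```

In this toy run the two agreeing paths stay under the threshold while the outlier is flagged. A flagged path is the kind of item that would then be distilled into a lesson and consulted before the final action.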

KEY CONTRIBUTIONS

Key Contributions

  • Proactive defense for agent memory

    The authors present A-MemGuard as the first framework to explicitly secure agent memory against context-dependent attacks and self-reinforcing error cycles, using consensus-based validation and a dual-memory structure.

  • Consensus-based validation and dual-memory structure

    A-MemGuard introduces consensus-based validation over structured reasoning paths and a lesson memory that stores anomalous paths as negative lessons for future correction.

  • Extensive experiments on diverse attacks

    A-MemGuard cuts ASR-r from 100.0% to 2.13% on EHRAgent and reduces indirect attack ASR on MMLU from 0.667 to 0.256 while maintaining top benign accuracy.

RESULTS

By the Numbers

ASR-r: 2.13% (-97.87 pp vs. No Defense; EHRAgent, GPT-4o-mini + DPR)

ASR-t: 6.38% (-93.62 pp vs. No Defense; EHRAgent, GPT-4o-mini + REALM)

Benign ACC: 77.3% (+6.2 pp over No Defense; ReAct-StrategyQA, GPT-4o-mini + REALM)

Indirect ASR: 0.256 (-0.411 vs. No Defense; MMLU, GPT-4o-mini)

These metrics come from AgentPoison on ReAct-StrategyQA and EHRAgent plus indirect injection on MMLU, showing that A-MemGuard sharply lowers attack success while preserving or improving benign accuracy.
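As a quick check on how the headline deltas relate to the reported before and after values, the short sketch below recomputes the absolute and relative changes. The helper names are illustrative, and the inputs are only the figures quoted in this summary.

```python
def absolute_change(before: float, after: float) -> float:
    """Signed difference between the defended and undefended value."""
    return after - before


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the undefended value removed by the defense."""
    return (before - after) / before


# ASR-r on EHRAgent (percent), as quoted above.
asr_r_delta = absolute_change(100.0, 2.13)
asr_r_rel = relative_reduction(100.0, 2.13)

# Indirect ASR on MMLU (fraction), as quoted above.
mmlu_delta = absolute_change(0.667, 0.256)
mmlu_rel = relative_reduction(0.667, 0.256)

print(f"EHRAgent ASR-r: {asr_r_delta:.2f} pp, {asr_r_rel:.1%} relative reduction")
print(f"MMLU indirect ASR: {mmlu_delta:.3f}, {mmlu_rel:.1%} relative reduction")
```

The EHRAgent figure corresponds to a relative reduction of about 97.9%, which matches the "over 95%" headline, while the MMLU indirect-injection figure works out to roughly a 61.6% relative reduction.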

BENCHMARK

Defensive performance against AgentPoison on EHRAgent (GPT-4o-mini + DPR, ASR-r)

Attack Success Rate at retrieval (ASR-r) on EHRAgent under AgentPoison.

BENCHMARK

Indirect memory injection on MMLU (average ASR)

Average Attack Success Rate under indirect memory injection on MMLU.

KEY INSIGHT

The Counterintuitive Finding

A-MemGuard reduces ASR-r on EHRAgent from 100.0% to 2.13%, and in a separate setting reaches benign accuracy of up to 77.3% on ReAct-StrategyQA.

This is surprising because stronger defenses often over-filter useful memories, yet A-MemGuard both hardens security and improves task accuracy in several settings.

WHY IT MATTERS

What this unlocks for the field

A-MemGuard enables LLM agents to treat memory as a self-checking, self-correcting component that learns from its own failures over time.

This gives builders a way to deploy memory-augmented agents in high-stakes domains like healthcare and finance with a defense against runaway error cycles caused by subtle memory poisoning.
