ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

Authors: Xingyu Lyu, Jianfeng He, Ning Wang et al.

2026

TL;DR

ADAM combines distribution estimation with entropy-guided querying to extract up to 83 unique queries (EQ) from agent memory at 100% ASR, surpassing MEXTRA by 33 EQ on EHRAgent.



THE PROBLEM

Agent memory leaks under adaptive querying with up to 100 percent ASR

Existing query-based attacks on agent memory achieve low Attack Success Rates (ASR) and extract only limited private content, even though the underlying vulnerability is readily exploitable.

When LLM agents store sensitive user queries in long-term memory modules, attackers can compromise privacy by adaptively eliciting these records through public APIs.

HOW IT WORKS

ADAM — Adaptive querying and Distribution estimation for memory extraction

ADAM’s core mechanism chains anchor extraction, distribution estimation, anchor selection, and query generation, driven by an auxiliary generator G_aux and a sentence encoder z(·).

You can think of ADAM like a smart librarian using a card catalog: anchors are topic cards, distribution estimation tracks which drawers are full, and entropy-guided queries open the most uncertain drawers first.

This design lets ADAM systematically explore the victim memory distribution and surface unseen records, something a plain context window or static prompt template cannot achieve.
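The loop described above can be illustrated with a toy sketch. Everything here is our own illustrative assumption (the mock agent, the topic-keyed memory, and all helper names), not the authors' implementation: a mocked victim agent returns records by topic, and the attacker reweights anchors toward topics that have yielded records so far.

```python
import random

# Toy victim memory: private records grouped by topic (illustrative only).
MEMORY = {
    "cardiology": ["rec_heart_1", "rec_heart_2"],
    "oncology": ["rec_tumor_1"],
    "billing": ["rec_bill_1", "rec_bill_2", "rec_bill_3"],
}

def mock_agent(topic):
    """Stand-in for the victim agent: surfaces records related to a topic."""
    return MEMORY.get(topic, [])

def adam_loop(seed_topics, budget, seed=0):
    """Minimal ADAM-style loop: select an anchor from an estimated
    distribution, query the agent, and update the estimate from responses."""
    rng = random.Random(seed)
    anchors = list(seed_topics)            # anchor pool
    seen = {a: set() for a in anchors}     # records observed per anchor
    extracted = set()
    for _ in range(budget):
        # Uniform prior of 1 plus observed record count per anchor.
        weights = [1 + len(seen[a]) for a in anchors]
        topic = rng.choices(anchors, weights=weights)[0]
        records = mock_agent(topic)        # adaptive query
        seen[topic].update(records)
        extracted.update(records)
    return extracted

loot = adam_loop(["cardiology", "oncology", "billing"], budget=10)
```

The real attack replaces the fixed topic list with anchors mined from responses via NER and the encoder z(·), and the naive count-based reweighting with clustering plus softmax-weighted distribution estimation.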

DIAGRAM

Interactive attack loop between ADAM and an LLM agent

This diagram shows how ADAM iteratively interacts with the victim agent, updates anchors, and adapts queries over time.

DIAGRAM

Evaluation pipeline and ablation design for ADAM

This diagram shows how ADAM is evaluated across agents, LLMs, and ablations such as top-k, model size, and defenses.

PROCESS

How ADAM Handles a Memory Extraction Attack Session

  1. Initialization

     ADAM initializes seed topics S_seed as anchors and assigns uniform priors P̂_0 over them before sending the first query via G_aux.

  2. Anchor extraction

     ADAM parses each agent response, uses NER and the encoder z(·) to extract anchors S′_anchor,t, and updates the anchor pool T_t with diverse topics.

  3. Distribution estimation

     ADAM clusters anchors with DBSCAN, computes weights w_t(a), and updates selection probabilities P̂_t using λ, τ, and SelCount_{t−1}.

  4. Anchor selection and query generation

     ADAM runs weighted k-center anchor selection, generates candidate queries with G_aux, scores them by entropy H_t, and selects q_t until the early-stop criterion is met.
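The distribution-estimation step above can be sketched as follows. This is a minimal sketch under loud assumptions: a trivial one-dimensional threshold clustering stands in for DBSCAN, anchors are pre-projected to scalars, and the helper names are hypothetical; only the softmax weighting with temperature τ mirrors the described mechanism.

```python
import math

def threshold_clusters(sorted_values, eps=0.5):
    """Toy 1-D density clustering (a stand-in for DBSCAN): consecutive
    points within eps of each other join the same cluster."""
    labels, cluster, prev = [], -1, None
    for v in sorted_values:
        if prev is None or v - prev > eps:
            cluster += 1                 # gap larger than eps: new cluster
        labels.append(cluster)
        prev = v
    return labels

def softmax_weights(cluster_sizes, tau=1.0):
    """Softmax over cluster sizes with temperature tau: larger clusters
    (denser memory regions) get proportionally more selection probability."""
    exps = [math.exp(s / tau) for s in cluster_sizes]
    z = sum(exps)
    return [e / z for e in exps]

# Example: anchor embeddings projected to one dimension (sorted).
points = [0.1, 0.2, 0.25, 3.0, 3.1, 7.0]
labels = threshold_clusters(points)            # three clusters of size 3, 2, 1
sizes = [labels.count(c) for c in sorted(set(labels))]
probs = softmax_weights(sizes, tau=1.0)        # densest cluster weighted highest
```

Lowering τ sharpens the distribution toward the densest cluster; raising it flattens selection toward uniform exploration.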

KEY CONTRIBUTIONS

Key Contributions

  • Adaptive data extraction attack

    ADAM introduces an adaptive data extraction attack that integrates distribution estimation, anchor selection, and query generation to recover private records from agent memory with up to EQ = 83.

  • Data distribution estimation algorithms

    ADAM is the first to emphasize the memory data distribution, proposing distribution estimation with clustering and softmax weighting, which boosts EQ beyond MEXTRA by 33 on EHRAgent.

  • Extensive evaluations and an oracle attack

    ADAM is evaluated on three agents, four LLMs, four baselines, and four defenses, and introduces an oracle-guided attack showing how closer distribution estimates increase EQ.
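The weighted k-center anchor selection named above can be approximated with a standard greedy farthest-point heuristic. The 2-D toy embeddings, anchor names, and weights below are our own illustrative assumptions, not values from the paper:

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D anchor embeddings."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def weighted_k_center(anchors, weights, k):
    """Greedy weighted k-center: start from the highest-weight anchor, then
    repeatedly add the anchor maximizing weight * distance to the chosen set,
    so the selection is both diverse and biased toward dense regions."""
    names = list(anchors)
    chosen = [max(names, key=lambda a: weights[a])]
    while len(chosen) < k:
        best = max(
            (a for a in names if a not in chosen),
            key=lambda a: weights[a] * min(dist(anchors[a], anchors[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen

# Hypothetical anchors with toy embeddings and estimated selection weights.
anchors = {"billing": (0, 0), "cardio": (5, 0), "onco": (5, 1), "misc": (0, 1)}
weights = {"billing": 0.5, "cardio": 0.3, "onco": 0.15, "misc": 0.05}
picked = weighted_k_center(anchors, weights, k=2)   # diverse, high-weight pair
```

Here "billing" is chosen first (highest weight), then "cardio" (far from "billing" and still well weighted), so the queries cover distinct dense regions of the estimated memory distribution.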

RESULTS

By the Numbers

EQ: 77 (+33 over MEXTRA on EHRAgent with Llama2-7b-chat)

EE: 0.85 (+0.36 over MEXTRA)

CER: 0.93 (+0.55 over MEXTRA)

ASR: 1.00 (+0.11 over MEXTRA)

On EHRAgent with Llama2-7b-chat and memory size 300, ADAM is compared against Vanilla, RAG Thief, Pirate, and MEXTRA using EQ, EE, CER, and ASR. These results show that ADAM extracts more unique queries per budget and reaches complete extraction in most rounds.
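These metrics can be illustrated with working definitions that are our assumptions for the sketch, not the paper's formal ones: EQ counts unique extracted records, ASR the fraction of attack queries that surface at least one record, EE a ratio of unique records to queries spent, and coverage the fraction of memory recovered (CER's exact definition is omitted here).

```python
def attack_metrics(per_query_extractions, memory_size):
    """per_query_extractions: list of sets, one per attack query, each
    holding the private records that query surfaced."""
    unique = set().union(*per_query_extractions)
    eq = len(unique)                                        # unique records (EQ)
    asr = sum(1 for s in per_query_extractions if s) / len(per_query_extractions)
    ee = eq / len(per_query_extractions)                    # efficiency (assumed)
    coverage = eq / memory_size                             # fraction recovered
    return eq, asr, ee, coverage

# Toy run: four attack queries against a memory of 10 records.
runs = [{"r1", "r2"}, {"r2", "r3"}, set(), {"r4"}]
eq, asr, ee, cov = attack_metrics(runs, memory_size=10)
```

Under these definitions, a higher EQ at the same query budget means each query is surfacing more unseen records, which is exactly what the adaptive distribution estimate is optimizing for.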

BENCHMARK


Attack results on three real-world agents

EQ on EHRAgent with Llama2-7b-chat for different attacks.

KEY INSIGHT

The Counterintuitive Finding

ADAM reaches ASR=1.00 and CER up to 0.97 while incurring only about $0.0026 per query on average across experiments.

This is surprising because stronger extraction with entropy-guided querying might be expected to require many more expensive calls, yet ADAM shows high leakage with low query budgets.

WHY IT MATTERS

What this unlocks for the field

ADAM demonstrates that modeling memory topic distributions and using entropy-guided queries can systematically map and drain private agent memories under black-box access.

Builders can now design and test defenses against realistic, adaptive attackers like ADAM, rather than only static prompt injections, leading to more robust privacy preserving agent architectures.


Related papers

RAG

A Dynamic Retrieval-Augmented Generation System with Selective Memory and Remembrance

Okan Bursa

· 2026

Adaptive RAG Memory (ARM) augments a standard retriever–generator stack with a Dynamic Embedding Layer and Remembrance Engine that track usage statistics and apply selective remembrance and decay to embeddings. On a lightweight retrieval benchmark, ARM achieves NDCG@5 ≈ 0.9401 and Recall@5 = 1.000 with 22M parameters, matching larger baselines like gte-small while providing the best efficiency among ultra-efficient models.

RAGLong-Term Memory

HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues

Yijie Zhong, Yunfan Gao, Haofen Wang

· 2026

HingeMem combines Boundary Guided Long-Term Memory, Dialogue Boundary Extraction, Memory Construction, Query Adaptive Retrieval, Hyperedge Rerank, and Adaptive Stop to segment dialogues into element-indexed hyperedges and plan query-specific retrieval. On LOCOMO, HingeMem achieves 63.9 overall F1 and 75.1 LLM-as-a-Judge score, surpassing the best baseline Zep (56.9 F1) by 7.0 F1 without using category-specific QA formats.
