Understanding Users' Privacy Perceptions Towards LLM's RAG-based Memory

Authors: Shuning Zhang, Rongjun Ma, Ying Ma et al.

2025

TL;DR

The paper shows that users want lifecycle-wide, granular control and transparency over RAG memories instead of opaque, automatic logging.



THE PROBLEM

Persistent RAG memories create opaque privacy risks for everyday users

The paper shows that users' mental models of memory are diverse and often incomplete, especially regarding privacy risks and data persistence.

When RAG-based memory silently aggregates sensitive dialogues across sessions, users cannot see what is stored, how it is used, or whether deletion is effective, undermining control and trust.

HOW IT WORKS

Understanding Users' Privacy Perceptions Towards LLM's RAG-based Memory — mental models, privacy calculus, and lifecycle controls

The study uses semi-structured interviews and thematic analysis to map mental models, privacy calculus, and expectations across memory generation, management, usage, and updating.

You can think of the study like debugging a black-box cache: it reverse-engineers how users *believe* the cache works versus how it actually behaves.

This lets the paper specify UI and architectural hooks that a plain context window cannot offer, such as workspace-level consent, per-memory editing, and control over opaque inferences.
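As a rough illustration of what workspace-level consent and per-memory editing could look like, here is a minimal sketch. The class and method names are hypothetical, not the paper's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    """A single stored memory that can be edited or deleted on its own."""
    text: str
    workspace: str  # e.g. "work" vs. "personal"; scopes retrieval

@dataclass
class MemoryStore:
    entries: list = field(default_factory=list)

    def add(self, text, workspace, consented):
        # Consent at generation time: nothing is persisted without approval.
        if consented:
            self.entries.append(MemoryEntry(text, workspace))

    def retrieve(self, workspace):
        # Workspace-level scoping keeps personas from leaking into each other.
        return [e.text for e in self.entries if e.workspace == workspace]

    def edit(self, index, new_text):
        # Per-memory editing, one entry at a time.
        self.entries[index].text = new_text
```

The key design choice is that consent and scoping are enforced in the store itself, not left to the retrieval prompt.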

DIAGRAM

User–LLM interaction around RAG memory and privacy calculus

This diagram shows how users negotiate benefits and risks when deciding what to let RAG-based memory store across a conversation.

DIAGRAM

Lifecycle of RAG-based memory

This diagram shows how the paper structures user expectations across memory generation, management, usage, and updating.

PROCESS

How the Study Maps the Memory Lifecycle

  1. 01

    Memory generation

    The study examines how users expect explicit consent when memories are generated from dialogues, including control over what is extracted and stored.

  2. 02

    Memory management

    The study analyzes expectations for interfaces to review, edit, delete, categorize, and temporally manage stored memories.

  3. 03

    Memory usage

    The study examines how users want to scope which memories apply to which tasks, including incognito modes and persona-specific workspaces.

  4. 04

    Memory updating

    The study surfaces needs for conflict resolution, correction mechanisms, and visibility into what is replaced when information changes.
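The four stages above can be sketched as a single memory API. This is a hypothetical design, assuming an in-memory dict store, not the paper's system:

```python
class MemoryLifecycle:
    """Hypothetical four-stage memory API mirroring the lifecycle above."""

    def __init__(self):
        self.memories = {}  # memory id -> text
        self._next_id = 0

    def generate(self, extracted, user_consents):
        # 1. Generation: store only with explicit consent.
        if not user_consents:
            return None
        mid = self._next_id
        self._next_id += 1
        self.memories[mid] = extracted
        return mid

    def delete(self, mid):
        # 2. Management: per-memory deletion the user can verify took effect.
        self.memories.pop(mid, None)

    def use(self, incognito=False):
        # 3. Usage: incognito mode applies no stored memories at all.
        return [] if incognito else list(self.memories.values())

    def update(self, mid, new_text):
        # 4. Updating: return what was replaced so the change is visible.
        old = self.memories.get(mid)
        self.memories[mid] = new_text
        return old
```

Returning the replaced text from `update` is one way to give users the "what got overwritten" visibility the interviews call for.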

KEY CONTRIBUTIONS

Key Contributions

  • 01

    Characterizing mental models of RAG memory

    The paper identifies four mental models, ranging from transient dialogue buffers to training-data extensions and active information-processing mechanisms.

  • 02

    Analyzing privacy calculus and strategies

    The paper details how users weigh personalization against risks and adopt strategic disclosure, obfuscation, and refusal or workarounds.

  • 03

    Deriving lifecycle design implications

    The paper proposes architecture-, interface-, and interaction-level designs for contextual workspaces, transparent memory views, and co-curated updates.
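One of the user strategies above, obfuscation before disclosure, can be sketched as a client-side redaction pass. The patterns and placeholders are illustrative assumptions, not a tool from the paper:

```python
import re

# Illustrative redaction patterns a cautious user might apply before a
# prompt is sent, so the originals never reach the memory store.
REDACTIONS = [
    (r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[email]"),  # email addresses
    (r"\b\d{11}\b", "[phone]"),                   # 11-digit phone numbers
]

def obfuscate(prompt):
    """Replace sensitive substrings with neutral placeholders."""
    for pattern, placeholder in REDACTIONS:
        prompt = re.sub(pattern, placeholder, prompt)
    return prompt
```

The interviews describe users doing this manually; a sketch like this just shows where such a filter would sit in the pipeline.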

RESULTS

By the Numbers

  • Participants: 18 users (semi-structured interviews with Chinese users)

  • Inter-rater reliability: Cohen's kappa 0.90 (codebook consistency between two researchers)

  • Interview compensation: 100 RMB (per-participant payment for the study)

  • Memory models: 4 themes (transient buffer, training data, active processing, uncertainty)

The study draws on 18 semi-structured interviews analyzed thematically, with a Cohen's kappa of 0.90 confirming coding reliability. This supports the qualitative claims about mental models, privacy calculus, and lifecycle expectations for RAG-based memory.
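Cohen's kappa measures inter-coder agreement corrected for chance. A minimal computation, on made-up codes rather than the paper's data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Observed agreement between two coders, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same code at random.
    chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - chance) / (1 - chance)
```

A kappa of 0.90 means the two researchers agreed far beyond what their marginal code frequencies would produce by chance.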

BENCHMARK


Core sample and coding statistics from the study

Relative magnitudes of key quantitative facts reported (participants, reliability, compensation, mental model themes).

KEY INSIGHT

The Counterintuitive Finding

The paper finds that even AI researchers, like participant P7, often have no clear understanding of how RAG-based memory works.

This is counterintuitive because we might assume technically skilled users correctly model memory pipelines, yet they share the same opaque folk theories as non-experts.

WHY IT MATTERS

What this unlocks for the field

The paper gives builders a concrete checklist of the lifecycle controls users actually want around RAG memories.

Armed with this, developers can design memory workspaces, transparent influence indicators, and inference controls that align with real user expectations instead of guesswork.
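A transparent influence indicator can be as simple as a visible note attached to each reply. This helper is a hypothetical sketch of the idea, not an API from the paper:

```python
def with_influence_indicator(reply, used_memories):
    """Append a visible note listing which stored memories shaped a reply,
    so users can audit (and later revoke) what the model drew on."""
    if not used_memories:
        return reply + "\n[No stored memories were used.]"
    return reply + "\n[Memories used: " + "; ".join(used_memories) + "]"
```

Surfacing the empty case explicitly matters too: users in the study wanted to know not just when memory was used, but when it was not.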


Related papers

RAG

A Dynamic Retrieval-Augmented Generation System with Selective Memory and Remembrance

Okan Bursa

· 2026

Adaptive RAG Memory (ARM) augments a standard retriever–generator stack with a Dynamic Embedding Layer and Remembrance Engine that track usage statistics and apply selective remembrance and decay to embeddings. On a lightweight retrieval benchmark, ARM achieves NDCG@5 ≈ 0.9401 and Recall@5 = 1.000 with 22M parameters, matching larger baselines like gte-small while providing the best efficiency among ultra-efficient models.

RAGLong-Term Memory

HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues

Yijie Zhong, Yunfan Gao, Haofen Wang

· 2026

HingeMem combines Boundary Guided Long-Term Memory, Dialogue Boundary Extraction, Memory Construction, Query Adaptive Retrieval, Hyperedge Rerank, and Adaptive Stop to segment dialogues into element-indexed hyperedges and plan query-specific retrieval. On LOCOMO, HingeMem achieves 63.9 overall F1 and 75.1 LLM-as-a-Judge score, surpassing the best baseline Zep (56.9 F1) by 7.0 F1 without using category-specific QA formats.
