Forgetful but Faithful: A Cognitive Memory Architecture and Benchmark for Privacy-Aware Generative Agents

Authors: Saad Alqithami

2025

TL;DR

MaRS couples typed, provenance-aware memory with configurable forgetting policies; on the FiFA benchmark, its Hybrid policy reaches a composite score of ≈0.911 under tight memory budgets.

THE PROBLEM

Unbounded agent memory hurts coherence and privacy

Generative agents accumulate long interaction histories, and naive retention inflates context, degrades retrieval, and increases privacy risk. The Hybrid policy's composite FiFA score of ≈0.911 suggests this tradeoff can be managed rather than accepted.

Without structured forgetting, agents either overspend tokens or lose narrative coherence and social recall, which undermines long-horizon goals and privacy expectations.

HOW IT WORKS

Memory-Aware Retention Schema (MaRS)

MaRS introduces four memory types (episodic, semantic, social, and task), a privacy engine, and six forgetting policies: FIFO, LRU, Priority Decay, Reflection-Summary, Random-Drop, and Hybrid.

Think of MaRS as a governed RAM plus disk for agents, where typed memories are files, indices are the card catalog, and forgetting policies are an OS eviction scheduler.

This design lets MaRS keep high-utility, low-sensitivity memories while compressing or deleting others, something a plain context window cannot express or audit.
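As a minimal sketch of the idea above, typed memories can be modeled as records that carry a token cost, and a forgetting policy as a rule that picks what to evict once a budget is exceeded. The field names (`tokens`, `created_at`, `last_access`, `source`), the `evict` helper, and the sample data are illustrative assumptions, not the paper's implementation; only FIFO and LRU are shown.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    SOCIAL = "social"
    TASK = "task"


@dataclass
class MemoryItem:
    text: str
    kind: MemoryType
    tokens: int           # token cost charged against the budget
    created_at: int       # step at which the item was written
    last_access: int      # step at which the item was last retrieved
    source: str = "user"  # provenance label (hypothetical field)


def evict(items, budget, policy="fifo"):
    """Drop items until total tokens fit the budget; return (kept, dropped)."""
    if policy == "fifo":
        key = lambda m: m.created_at    # oldest written item goes first
    elif policy == "lru":
        key = lambda m: m.last_access   # least recently used item goes first
    else:
        raise ValueError(f"unknown policy: {policy}")
    queue = sorted(items, key=key)
    total = sum(m.tokens for m in queue)
    dropped = []
    while queue and total > budget:
        victim = queue.pop(0)
        total -= victim.tokens
        dropped.append(victim)
    return queue, dropped


# Illustrative data only.
store = [
    MemoryItem("met Ana at the cafe", MemoryType.EPISODIC, 40, created_at=1, last_access=9),
    MemoryItem("Ana likes chess", MemoryType.SOCIAL, 30, created_at=2, last_access=3),
    MemoryItem("finish report by Friday", MemoryType.TASK, 50, created_at=3, last_access=8),
]
kept_fifo, dropped_fifo = evict(store, budget=80, policy="fifo")
kept_lru, dropped_lru = evict(store, budget=80, policy="lru")
```

With the same store and budget, FIFO drops the oldest episodic memory while LRU keeps it because it was retrieved recently, which is the kind of difference a policy-addressable store makes visible and auditable.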

DIAGRAM

FiFA Interaction and Memory Flow

This diagram shows how MaRS handles a FiFA episode from user interaction through memory storage, retrieval, and response generation.

DIAGRAM

FiFA Evaluation Pipeline for MaRS

This diagram shows how the FiFA benchmark runs 300 simulations across five budgets to compare MaRS forgetting policies.
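The evaluation grid can be pictured as a plain enumeration of run configurations. The summary gives only the totals (300 runs, five budgets, six policies); the split into 6 policies × 5 budgets × 10 seeds below is an assumption chosen because it multiplies to 300, and the budget values are hypothetical placeholders.

```python
from itertools import product

POLICIES = ["FIFO", "LRU", "Priority Decay", "Reflection-Summary", "Random-Drop", "Hybrid"]
BUDGETS = [512, 1024, 2048, 4096, 8192]  # hypothetical token limits, low to high
SEEDS = range(10)                        # assumed number of repeats per (policy, budget) cell


def run_grid():
    """Enumerate one configuration per (policy, budget, seed) combination."""
    return [
        {"policy": policy, "budget": budget, "seed": seed}
        for policy, budget, seed in product(POLICIES, BUDGETS, SEEDS)
    ]


runs = run_grid()
```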

PROCESS

How MaRS Handles a FiFA Simulation Episode

  1. Memory-Aware Retention Schema (MaRS)

    MaRS initializes typed episodic, semantic, social, and task stores plus indices and a privacy engine before FiFA simulations begin.

  2. Forgetting Policy Framework

    MaRS activates the FIFO, LRU, Priority Decay, Reflection-Summary, Random-Drop, or Hybrid policy when the token budget is exceeded during interaction.

  3. Privacy-Aware Policies

    MaRS applies sensitivity scores and optional (ε, δ)-differential privacy at the retention boundary to govern which memories are summarized or deleted.

  4. FiFA Benchmark Evaluation

    The FiFA benchmark runs 300 simulations across five memory budgets, logging narrative coherence, goal completion, social recall, privacy leakage, and cost efficiency.
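The privacy step above can be sketched as a small routing function at the retention boundary. This is a sketch under stated assumptions: the `Candidate` fields, the threshold values, and the three-way keep/summarize/delete rule are hypothetical, and the optional (ε, δ)-differential privacy mechanism is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    utility: float      # 0..1, expected usefulness of the memory (assumed scale)
    sensitivity: float  # 0..1, how privacy-sensitive the content is (assumed scale)


def retention_action(item, delete_above=0.8, summarize_above=0.4, min_utility=0.2):
    """Return 'keep', 'summarize', or 'delete' for one memory (hypothetical thresholds)."""
    if item.sensitivity >= delete_above or item.utility < min_utility:
        return "delete"      # too sensitive, or not worth its tokens
    if item.sensitivity >= summarize_above:
        return "summarize"   # keep the gist, drop the sensitive detail
    return "keep"


# Illustrative data only.
items = [
    Candidate("prefers tea over coffee", utility=0.7, sensitivity=0.1),
    Candidate("home address mentioned in chat", utility=0.6, sensitivity=0.9),
    Candidate("argued with a coworker last week", utility=0.5, sensitivity=0.5),
    Candidate("small talk about the weather", utility=0.1, sensitivity=0.0),
]
actions = [retention_action(c) for c in items]
```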

KEY CONTRIBUTIONS

Key Contributions

  • Memory-Aware Retention Schema (MaRS)

    MaRS defines a typed, provenance-aware memory graph with episodic, semantic, social, and task nodes plus indices and budgets, turning retention into a policy-addressable decision surface.

  • Forgetting Policy Framework

    MaRS formalizes six forgetting policies (FIFO, LRU, Priority Decay, Reflection-Summary, Random-Drop, and Hybrid) with complexity analysis and sensitivity-aware retention under explicit budgets.

  • Forgetful but Faithful Agent (FiFA)

    MaRS is evaluated on the FiFA benchmark, where the Hybrid policy achieves a composite score of ≈0.911 across 300 runs and five memory budgets.
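To make the Priority Decay and Hybrid ideas concrete, here is one possible scoring sketch. The summary does not give the paper's formulas, so the half-life decay, the weights, and the way sensitivity is subtracted are all assumptions for illustration; lower-scoring memories would be evicted first.

```python
def decayed_priority(priority, age, half_life=10.0):
    """Priority that halves every `half_life` steps (assumed decay form)."""
    return priority * 0.5 ** (age / half_life)


def hybrid_score(priority, age, steps_since_access, sensitivity,
                 w_priority=0.5, w_recency=0.3, w_privacy=0.2, half_life=10.0):
    """Blend decayed priority, access recency, and a sensitivity penalty (hypothetical weights)."""
    recency = 0.5 ** (steps_since_access / half_life)
    return (w_priority * decayed_priority(priority, age, half_life)
            + w_recency * recency
            - w_privacy * sensitivity)


# Illustrative data only: name -> (priority, age, steps_since_access, sensitivity)
memories = {
    "a": (0.9, 20, 2, 0.1),
    "b": (0.9, 0, 0, 0.9),
    "c": (0.2, 30, 30, 0.0),
}
scores = {name: hybrid_score(*fields) for name, fields in memories.items()}
eviction_order = sorted(scores, key=scores.get)  # lowest score is evicted first
```

In this toy setup the stale, low-priority memory is evicted first, and the fresh but sensitive memory is penalized without being forced out, which shows how a blended score differs from pure recency or pure priority.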

RESULTS

By the Numbers

  • Composite FiFA score: ≈0.911 (Hybrid vs simpler policies; exact baselines not numerically specified)

  • Simulation runs: 300, across five memory budgets in FiFA

  • Memory budgets: 5, spanning low to high token limits

  • Forgetting policies: 6 (FIFO, LRU, Priority Decay, Reflection-Summary, Random-Drop, Hybrid)

FiFA is a multi-agent simulation benchmark measuring narrative coherence, goal completion, social recall, privacy leakage, and cost efficiency under explicit token budgets. The ≈0.911 composite FiFA score shows that Hybrid retention in MaRS can balance coherence, efficiency, and privacy under constrained memory.

BENCHMARK

FiFA Composite Performance Across Policies

Composite FiFA score combining narrative coherence, goal completion, social recall, privacy, and cost efficiency.
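One way to read such a composite is as a weighted mean of the five metrics, with privacy leakage inverted so that higher is better. The summary does not state the actual weights or normalization, so the equal weights, the `1 - leakage` conversion, and the input numbers below are assumptions for illustration and do not reproduce the reported ≈0.911.

```python
METRICS = ("narrative_coherence", "goal_completion", "social_recall",
           "privacy", "cost_efficiency")


def composite_score(raw, weights=None):
    """Weighted mean of five metrics in [0, 1]; privacy is taken as 1 - leakage (assumed)."""
    scores = dict(raw)
    scores["privacy"] = 1.0 - scores.pop("privacy_leakage")
    if weights is None:
        weights = {name: 1.0 / len(METRICS) for name in METRICS}  # equal weights (assumed)
    total_weight = sum(weights[name] for name in METRICS)
    return sum(weights[name] * scores[name] for name in METRICS) / total_weight


# Illustrative inputs only, not results from the paper.
example = {
    "narrative_coherence": 0.9,
    "goal_completion": 0.8,
    "social_recall": 0.7,
    "privacy_leakage": 0.1,
    "cost_efficiency": 0.6,
}
score = composite_score(example)
```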

KEY INSIGHT

The Counterintuitive Finding

Hybrid forgetting in MaRS achieves a composite FiFA score of ≈0.911 while still enforcing strict memory budgets and privacy-aware retention.

This is surprising because many assume aggressive forgetting inevitably harms coherence, yet MaRS shows principled forgetting-by-design can improve both quality and governance.

WHY IT MATTERS

What this unlocks for the field

MaRS enables agents to treat memory as a governed resource, balancing episodic detail, semantic consolidation, social recall, and task context under explicit budgets.

Builders can now design agents that remain coherent over days, respect privacy norms, and keep costs tractable without relying on ever-growing context windows.
