Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents

Authors: Xing Zhang, Guanghui Wang, Yanwei Cui et al.

2026

TL;DR

Experience Compression Spectrum reframes agent memory, skills, and rules as one compression axis, explaining 5–20× to 1,000×+ context savings and exposing the missing diagonal in current systems.



THE PROBLEM

Agents Learn From Experience, but Cross-Community Citation Is Below 1%

Experience Compression Spectrum shows that across 1,136 references in 22 primary papers, cross-community citation is below 1%, indicating deep fragmentation.

Memory papers cite skill work at only 0.7%, and skill papers cite memory work at only 1.2%, so agents duplicate solutions and scalable experience management stalls.

Without a unified view, long-horizon agents either hoard low-level episodic memories or over-abstract their skills, exhausting retrieval budgets or losing task specificity.

HOW IT WORKS

Experience Compression Spectrum — Four Levels of Agent Knowledge

Experience Compression Spectrum defines four compression levels, Level 0 Raw Trace, Level 1 Episodic Memory, Level 2 Procedural Skill, and Level 3 Declarative Rule, each a scaffold-level knowledge output.

You can think of Experience Compression Spectrum like a memory hierarchy in computing, where raw traces are RAM, episodic memories are cache, skills are disk, and rules are compressed indexes.

Experience Compression Spectrum enables reasoning about when to store detailed episodes versus compact skills or abstract rules, something a plain context window cannot control or adapt across deployments.

DIAGRAM

Knowledge Flow Across Compression Levels

This diagram shows how Experience Compression Spectrum conceptualizes upward and downward knowledge movement between raw traces, episodic memories, skills, and rules.

DIAGRAM

Evaluation and Evidence Aggregation Pipeline

This diagram shows how Experience Compression Spectrum aggregates evidence from memory and skill systems to derive structural insights and testable predictions.

PROCESS

How Experience Compression Spectrum Handles Long-Horizon Agent Experience

  1. Interaction Trace Definition

     Experience Compression Spectrum formalizes an interaction trace T as a sequence of states, actions, observations, and feedback, grounding all later compression levels.

  2. Experience Compression Function

     Experience Compression Spectrum defines C_L as a function mapping traces into knowledge artifacts at Level 0 Raw Trace, Level 1 Episodic Memory, Level 2 Procedural Skill, or Level 3 Declarative Rule.

  3. Mapping Existing Systems

     Experience Compression Spectrum maps more than 20 systems, such as Mem0, Voyager, and Trace2Skill, onto specific levels, revealing clustering at Level 1 and Level 2.

  4. The Missing Diagonal

     Experience Compression Spectrum identifies the missing diagonal: no system adaptively selects levels or promotes and demotes knowledge across Level 1, Level 2, and Level 3.
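The trace and compression function above can be sketched in Python. This is a toy illustration, not the paper's implementation: the `Step` fields, the `Level` enum, and the placeholder summaries inside `compress` are all assumptions standing in for whatever LLM-based compression a real system would use.

```python
from dataclasses import dataclass
from enum import IntEnum

class Level(IntEnum):
    """Hypothetical encoding of the paper's four compression levels."""
    RAW_TRACE = 0
    EPISODIC_MEMORY = 1
    PROCEDURAL_SKILL = 2
    DECLARATIVE_RULE = 3

@dataclass
class Step:
    state: str
    action: str
    observation: str
    feedback: float

# An interaction trace T is a sequence of (state, action, observation, feedback) steps.
Trace = list[Step]

def compress(trace: Trace, level: Level) -> str:
    """Toy stand-in for the compression function C_L: map a trace to a
    knowledge artifact at the requested level. The string summaries here
    are placeholders; real systems would use an LLM or program synthesis."""
    if level == Level.RAW_TRACE:
        return "\n".join(f"{s.state} -> {s.action} -> {s.observation}" for s in trace)
    if level == Level.EPISODIC_MEMORY:
        return f"Episode: {len(trace)} steps, outcome feedback {trace[-1].feedback:+.1f}"
    if level == Level.PROCEDURAL_SKILL:
        return "Skill: " + " then ".join(s.action for s in trace)
    return "Rule: prefer action sequences with positive final feedback"
```

Each level discards more of the trace: Level 0 keeps everything, Level 1 keeps one line per episode, Level 2 keeps only the action recipe, and Level 3 keeps a single trace-independent statement.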

KEY CONTRIBUTIONS

Key Contributions

  • Formalizing the Experience Compression Spectrum

    Experience Compression Spectrum unifies Level 0 Raw Trace, Level 1 Episodic Memory, Level 2 Procedural Skill, and Level 3 Declarative Rule on a single compression axis with 5–20×, 50–500×, and 1,000×+ compression ratios.

  • Mapping 20+ Systems and Exposing the Missing Diagonal

    Experience Compression Spectrum maps 20+ agent-learning systems and shows that every system fixes a single compression level, with no adaptive cross-level compression between memory, skills, and rules.

  • Revealing Structural Insights and Open Problems

    Experience Compression Spectrum derives four structural insights, highlights the <1% cross-community citation rate, and articulates open problems such as adaptive level selection and principled lifecycle management.

RESULTS

By the Numbers

Cross-community citation rate: below 1% (memory papers cite skill work at 0.7%; skill papers cite memory work at 1.2%)

Episodic memory compression: 5–20× context savings over Level 0 Raw Trace

Procedural skill compression: 50–500×, higher compression than Level 1 Episodic Memory

Declarative rule compression: 1,000×+, the highest compression but the lowest specificity

Experience Compression Spectrum aggregates evidence from systems like Mem0 and Trace2Skill, quantifying compression from raw traces to rules and a cross-community citation rate below 1% across 1,136 references. This indicates that Experience Compression Spectrum captures a real structural gap rather than a purely conceptual taxonomy.
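The reported ratios translate directly into context-budget arithmetic. The sketch below works through the savings for a hypothetical 100,000-token raw trace; the trace size is an assumption for illustration, and the "1,000×+" figure gives only a lower bound on compression.

```python
# Back-of-the-envelope context savings implied by the reported compression
# ratios, assuming a hypothetical 100,000-token raw trace (Level 0).
raw_tokens = 100_000

ratios = {
    "Level 1 Episodic Memory": (5, 20),
    "Level 2 Procedural Skill": (50, 500),
    "Level 3 Declarative Rule": (1000, None),  # "1,000x+" gives only a lower bound
}

for level, (lo, hi) in ratios.items():
    upper = raw_tokens // lo                  # least-compressed artifact size
    lower = raw_tokens // hi if hi else None  # most-compressed size, if bounded
    span = f"<= {upper:,}" if lower is None else f"{lower:,}-{upper:,}"
    print(f"{level}: {span} tokens")
```

Under these assumptions, an episode costs 5,000–20,000 tokens, a skill 200–2,000, and a rule at most 100, which is why fixing the wrong level either exhausts the retrieval budget or throws away task specificity.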


BENCHMARK

Compression Ratios Across the Experience Compression Spectrum

Approximate compression ratios for different knowledge levels in Experience Compression Spectrum.

KEY INSIGHT

The Counterintuitive Finding

Experience Compression Spectrum reports that curated Level 2 skills can add +16.2pp on SkillsBench, while LLM self-generated skills add +0.0pp.

This is surprising because many assume more skills always help, but Experience Compression Spectrum shows that compression quality matters more than simply having compact artifacts.

WHY IT MATTERS

What this unlocks for the field

Experience Compression Spectrum gives builders a vocabulary and framework to design agents that store memories, skills, and rules as deliberate compression choices.

With Experience Compression Spectrum, developers can plan future systems that adaptively promote or demote knowledge across levels instead of freezing agents at a single compression granularity.
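A system on the missing diagonal would need some policy for moving knowledge between levels. The sketch below is not from the paper; the reuse and failure-rate thresholds are invented purely to make the promote/demote idea concrete.

```python
# A minimal sketch (not from the paper) of adaptive level selection:
# promote an artifact to a higher compression level when it is reused
# often, demote it when the compressed form starts failing.

def adapt_level(level: int, reuse_count: int, failure_rate: float) -> int:
    """Return the new compression level for a knowledge artifact.
    The thresholds (3 reuses, 20% failures) are illustrative, not measured."""
    if reuse_count >= 3 and level < 3:
        return level + 1   # promote: a frequently reused episode becomes a skill or rule
    if failure_rate > 0.2 and level > 1:
        return level - 1   # demote: an over-abstract rule regains task detail
    return level
```

The point of the sketch is that level is a runtime decision per artifact, not a design-time constant for the whole agent, which is exactly what the surveyed systems lack.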


Related papers

Agent Memory · Long-Term Memory

Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

Yi Yu, Liuyi Yao et al.

arXiv · 2026

Agentic Memory (AgeMem) exposes memory management tools, a three-stage progressive RL strategy, and step-wise GRPO directly inside the agent policy to jointly control long-term and short-term memory. On Qwen3-4B-Instruct, AgeMem attains 54.31% average performance across ALFWorld, SciWorld, PDDL, BabyAI, and HotpotQA, exceeding the best baseline A-Mem at 45.74%.

Agent Memory

AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations

Cheng Jiayang, Dongyu Ru et al.

2026

AMemGym combines Structured Data Generation, On-Policy Interaction, Evaluation Metrics, and Meta-Evaluation to script user state trajectories, drive LLM-simulated role-play, and score write–read–utilization behavior. On AMemGym’s base configuration, AWE-(2,4,30) reaches a 0.291 normalized memory score on interactive evaluation, while native gpt-4.1-mini only achieves 0.203, exposing substantial gaps between memory agents and plain long-context LLMs.

Agent Memory

AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

Emmanuel Bamidele

· 2026

AMV-L manages agent memory using a Memory Value Model, Tiered Lifecycle, Bounded Retrieval Path, and Lifecycle Manager to decouple retention from retrieval eligibility. Under a 70k-request long-running workload, AMV-L improves throughput from 9.027 to 36.977 req/s over TTL and reduces p99 latency from 5398.167 ms to 1233.430 ms while matching LRU’s retrieval quality.
