Empowering Working Memory for Large Language Model Agents

Authors: Jing Guo, Nan Li, Jianchuan Qi et al.

2023

TL;DR

Empowering Working Memory for Large Language Model Agents adds a centralized Working Memory Hub plus an Episodic Buffer to unify cross-episode memory without changing LLM internals.


THE PROBLEM

LLM agents lack episodic continuity across isolated interaction domains

Empowering Working Memory for Large Language Model Agents highlights that standard LLM agents treat each interaction as an isolated episode with no linkage between sequential dialogues.

This fragmented setup means multi-step reasoning and multi-agent systems cannot reuse prior experiences, so complex sequential reasoning and collaboration remain brittle and short-sighted.

HOW IT WORKS

Working Memory Hub and Episodic Buffer for LLM agents

Empowering Working Memory for Large Language Model Agents centers on a Working Memory Hub that connects the Central Processor, Interaction History Window, Episodic Buffer, and External Environment Interface.

You can think of the Working Memory Hub as shared RAM plus disk, while the Episodic Buffer acts like a hippocampus that stores whole episodes for later recall.

This central hub lets Empowering Working Memory for Large Language Model Agents maintain persistent, queryable memories across sessions, something a plain context window with token limits cannot achieve.
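To make the division of labor concrete, here is a minimal Python sketch of the hub-and-buffer split. The component names follow the paper; the Episode class, the method names, and the in-memory storage are illustrative assumptions, not the authors' code.

from dataclasses import dataclass


@dataclass
class Episode:
    """A complete interaction episode kept for later recall."""
    episode_id: str
    turns: list[dict]  # e.g. [{"role": "user", "content": "..."}]
    summary: str = ""


class WorkingMemoryHub:
    """Central store that the Central Processor, Interaction History
    Window, and Episodic Buffer all read from and write to."""

    def __init__(self) -> None:
        self._log: list[dict] = []               # persistent record of all I/O
        self._episodes: dict[str, Episode] = {}  # whole episodes, by id

    def record(self, turn: dict) -> None:
        """Persist a single input or output turn."""
        self._log.append(turn)

    def commit_episode(self, episode: Episode) -> None:
        """Archive a finished episode for cross-session recall."""
        self._episodes[episode.episode_id] = episode

    def recall(self, episode_id: str) -> Episode | None:
        """Fetch a whole past episode, even from another session."""
        return self._episodes.get(episode_id)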

DIAGRAM

Cross-episode interaction and memory flow

This diagram shows how the proposed architecture routes user interactions through the Working Memory Hub to maintain history and episodic traces across sessions.

DIAGRAM

Memory access and retrieval mechanisms in multi-agent systems

This diagram shows how the framework structures role-based, task-based, autonomous, and collaboration-based memory access on top of the Memory Management Agent.
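These access mechanisms can be pictured as a gatekeeper in front of the episodic buffer. The sketch below is a hypothetical illustration: the policy names mirror the paper's categories, but the gating predicate, the "planner" role, and the stub hub are invented for the example.

from enum import Enum


class AccessPolicy(Enum):
    ROLE_BASED = "role"
    TASK_BASED = "task"
    AUTONOMOUS = "autonomous"
    COLLABORATION_BASED = "collaboration"


class StubHub:
    """Stand-in for the Working Memory Hub's search interface."""
    def search(self, query: str) -> list[str]:
        return ["episode matching " + repr(query)]


class MemoryManagementAgent:
    """Mediates every read of the Episodic Buffer by other agents."""

    def __init__(self, hub: StubHub) -> None:
        self.hub = hub

    def retrieve(self, requester: dict, policy: AccessPolicy, query: str) -> list[str]:
        # Gate the request before touching shared memory. Only the
        # role-based branch is spelled out here; the task-based,
        # autonomous, and collaboration-based policies would apply
        # their own predicates in the same place.
        if policy is AccessPolicy.ROLE_BASED and requester.get("role") != "planner":
            return []  # requester's role is not cleared for these episodes
        return self.hub.search(query)


mma = MemoryManagementAgent(StubHub())
print(mma.retrieve({"role": "planner"}, AccessPolicy.ROLE_BASED, "budget plan"))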

PROCESS

How Empowering Working Memory for Large Language Model Agents Handles an Interaction Session

  1. 01

    External Environment Interface

    The External Environment Interface ingests real-time user inputs and external data before passing them onward.

  2. 02

    Working Memory Hub

    The Working Memory Hub stores all inputs and outputs, creating a persistent record that other components can query.

  3. 03

    Interaction History Window

    The Interaction History Window reads from the Working Memory Hub to maintain a short-term cache, such as rolling windows or abstractive summaries, for immediate context.

  4. 04

    Episodic Buffer

    The Episodic Buffer retrieves complete episodes from the Working Memory Hub to support long-term episodic recall during new tasks, as sketched after this list.
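Putting the four steps together, a single turn might look like the following self-contained sketch. The rolling-window cache and keyword-overlap recall are stand-ins chosen for brevity, and the model call is stubbed out; the paper leaves the concrete retrieval method open.

log: list[dict] = []             # Working Memory Hub: persistent I/O record
episodes: list[list[dict]] = []  # Episodic Buffer: whole past episodes


def interaction_history_window(k: int = 10) -> list[dict]:
    """Short-term cache read from the hub (here, a rolling window)."""
    return log[-k:]


def episodic_recall(query: str, top_n: int = 2) -> list[list[dict]]:
    """Rank stored episodes by naive keyword overlap with the query."""
    scored = sorted(
        episodes,
        key=lambda ep: sum(query in turn["content"] for turn in ep),
        reverse=True,
    )
    return scored[:top_n]


def run_turn(user_input: str) -> str:
    # Steps 01/02: the environment interface ingests input; the hub persists it.
    log.append({"role": "user", "content": user_input})
    # Steps 03/04: short-term context plus recalled whole episodes.
    context = interaction_history_window() + sum(episodic_recall(user_input), [])
    reply = f"(LLM reply conditioned on {len(context)} memory items)"  # model stand-in
    log.append({"role": "assistant", "content": reply})
    return reply


print(run_turn("Plan the next sprint"))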

KEY CONTRIBUTIONS

Key Contributions

  • 01

    An advanced working memory model for LLM agents

    Empowering Working Memory for Large Language Model Agents proposes a centralized Working Memory Hub plus Episodic Buffer, Interaction History Window, Central Processor, and External Environment Interface to mirror cognitive psychology models.

  • 02

    Technical pathways for a memory hub

    The paper outlines how third-party databases and platforms such as Postgres, Elasticsearch, Pinecone, and Xata can implement the Working Memory Hub (a sketch follows this list).

  • 03

    Memory access mechanisms in multi-agent systems

    The paper details role-based, task-based, autonomous, and collaboration-scenario-based access, plus a Memory Management Agent for efficient episodic buffer usage.
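As a hypothetical instance of the database pathway in contribution 02, the sketch below backs the hub with SQLite so it runs without a server; the paper's suggested platforms are Postgres, Elasticsearch, Pinecone, and Xata, and the schema here is an assumption.

import sqlite3

# In-memory SQLite stands in for a hosted store; the schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memory_hub ("
    "episode_id TEXT, turn_index INTEGER, role TEXT, content TEXT)"
)


def record(episode_id: str, turn_index: int, role: str, content: str) -> None:
    """Append one turn of an episode to the hub."""
    conn.execute(
        "INSERT INTO memory_hub VALUES (?, ?, ?, ?)",
        (episode_id, turn_index, role, content),
    )


def recall_episode(episode_id: str) -> list[tuple]:
    """Retrieve a whole episode in order, across sessions."""
    return conn.execute(
        "SELECT role, content FROM memory_hub "
        "WHERE episode_id = ? ORDER BY turn_index",
        (episode_id,),
    ).fetchall()


record("ep-1", 0, "user", "Book a flight to Oslo")
record("ep-1", 1, "assistant", "Booked for Friday")
print(recall_episode("ep-1"))  # [('user', ...), ('assistant', ...)]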

RESULTS

By the Numbers

Quantitative benchmarks

N/A

No empirical results or baselines are reported for Empowering Working Memory for Large Language Model Agents

Memory components

4 components

Central Processor, Interaction History Window, Working Memory Hub, Episodic Buffer

Memory access strategies

5 strategies

Role-based, task-based, autonomous, collaboration-scenario-based, Memory Management Agent

Search mechanisms

3 methods

SQL search, full-text search, semantic search

Empowering Working Memory for Large Language Model Agents is a conceptual and architectural proposal without benchmark datasets, focusing instead on defining components and mechanisms for LLM working memory; the sketch below contrasts its three retrieval methods.
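In this toy sketch, the SQL variant is an exact keyed lookup, the full-text variant is keyword matching, and the semantic variant uses bag-of-words cosine similarity as a stand-in for real embeddings; all helper names and the sample episodes are assumptions.

import math
from collections import Counter

# Toy episode store: id -> text. A real system would index the hub.
docs = {
    "ep-1": "user asked to book a flight to Oslo",
    "ep-2": "user requested a summary of quarterly sales",
}


def sql_search(episode_id: str) -> str | None:
    # Exact, structured lookup (think WHERE episode_id = ?).
    return docs.get(episode_id)


def fulltext_search(term: str) -> list[str]:
    # Keyword match, as a full-text index would provide.
    return [eid for eid, text in docs.items() if term in text]


def _vec(text: str) -> Counter:
    return Counter(text.split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def semantic_search(query: str) -> str:
    # Nearest episode by (toy) embedding similarity.
    return max(docs, key=lambda eid: _cosine(_vec(query), _vec(docs[eid])))


print(semantic_search("flight booking"))  # -> "ep-1"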


BENCHMARK

Comparison of memory-related mechanisms described in Empowering Working Memory for Large Language Model Agents

Count of distinct mechanisms or components defined for memory architecture and retrieval.

KEY INSIGHT

The Counterintuitive Finding

Empowering Working Memory for Large Language Model Agents argues that simply extending context windows, as in RecurrentGPT or long-term memory schemes, does not change the working memory model itself.

This is counterintuitive because many practitioners assume larger token limits alone solve memory, but the paper argues that architecture and episodic structure matter more.

WHY IT MATTERS

What this unlocks for the field

Empowering Working Memory for Large Language Model Agents unlocks a path to LLM agents that maintain persistent, structured episodic memories across tasks and multi-agent collaborations.

Builders can now design agents around a Working Memory Hub and Episodic Buffer, integrating databases and access policies instead of relying solely on prompt engineering and raw context windows.


Related papers

Benchmark · Agent Memory

Active Context Compression: Autonomous Memory Management in LLM Agents

Nikhil Verma · 2026

Focus Agent adds start_focus, complete_focus, a persistent Knowledge block, and an optimized Persistent Bash plus String-Replace Editor scaffold to actively compress context during long software-engineering tasks. On five hard SWE-bench Lite instances against a Baseline ReAct agent, Focus Agent achieves 22.7% token reduction (14.9M → 11.5M) while matching 3/5 = 60% task success.
