Large Associative Memory Problem in Neurobiology and Machine Learning

Authors: Dmitry Krotov, John Hopfield

arXiv 2020

TL;DR

Large Associative Memory Problem uses a bipartite feature–memory network with only two-body synapses to recover Dense Associative Memories and modern Hopfield networks, reaching superlinear or even exponential storage capacity.


THE PROBLEM

Classical Hopfield networks cap memory at about 0.14 Nf patterns

Classical associative memory with Nf feature neurons can reliably store fewer than about 0.14 Nf memories; with 1,000 feature neurons, that is only about 140 patterns, far below the demands of many real tasks.

Large Associative Memory Problem targets settings where the required number of memories, such as thousands of images or immune sequences, vastly exceeds this linear capacity, causing failures in recall and pattern completion.

HOW IT WORKS

Large Associative Memory Problem — bipartite dynamics and energy

Large Associative Memory Problem introduces coupled feature neurons and memory neurons, together with an energy function and Lagrangian functions, so that the dynamics minimize energy using only two-body synapses.

You can think of feature neurons as a cortical feature space and memory neurons as a hippocampal index, with the energy function acting like a physical potential guiding the system to stored patterns.

This construction lets Large Associative Memory Problem reproduce Dense and modern Hopfield networks, achieving superlinear or exponential capacity without explicit many-body synaptic tensors.
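
Concretely, the network obeys τ_f dv_i/dt = Σ_µ ξ_{iµ} f_µ − v_i + I_i and τ_h dh_µ/dt = Σ_i ξ_{µi} g_i − h_µ, where f = ∂L_h/∂h and g = ∂L_v/∂v. Below is a minimal NumPy sketch of these coupled equations under one particular pair of Lagrangians (a rectified-quadratic L_v and a log-sum-exp L_h); the sizes, time constants, and forward-Euler integration are illustrative assumptions, not values fixed by the paper.

```python
import numpy as np

# Minimal sketch of the bipartite feature-memory dynamics.
# Sizes, time constants, and these specific Lagrangians are assumptions.
rng = np.random.default_rng(0)
Nf, Nh = 64, 16                      # feature neurons v_i, memory neurons h_mu
xi = rng.standard_normal((Nh, Nf))   # two-body synapses xi_{mu i}, shared both ways
tau_f, tau_h, dt = 1.0, 0.1, 0.01

def g(v):                            # g = dL_v/dv with L_v = sum relu(v)^2 / 2
    return np.maximum(v, 0.0)

def f(h):                            # f = dL_h/dh with L_h = log sum exp(h)
    e = np.exp(h - h.max())
    return e / e.sum()

def energy(v, h, I=0.0):             # E = (v - I).g - L_v + h.f - L_h - f.xi.g
    gv, fh = g(v), f(h)
    L_v = 0.5 * np.sum(gv**2)
    L_h = np.log(np.sum(np.exp(h - h.max()))) + h.max()
    return ((v - I) @ gv - L_v) + (h @ fh - L_h) - fh @ (xi @ gv)

v = rng.standard_normal(Nf)          # initial (e.g., corrupted) feature state
h = np.zeros(Nh)
for _ in range(2000):                # forward-Euler integration of the coupled ODEs
    v += (dt / tau_f) * (xi.T @ f(h) - v)
    h += (dt / tau_h) * (xi @ g(v) - h)
# energy(v, h) decreases along this trajectory, up to discretization error
```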

DIAGRAM

Effective feature neuron dynamics after integrating out memory neurons

This diagram shows how Large Associative Memory Problem eliminates hidden neurons to yield Dense Associative Memory, modern Hopfield, and spherical memory limits.

DIAGRAM

Energy minimization and convergence in Large Associative Memory Problem

This diagram shows how Large Associative Memory Problem uses Lagrangian Hessians and time constants to guarantee monotonic energy decrease.
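
The underlying identity can be stated compactly (a restatement in generic notation, not the paper's exact display): along the coupled dynamics,

```latex
\frac{dE}{dt}
  = -\,\tau_f \sum_{i,j} \dot{v}_i \, \frac{\partial^2 L_v}{\partial v_i \, \partial v_j} \, \dot{v}_j
    \;-\; \tau_h \sum_{\mu,\nu} \dot{h}_\mu \, \frac{\partial^2 L_h}{\partial h_\mu \, \partial h_\nu} \, \dot{h}_\nu
  \;\le\; 0,
```

so positive semidefinite Hessians of the two Lagrangians make the energy non-increasing, and a bounded energy then forces the trajectory to a fixed point.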

PROCESS

How Large Associative Memory Problem Handles an Associative Recall

  1. 01

    Temporal evolution of two groups of neurons

    Large Associative Memory Problem initializes feature neurons v_i and memory neurons h_µ, then evolves them via coupled differential equations with time constants τ_f and τ_h.

  2. 02

    Energy function for the network

    Large Associative Memory Problem evaluates the energy at each instant from the Lagrangian functions L_v and L_h and the interaction term Σ_{µ,i} f_µ ξ_{µi} g_i.

  3. 03

    Model A Dense Associative Memory limit

    In the τ_h → 0 limit with an additive Lagrangian L_h = Σ_µ F(h_µ), Large Associative Memory Problem integrates out h_µ to recover the Dense Associative Memory energy for different choices of F(x).

  4. 04

    Model B modern Hopfield networks limit

    With the contrastive-normalization Lagrangian L_h = log Σ_µ e^{h_µ}, Large Associative Memory Problem yields softmax-based feature updates equivalent to dot-product attention; see the sketch after this list.
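
A hypothetical NumPy sketch of the Model B limit: once memory neurons equilibrate (τ_h → 0) under L_h = log Σ_µ e^{h_µ}, the effective feature update takes the softmax, attention-like form v ← ξᵀ softmax(β ξ v). The patterns, inverse temperature β, and iteration count are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())          # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(1)
Nh, Nf = 32, 64
xi = rng.choice([-1.0, 1.0], size=(Nh, Nf))   # Nh stored patterns as rows

beta = 8.0                                    # softmax sharpness
v = xi[0] + 0.5 * rng.standard_normal(Nf)     # noisy cue for pattern 0
for _ in range(5):
    v = xi.T @ softmax(beta * (xi @ v))       # v <- xi^T softmax(beta xi v)

print(np.corrcoef(v, xi[0])[0, 1])            # ~1.0: pattern 0 recalled
```

With β large enough, a single step already snaps the noisy cue onto the nearest stored pattern, which mirrors the dot-product attention update of modern Hopfield networks.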

KEY CONTRIBUTIONS

Key Contributions

  • 01

    General dynamical system and energy function

    Large Associative Memory Problem defines coupled feature neurons, memory neurons, and an energy function whose Lagrangian functions guarantee dE/dt ≤ 0 whenever their Hessians are positive semidefinite.

  • 02

    Dense Associative Memory and modern Hopfield limits

    Large Associative Memory Problem shows that Model A reproduces Dense Associative Memories with N_mem ∼ min(N_f^{n−1}, N_h), while Model B recovers modern Hopfield networks and attention.

  • 03

    Spherical Memory model

    Large Associative Memory Problem introduces Model C, the Spherical Memory with divisive normalization g_i = v_i / sqrt(Σ_j v_j²), a new symmetric associative memory family; a small sketch follows this list.
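
A minimal sketch of the divisive normalization at the heart of Model C, assuming the feature Lagrangian L_v = sqrt(Σ_j v_j²), whose gradient gives g_i = v_i / ||v||; the example vector is arbitrary.

```python
import numpy as np

def g_spherical(v, eps=1e-12):
    # g_i = v_i / sqrt(sum_j v_j^2): divisive normalization onto the unit sphere
    return v / (np.linalg.norm(v) + eps)

v = np.array([3.0, 4.0])
print(g_spherical(v))                   # [0.6 0.8]: direction kept, norm = 1
print(np.linalg.norm(g_spherical(v)))   # 1.0
```

Because every feature state is read out through this unit-norm vector, the effective dynamics live on a sphere, hence the name.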

RESULTS

By the Numbers

  • Classical capacity bound: about 0.14 Nf memories (the baseline Hopfield limit, versus superlinear or exponential capacity in Large Associative Memory Problem)

  • Dense Memory capacity: Nf^{n−1} memories (versus the classical 0.14 Nf, using the power Lagrangian F(x) = x^n)

  • Exponential capacity case: memories on the exp(Nf) scale (using F(x) = exp(x), as in Demircigil et al. 2017)

  • Hidden neuron bound: N_mem ≤ N_h (capacity is also limited by the number of memory neurons)

Large Associative Memory Problem analyzes storage capacity using theoretical bounds rather than benchmarks, showing that appropriate Lagrangian choices yield N_mem ∼ min(N_f^{n−1}, N_h) or exponential in N_f, compared to the classical Hopfield limit of less than 0.14 N_f memories.
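
For a feel for these scalings, here is a back-of-envelope comparison in Python; constants and logarithmic factors are omitted, and n = 4 with Nf = 1000 are illustrative choices rather than values from the paper.

```python
import math

Nf, n = 1000, 4
classical = 0.14 * Nf                 # classical Hopfield: ~140 patterns
dense = Nf ** (n - 1)                 # F(x) = x^n: ~1e9 patterns (if N_h allows)
# exp(Nf) overflows a float, so report its order of magnitude instead:
exp_digits = Nf * math.log10(math.e)  # exp(1000) has ~435 decimal digits
print(classical, dense, round(exp_digits))
```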

BENCHMARK

Storage capacity scaling across associative memory models

Asymptotic number of storable memories as a function of feature dimension Nf.

KEY INSIGHT

The Counterintuitive Finding

Large Associative Memory Problem shows that Dense Associative Memories can store N_mem ∼ N_f^{n−1} patterns, or even exponentially many in N_f, using only two-body synapses.

This challenges the intuition that biological plausibility, specifically forbidding many-body synapses, necessarily forces associative memories to have only O(N_f) storage capacity.

WHY IT MATTERS

What this unlocks for the field

Large Associative Memory Problem provides a unifying, energy based view of Dense, modern, and spherical Hopfield networks with explicit hidden circuitry.

Builders can now design recurrent architectures with attention-like updates and huge memory capacity while keeping a clear mapping to biologically plausible two-body synaptic structures.


