In search of dispersed memories: Generative diffusion models are associative memory networks

Authors: Luca Ambrogioni

2023

TL;DR

In search of dispersed memories shows that generative diffusion dynamics implement a modern Hopfield energy function, yielding associative memory behavior that matches modern Hopfield outputs at Pearson correlations up to 0.996.

THE PROBLEM

Associative memory lacks a unified long-term mechanism

Classical Hopfield networks have storage capacity scaling only as D/(4 log₂ D), limiting long-term memory and associative recall in high dimensions.

Modern Hopfield networks increase capacity, but they require every pattern to be stored explicitly, so synaptic learning no longer explains how biological systems encode and consolidate memories.
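
A minimal Python sketch makes the scaling concrete (illustrative arithmetic only, using the D/(4 log₂ D) estimate quoted above; the exact constant varies across analyses):

    import math

    # Classical Hopfield capacity estimate quoted above: D / (4 * log2(D)).
    for D in (100, 1_000, 10_000):
        capacity = D / (4 * math.log2(D))
        print(f"D = {D:>6}: ~{capacity:.0f} reliably storable patterns")

    # Prints roughly 4, 25, and 188 patterns: capacity grows far slower than D.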

HOW IT WORKS

In search of dispersed memories — diffusion as associative memory

In search of dispersed memories links associative memory networks, modern Hopfield networks, generative diffusion models, and the denoising loss into one energy-based framework for memory.

Think of generative diffusion models as a noisy hippocampus: forward diffusion corrupts patterns, while reverse denoising plays the role of attractor dynamics in a Hopfield-like energy landscape.

This connection lets In search of dispersed memories use diffusion score networks to implement modern Hopfield energy minimization, enabling probabilistic recall and continuous semantic manifolds rather than only discrete fixed-point attractors.
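
Concretely, the bridge runs through the score function. Here is a minimal derivation, assuming the data distribution is an equal-weight Gaussian mixture centered on stored patterns ξ_1, …, ξ_N with noise scale σ_t, and using the standard softmax energy of Ramsauer et al. (2020), which need not match the paper's exact notation:

    u_MH(x, β) = −(1/β) log Σ_{i=1}^{N} exp(β ξ_iᵀ x)

    ∇_x log p_t(x) = (1/σ_t²) ( Σ_{i=1}^{N} softmax_i(−‖x − ξ_i‖² / 2σ_t²) ξ_i − x )

Each reverse step is therefore a softmax-weighted move toward the stored patterns with effective inverse temperature β(t) ∝ 1/σ_t²; as the noise is annealed away, β(t) → ∞ and the dynamics sharpen into Hopfield-style attractors.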

DIAGRAM

Denoising and recall flow in In search of dispersed memories

This diagram shows how In search of dispersed memories runs the reverse diffusion dynamics to perform associative recall from corrupted inputs.

DIAGRAM

Evaluation pipeline for associative memory experiments

This diagram shows how In search of dispersed memories evaluates denoising, completion, and capacity for diffusion and Hopfield models.

PROCESS

How In search of dispersed memories handles associative recall

  1. Associative memory networks

    In search of dispersed memories starts from associative memory networks, encoding patterns as fixed points via the Hopfield energy u_H(x) = xᵀWx and Hebbian learning.

  2. Generative diffusion models

    In search of dispersed memories defines forward noise injection x(t − dt) = x(t) + σ√dt δ(t) and reverse dynamics driven by the score ∇_x log p_t(x), which is estimated with a denoising loss.

  3. The equivalence between diffusion models and modern Hopfield networks

    In search of dispersed memories proves that the diffusion energy u_DM(x, t) asymptotically matches the modern Hopfield energy u_MH(x, β), sharing fixed points and storage capacity (see the numerical sketch after this list).

  4. Encoding memories by denoising neural state

    In search of dispersed memories uses the denoising loss L(W) and SGD updates W_{t+1} = W_t − η (∂s/∂W)(δ(t) − s) to encode associative dynamics in synaptic weights.
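
The following numpy sketch makes the recall side of steps 2 and 3 concrete. It is a minimal illustration rather than the paper's code: the "exact" diffusion model is the closed-form Gaussian-mixture score, the reverse dynamics use deterministic Tweedie posterior-mean steps, the modern Hopfield update is the standard softmax rule, and the pattern count, noise schedule, and β are arbitrary choices:

    import numpy as np

    rng = np.random.default_rng(0)
    D, N = 10, 10                                    # dimension and pattern count, as in the experiments
    patterns = rng.choice([-1.0, 1.0], size=(N, D))  # binary stored patterns xi_i

    def exact_score(x, sigma):
        # Closed-form score of an equal-weight Gaussian mixture centered on the
        # patterns: the "exact diffusion model" used as a reference.
        logits = -np.sum((patterns - x) ** 2, axis=1) / (2 * sigma ** 2)
        w = np.exp(logits - logits.max())
        w /= w.sum()
        return (w @ patterns - x) / sigma ** 2

    def diffusion_recall(x, sigmas):
        # Deterministic denoising: each step applies Tweedie's posterior mean,
        # E[x0 | x] = x + sigma^2 * score(x), while annealing sigma toward zero.
        for sigma in sigmas:
            x = x + sigma ** 2 * exact_score(x, sigma)
        return x

    def modern_hopfield_recall(x, beta=4.0, iters=20):
        # Softmax (modern Hopfield) update: x <- sum_i softmax_i(beta xi_i . x) xi_i
        for _ in range(iters):
            logits = beta * (patterns @ x)
            w = np.exp(logits - logits.max())
            w /= w.sum()
            x = w @ patterns
        return x

    query = patterns[0] + 0.6 * rng.standard_normal(D)   # corrupted memory cue
    x_dm = diffusion_recall(query.copy(), np.geomspace(1.0, 0.05, 30))
    x_mh = modern_hopfield_recall(query.copy())
    print(f"Pearson r between the two recalls: {np.corrcoef(x_dm, x_mh)[0, 1]:.3f}")

Both recalls settle on the same stored pattern, mirroring the near-perfect correlations reported in the results below.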

KEY CONTRIBUTIONS

Key Contributions

  • The equivalence between diffusion models and modern Hopfield networks

    In search of dispersed memories shows that the diffusion energy u_DM(x, t) becomes identical to the modern Hopfield energy u_MH(x, β(t)) as β(t) → ∞, yielding the same fixed points and capacity.

  • Encoding memories by denoising neural state

    In search of dispersed memories interprets the denoising loss L(W) = ½ E_{y,t} E_{x(t)|y} ‖δ(t) − s(x(t), t; W)‖² as a synaptic learning rule that stores associative dynamics in deep network weights (see the training sketch after this list).

  • Beyond classical associative memories

    In search of dispersed memories generalizes associative memory to probabilistic recall and higher-dimensional memory structures, modeling semantic, episodic, and reconstructive memory within one diffusion framework.
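
As a toy illustration of the second contribution, the denoising loss can be minimized by plain SGD even for a single linear map s(x) = W x, a deliberately tiny stand-in for the paper's deep score networks (the patterns, σ, η, and step count here are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    D, N = 10, 10
    patterns = rng.choice([-1.0, 1.0], size=(N, D))

    W = np.zeros((D, D))          # "synaptic" weights of a linear noise predictor s(x) = W x
    eta, sigma = 1e-3, 0.5

    for _ in range(20_000):
        y = patterns[rng.integers(N)]     # sample a stored pattern y
        delta = rng.standard_normal(D)    # noise direction delta(t)
        x_t = y + sigma * delta           # forward-noised state x(t)
        s = W @ x_t                       # predicted noise
        # SGD on L(W) = 1/2 ||delta - s||^2: the gradient in W is -(delta - s) x_t^T,
        # so each update nudges the prediction toward the injected noise.
        W += eta * np.outer(delta - s, x_t)

    # One denoising step then estimates the clean state: y_hat = x(t) - sigma * s(x(t)).
    x_noisy = patterns[0] + sigma * rng.standard_normal(D)
    y_hat = x_noisy - sigma * (W @ x_noisy)
    print(f"correlation with the stored pattern: {np.corrcoef(y_hat, patterns[0])[0, 1]:.2f}")

After training, the associative dynamics live entirely in W: no pattern is ever stored explicitly, which is the paper's answer to the explicit-storage objection against modern Hopfield networks.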

RESULTS

By the Numbers

Task         Patterns   Pearson r vs modern Hopfield   Gain over classical Hopfield
Denoising    10         0.995                          +0.263
Denoising    30         0.991                          +0.276
Completion   10         0.996                          +0.255
Completion   30         0.989                          +0.289

In search of dispersed memories evaluates on binary denoising and completion tasks with 10–30 patterns in 10 dimensions, showing that diffusion outputs almost perfectly match modern Hopfield iterations while classical Hopfield networks lag by 0.255–0.289 in correlation.
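
The completion half of this evaluation can be sketched the same way, again with the closed-form mixture score; clamping the observed entries is one standard way to condition a diffusion sampler and need not be the paper's exact protocol (recovery can also fail when two stored patterns share the observed half):

    import numpy as np

    rng = np.random.default_rng(2)
    D, N = 10, 10
    patterns = rng.choice([-1.0, 1.0], size=(N, D))

    def posterior_mean(x, sigma):
        # Tweedie denoiser for an equal-weight Gaussian mixture over the patterns.
        logits = -np.sum((patterns - x) ** 2, axis=1) / (2 * sigma ** 2)
        w = np.exp(logits - logits.max())
        w /= w.sum()
        return w @ patterns

    observed = np.arange(D) < D // 2            # first half of pattern 0 is given
    x = np.where(observed, patterns[0], 0.0)
    for sigma in np.geomspace(1.0, 0.05, 30):
        x = posterior_mean(x, sigma)
        x[observed] = patterns[0, observed]     # clamp the observed entries
    print("recovered hidden half:", np.sign(x[~observed]))
    print("true hidden half:     ", patterns[0, ~observed])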

BENCHMARK

Pearson correlation between modern Hopfield output and other models

Correlation with modern Hopfield iterations on binary denoising (10 patterns, dimension 10).

KEY INSIGHT

The Counterintuitive Finding

In search of dispersed memories finds that exact diffusion models reach Pearson correlations up to 0.996 with modern Hopfield outputs, despite using stochastic denoising dynamics.

This is surprising because one might expect stochastic diffusion trajectories to deviate substantially from deterministic Hopfield iterations, yet their fixed points and recall behavior are almost identical.
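
A quick numerical check of this point (same closed-form mixture score as in the earlier sketches; the annealed Langevin step sizes are arbitrary): even with noise injected at every step, the trajectory settles into the basin of the stored pattern.

    import numpy as np

    rng = np.random.default_rng(3)
    D, N = 10, 10
    patterns = rng.choice([-1.0, 1.0], size=(N, D))

    def score(x, sigma):
        # Score of an equal-weight Gaussian mixture centered on the patterns.
        logits = -np.sum((patterns - x) ** 2, axis=1) / (2 * sigma ** 2)
        w = np.exp(logits - logits.max())
        w /= w.sum()
        return (w @ patterns - x) / sigma ** 2

    x = patterns[0] + 0.3 * rng.standard_normal(D)   # corrupted cue
    # Annealed Langevin dynamics: stochastic score ascent whose injected noise
    # shrinks with sigma, so trajectories wander yet end in a sharp attractor.
    for sigma in np.geomspace(0.5, 0.02, 150):
        eps = 0.2 * sigma ** 2
        x = x + eps * score(x, sigma) + np.sqrt(2 * eps) * rng.standard_normal(D)

    print("stochastic recall matches the stored pattern:",
          bool(np.array_equal(np.sign(x), patterns[0])))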

WHY IT MATTERS

What this unlocks for the field

In search of dispersed memories shows that generative diffusion models can serve simultaneously as associative memories, probabilistic recall mechanisms, and semantic manifold learners.

This lets builders design systems where creative generation, episodic recall, and reconstructive memory all emerge from a single diffusion-based energy landscape instead of separate, hand-engineered memory modules.

Related papers

Agent Memory · Long-Term Memory

Adaptive Memory Admission Control for LLM Agents

Guilin Zhang, Wei Jiang et al. · 2026

A-MAC scores candidate memories using Utility, Confidence, Novelty, Recency, and Type Prior signals, combined by a learned linear admission policy (Algorithm 1: A-MAC Memory Admission). On the LoCoMo benchmark, A-MAC achieves F1 0.583 and 2644 ms latency, improving F1 by 0.042 and reducing latency by 1187 ms compared to A-mem.

Long-Term Memory

Advancing Open-source World Models

Robbyant Team, Zelin Gao et al. · arXiv 2026

LingBot-World combines a Data Engine, Fundamental World Model, Action-Conditioned World Model, and Post-Training causal adaptation to turn a 28B-parameter video generator into a real-time interactive world simulator. On the VBench benchmark, LingBot-World achieves a dynamic degree of 0.8857 versus 0.7612 for Yume-1.5, while also improving imaging quality to 0.6683.

Benchmark · Long-Term Memory

AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs

Manoj Madushanka Perera, Adnan Mahmood et al. · 2026

AgenticAI-DialogGen chains ChatPreprocessor, KnowledgeExtractor, TopicAnalyzer, KnowledgeGraphBuilder, PersonaGenerator, DuelingChat Agent, ConversationValidator, ConversationRefiner, QAGeneration, and PostProcessing to turn raw multi-session chats into topic-guided, persona-grounded conversations with explicit short- and long-term memories. On the TGC / KG memory QA benchmark, Mistral-7B fine-tuned within AgenticAI-DialogGen achieves 87.36 F1, compared to GPT-4’s 83.77 F1 in a zero-shot setting on the same task.
