Auxiliary-predicted Compress Memory Model (ApCM Model): A Neural Memory Storage Model Based on Invertible Compression and Learnable Prediction

Author: Weinuo Ou

2026

TL;DR

Auxiliary-predicted Compress Memory Model (ApCM Model) uses invertible compression plus an auxiliary predictor to match PCA-level compression while modeling nonlinear data with lower reconstruction error.

THE PROBLEM

LLMs lack runtime memory for dynamic personalized interaction

Auxiliary-predicted Compress Memory Model (ApCM Model) targets a gap in current artificial intelligence systems: Large Language Models generally lack effective runtime memory mechanisms.

Without runtime memory, Large Language Models struggle with long-context understanding and dynamic knowledge updates, limiting personalized interaction and continual learning.

HOW IT WORKS

Invertible Dimensionality Reduction and Predictor (IDRP)

Auxiliary-predicted Compress Memory Model (ApCM Model) centers on Invertible Dimensionality Reduction and Predictor (IDRP), combining an invertible network encoder, latent space decomposition, and a learnable auxiliary predictor. IDRP feeds compressed codes into a global Memory Bank managed by a Memory Read-Write Controller.

You can think of ApCM Model as a brain-inspired RAM-plus-disk system, where invertible coupling layers act as a reversible compressor and the auxiliary predictor fills in missing details like a smart cache.

This key design lets ApCM Model store only zcomp while reconstructing via predicted zaux, enabling lossy yet trainable memory beyond what a plain context window or static PCA compression can provide.
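
To make this concrete, below is a minimal PyTorch sketch of IDRP as described here and in the process steps further down: an invertible encoder built from affine coupling layers with fixed permutations, a split of the latent z into zcomp and zaux, and a small predictor gϕ that estimates zaux from zcomp. All class names, layer widths, and depths are illustrative assumptions rather than the authors' released implementation, and the paper's SwiGLU gate is simplified to a SiLU activation.

```python
# Minimal sketch of IDRP (Invertible Dimensionality Reduction and Predictor).
# Names, widths, and depths are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One affine coupling layer: rescales and shifts half the features
    conditioned on the other half, so the mapping inverts exactly."""
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(x1).chunk(2, dim=-1)
        return torch.cat([x1, x2 * torch.exp(log_s) + t], dim=-1)

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(y1).chunk(2, dim=-1)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=-1)

class IDRP(nn.Module):
    """Invertible encoder + latent split + auxiliary predictor g_phi."""
    def __init__(self, dim=1024, comp_dim=128, n_layers=4):
        super().__init__()
        self.comp_dim = comp_dim
        self.layers = nn.ModuleList([AffineCoupling(dim) for _ in range(n_layers)])
        # fixed channel permutations between coupling layers (plain tensors for brevity)
        self.perms = [torch.randperm(dim) for _ in range(n_layers)]
        # auxiliary predictor g_phi; the paper's SwiGLU gate is simplified to SiLU here
        self.predictor = nn.Sequential(
            nn.Linear(comp_dim, 512), nn.SiLU(),
            nn.Linear(512, dim - comp_dim),
        )

    def encode(self, x):
        z = x
        for layer, perm in zip(self.layers, self.perms):
            z = layer(z[:, perm])
        return z[:, :self.comp_dim], z[:, self.comp_dim:]     # (z_comp, z_aux)

    def decode(self, z_comp):
        z_aux = self.predictor(z_comp)                        # fill in the discarded part
        z = torch.cat([z_comp, z_aux], dim=-1)
        for layer, perm in zip(reversed(self.layers), reversed(self.perms)):
            z = layer.inverse(z)[:, torch.argsort(perm)]      # undo coupling, then permutation
        return z
```

Only zcomp needs to be stored; if the predictor recovers zaux well, running the coupling stack backwards reproduces x with low error, which is the optimizable lossy reconstruction the paper describes.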

DIAGRAM

Memory Read and Write Pipeline in ApCM Model

This diagram shows how Auxiliary-predicted Compress Memory Model (ApCM Model) encodes queries, retrieves from the Memory Bank, and applies the write policy based on access frequency.

DIAGRAM

Training and Evaluation Setup for ApCM Model

This diagram shows how Auxiliary-predicted Compress Memory Model (ApCM Model) is trained on synthetic or real data and evaluated against PCA and Key-Value Memory Network baselines.

PROCESS

How Auxiliary-predicted Compress Memory Model (ApCM Model) Handles a Memory Interaction Session

  1. Invertible Network Encoder

    Auxiliary-predicted Compress Memory Model (ApCM Model) uses the Invertible Network Encoder with stacked affine coupling layers and permutation layers to map input x to latent z. This guarantees exact invertibility so later reconstruction via IDRP remains lossless in the transform space.

  2. Latent Space Decomposition and Prediction

    ApCM Model splits z into zcomp and zaux, then applies the auxiliary predictor gϕ, a Linear-SwiGLU-Linear MLP, to estimate zaux from zcomp. This forces zcomp to retain enough information for accurate auxiliary prediction.

  3. Memory Read Mechanism

    ApCM Model encodes a query into zcomp, computes cosine similarity with each Memory Bank slot, and selects the highest similarity slot. The selected zcomp is then passed through IDRP to reconstruct xmem and update access frequency.

  4. Write Mechanism

    ApCM Model encodes batch inputs into zcomp vectors, averages them into z̄, and writes into the Memory Bank using a first-idle, then least-frequently-used policy. The selected slot is overwritten with z̄ and its access count is reset. A minimal sketch of the read and write mechanisms follows this list.
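
Below is a minimal sketch of the Memory Read-Write Controller behavior from steps 03 and 04, reusing the IDRP sketch given earlier. The slot count, class and variable names, and the occupancy and frequency bookkeeping are assumptions made for illustration, not the paper's implementation.

```python
# Minimal sketch of the Memory Read-Write Controller; reuses the IDRP sketch above.
# Slot count, names, and bookkeeping details are assumptions, not the paper's code.
import torch
import torch.nn.functional as F

class MemoryController:
    def __init__(self, idrp, n_slots=64, comp_dim=128):
        self.idrp = idrp
        self.bank = torch.zeros(n_slots, comp_dim)          # global Memory Bank of z_comp codes
        self.used = torch.zeros(n_slots, dtype=torch.bool)  # slot occupancy
        self.freq = torch.zeros(n_slots)                    # per-slot access frequency

    @torch.no_grad()
    def read(self, query):
        """Encode the query, pick the most similar occupied slot, reconstruct x_mem."""
        q_comp, _ = self.idrp.encode(query)                      # query -> z_comp, shape (1, comp_dim)
        sims = F.cosine_similarity(q_comp, self.bank, dim=-1)    # similarity to every slot
        sims[~self.used] = float("-inf")                         # never select an empty slot
        slot = int(sims.argmax())                                # assumes at least one occupied slot
        self.freq[slot] += 1                                     # update access frequency
        x_mem = self.idrp.decode(self.bank[slot].unsqueeze(0))   # reconstruct via IDRP
        return x_mem, slot

    @torch.no_grad()
    def write(self, batch_x):
        """Encode a batch, average the z_comp codes, write with a first-idle-then-LFU policy."""
        z_comp, _ = self.idrp.encode(batch_x)
        z_bar = z_comp.mean(dim=0)                               # batch-averaged code
        idle = (~self.used).nonzero(as_tuple=True)[0]
        if len(idle) > 0:
            slot = int(idle[0])                                  # first idle slot
        else:
            slot = int(self.freq.argmin())                       # least frequently used slot
        self.bank[slot] = z_bar
        self.used[slot] = True
        self.freq[slot] = 0                                      # reset access count
        return slot
```

Masking empty slots with -inf keeps the cosine read from returning uninitialized codes, and resetting the frequency after a write means a fresh entry must earn reads before it is protected from eviction.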

KEY CONTRIBUTIONS

Key Contributions

  • ApCM Model architecture

    Auxiliary-predicted Compress Memory Model (ApCM Model) introduces the Invertible Dimensionality Reduction and Predictor (IDRP), which splits z into zcomp and zaux and learns gϕ to predict zaux. This optimizable lossy reconstruction paradigm is trained end to end using a reconstruction loss (a minimal training sketch follows this list).

  • Memory Read-Write Controller

    ApCM Model designs a Memory Read-Write Controller with a global Memory Bank, cosine similarity based read mechanism, and access frequency based write policy. This supports content based retrieval and frequency based updates for runtime memory.

  • Comparison with Key-Value Memory Network

    ApCM Model compresses storage dimension from 1024 to 128 while achieving test MSE 0.987171 versus 1.001440 and MAE 0.765991 versus 0.798507 on random data. This demonstrates higher storage efficiency with slightly better reconstruction than the Key-Value Memory Network.
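
For the end-to-end training mentioned in the first contribution, here is a minimal training-step sketch that pairs the reconstruction loss the paper names with an auxiliary prediction term. The extra term, its equal weighting, and the optimizer choice are assumptions, not the paper's stated recipe.

```python
# Minimal end-to-end training step for the IDRP sketch above. The auxiliary term,
# its weighting, and the optimizer choice are assumptions; the paper specifies a
# reconstruction loss trained end to end.
import torch

def train_step(idrp, optimizer, x):
    z_comp, z_aux = idrp.encode(x)
    # auxiliary prediction loss: push z_comp to carry enough information about z_aux
    pred_loss = torch.mean((idrp.predictor(z_comp) - z_aux) ** 2)
    # reconstruction loss: decode from z_comp alone and compare against the input
    recon_loss = torch.mean((idrp.decode(z_comp) - x) ** 2)
    loss = recon_loss + pred_loss        # equal weighting is an assumption
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage (illustrative): optimizer = torch.optim.Adam(idrp.parameters(), lr=1e-3),
# then call train_step(idrp, optimizer, batch) over batches of 1024-dimensional data.
```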

RESULTS

By the Numbers

  • Test MSE: 0.987171 (0.014269 lower than Key-Value Memory)

  • Test MAE: 0.765991 (0.032516 lower than Key-Value Memory)

  • Storage dimension: 128 (compressed from 1024)

  • Inference time: 0.1800 s (0.1790 s slower than Key-Value Memory)

The random data comparison table evaluates Auxiliary-predicted Compress Memory Model (ApCM Model) against a Key-Value Memory Network on reconstruction error, storage size, and inference time. These results show ApCM Model trades 0.1790 seconds extra latency for 8x compression and small error reductions in MSE and MAE.

BENCHMARK

Comparison between ApCM Model and Key-Value Memory Network

Test MSE on random data for compressed versus full memory storage.

KEY INSIGHT

The Counterintuitive Finding

Auxiliary-predicted Compress Memory Model (ApCM Model) achieves test MSE 0.987171 with storage dimension 128, while the Key-Value Memory Network has MSE 1.001440 with dimension 1024.

It is surprising that ApCM Model compresses memory by 8x yet still reduces error by 0.014269, challenging the assumption that aggressive compression must worsen reconstruction.

WHY IT MATTERS

What this unlocks for the field

Auxiliary-predicted Compress Memory Model (ApCM Model) enables learnable runtime memory that stores only zcomp yet reconstructs high fidelity data via predicted zaux and invertible transforms.

Builders can now attach a compact, trainable Memory Bank to systems like LLMs, enabling dynamic long term memory and nonlinear compression without bloating context windows.

Related papers

Memory Architecture

A Control Architecture for Training-Free Memory Use

Yanzhen Lu, Muchen Jiang et al.

2026

TAG uses uncertainty-based routing to flag low-confidence steps, filters them with guarded acceptance and rollback, selects between rule and exemplar memory banks, and prunes via evidence-based retirement inside a unified control loop. On SVAMP and ASDiv, TAG reaches 81.0% and 85.2% accuracy, improving over the 74.0% and 77.5% no-memory baselines while a compute-matched Retry baseline stays flat.
