Progressive Memory Banks for Incremental Domain Adaptation

Authors: Nabiha Asghar, Lili Mou, Kira A. Selby et al.

arXiv 2018

TL;DR

Progressive Memory Banks for Incremental Domain Adaptation augments RNNs with a progressively expanded key-value memory bank, reaching 67.55% accuracy on the Fic source domain after adapting Fic→Gov, versus 65.62% for plain fine-tuning.


THE PROBLEM

Incremental adaptation forgets previous domains in MultiNLI

Fine-tuning a BiLSTM on the MultiNLI Gov domain after Fic leaves 65.62% accuracy on Fic, versus 66.02% for multi-task training, a symptom of catastrophic forgetting.

When domain labels are unavailable at test time, a single network must serve every domain; fine-tuning it incrementally degrades performance on earlier domains, harming robustness in changing environments.

HOW IT WORKS

Progressive memory banks for incremental domain adaptation

The method augments a BiLSTM with a directly parameterized key-value memory bank and an attention-based memory mechanism that feeds the retrieved content into the RNN transition at every timestep.

You can think of the approach as adding a pluggable RAM module next to the recurrent core, instead of rewiring the CPU registers every time a new domain appears.

By progressively expanding memory slots rather than hidden states, the method learns new domains while disturbing existing RNN representations far less than a plain context window or naive fine-tuning would.
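
To make the mechanism concrete, here is a minimal sketch in PyTorch. This is not the authors' code: the class name, the GRU cell standing in for the paper's BiLSTM transition, and all sizes are our illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryAugmentedRNNCell(nn.Module):
    """Minimal sketch of an RNN cell with a directly parameterized
    key-value memory bank (names and sizes are illustrative)."""

    def __init__(self, input_size: int, hidden_size: int, num_slots: int):
        super().__init__()
        # Memory is ordinary trainable parameters: keys for addressing,
        # values for the content that gets fed into the transition.
        self.keys = nn.Parameter(0.1 * torch.randn(num_slots, hidden_size))
        self.values = nn.Parameter(0.1 * torch.randn(num_slots, hidden_size))
        # A GRU cell stands in for the paper's (Bi)LSTM transition.
        self.cell = nn.GRUCell(input_size + hidden_size, hidden_size)

    def forward(self, x_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Attention of the previous hidden state over the memory keys.
        scores = h_prev @ self.keys.t()       # (batch, num_slots)
        alpha = F.softmax(scores, dim=-1)
        read = alpha @ self.values            # (batch, hidden_size)
        # The retrieved memory content enters the recurrent transition.
        return self.cell(torch.cat([x_t, read], dim=-1), h_prev)

# Hypothetical usage: run the cell over a token sequence
# (doing so in both directions recovers a BiLSTM-style encoder).
cell = MemoryAugmentedRNNCell(input_size=300, hidden_size=100, num_slots=64)
h = torch.zeros(32, 100)
for x_t in torch.randn(10, 32, 300):          # (time, batch, features)
    h = cell(x_t, h)
```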

DIAGRAM

Incremental memory expansion and fine-tuning flow

This diagram shows how the method expands the memory bank and fine-tunes all parameters when a new domain arrives, following Algorithm 1 in the paper.

DIAGRAM

Evaluation setup for MultiNLI incremental domains

This diagram shows how the method is evaluated across the Fic, Gov, Slate, Tel, and Travel MultiNLI domains, in both two-domain and multi-domain settings.

PROCESS

How Progressive Memory Banks Handle MultiNLI Domains

  1. Augmenting the RNN with a memory bank

     The method attaches a key-value memory bank to the BiLSTM and, at each timestep, computes attention over the memory keys, feeding the retrieved content into the recurrent transition.

  2. Progressively growing the memory

     When a new domain arrives, M fresh slots are appended to the N existing ones; attention is recomputed over all N+M slots and the updated read-out is fed into the RNN.

  3. Fine-tuning with the expanded memory

     The previous RNN weights and the existing memory bank are loaded, then all parameters are fine-tuned jointly on the current domain's data.

  4. IDA across all domains

     Expansion and fine-tuning repeat across Fic, Gov, Slate, Tel, and Travel, yielding a single unified classifier evaluated on all domains; see the sketch after this list.
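
As referenced in step 4, here is a hedged sketch of the expand-then-fine-tune loop, paraphrasing Algorithm 1 and building on the MemoryAugmentedRNNCell sketch above. The loaders and the train() helper are hypothetical stand-ins for real data pipelines and training code.

```python
import torch
import torch.nn as nn

def expand_memory(cell, extra_slots: int) -> None:
    """Progressive step (our sketch of Algorithm 1's expansion): append
    fresh key/value rows; existing parameters keep their trained weights."""
    hidden_size = cell.keys.size(1)
    cell.keys = nn.Parameter(
        torch.cat([cell.keys.data, 0.1 * torch.randn(extra_slots, hidden_size)]))
    cell.values = nn.Parameter(
        torch.cat([cell.values.data, 0.1 * torch.randn(extra_slots, hidden_size)]))

# Hypothetical incremental loop over the five MultiNLI domains.
def incremental_domain_adaptation(cell, loaders, train, extra_slots=32):
    for domain_loader in loaders:          # e.g. Fic, Gov, Slate, Tel, Travel
        expand_memory(cell, extra_slots)   # attention now spans N + M slots
        train(cell, domain_loader)         # fine-tune ALL parameters jointly
```

Note that after each expansion the optimizer must be rebuilt so the new slot parameters are registered; the old slots keep their trained weights and simply compete for attention mass with the new ones.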

KEY CONTRIBUTIONS

Key Contributions

  • Progressive memory bank for IDA

    A directly parameterized key-value memory bank, read through an attention-based memory mechanism, that equips RNNs for incremental domain adaptation.

  • Theoretical comparison of expansion strategies

    Theorem 1 proves that, under reasonable assumptions on the attention, expanding the memory causes a lower expected mean squared change in hidden states than expanding the RNN hidden state; an informal sketch follows this list.

  • Empirical MultiNLI IDA results

    On Fic→Gov, memory expansion combined with vocabulary expansion reaches 67.55% on Fic and 70.82% on Gov, surpassing fine-tuning and elastic weight consolidation baselines.
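
For intuition only, here is an informal sketch of the comparison behind Theorem 1; the notation is ours, and the paper's exact statement and assumptions differ in detail.

```latex
% Informal sketch (our notation, not the paper's verbatim statement).
% $\alpha_i$: attention on the $N$ existing slots; $\beta_j$: on the $M$ new slots.
% After expansion the memory read-out is
\[
  \tilde{c}_t \;=\; \sum_{i=1}^{N} \alpha_i v_i \;+\; \sum_{j=1}^{M} \beta_j \tilde{v}_j,
  \qquad \sum_{i=1}^{N} \alpha_i + \sum_{j=1}^{M} \beta_j = 1,
\]
% so what the RNN sees changes only by the attention mass placed on new slots.
% Theorem 1 (informal): under reasonable assumptions on the attention,
\[
  \mathbb{E}\,\big\lVert \tilde{h}_t - h_t \big\rVert^2_{\text{memory expansion}}
  \;\le\;
  \mathbb{E}\,\big\lVert \tilde{h}_t - h_t \big\rVert^2_{\text{hidden-state expansion}} .
\]
```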

RESULTS

By the Numbers

Accuracy on S (source, Fic): 67.55%, a gain of +1.93 over S→T fine-tuning (F) and +1.53 over S→T EWC.

Accuracy on T (target, Gov): 70.82%, a gain of +0.92 over S→T fine-tuning (F) and +2.57 over S→T progressive neural networks.

These numbers come from the MultiNLI Fic→Gov two-domain adaptation experiment, which tests incremental domain adaptation. The gains show that the method preserves source performance while improving target accuracy, compared to fine-tuning, elastic weight consolidation, and progressive neural networks.


BENCHMARK

Results on two-domain adaptation (Fic→Gov)

Accuracy (%) on the source Fic domain after adapting Fic→Gov.

KEY INSIGHT

The Counterintuitive Finding

Despite fine-tuning all parameters, memory expansion improves Fic accuracy from 65.62% (plain fine-tuning) to 67.55% when adapting to Gov.

This is counterintuitive because adding capacity as memory slots rather than hidden dimensions reduces hidden-state drift, contradicting the assumption that fine-tuning the full network must catastrophically overwrite old knowledge.

WHY IT MATTERS

What this unlocks for the field

The approach enables RNN-based NLP systems to accumulate domain knowledge over time without storing old data or training separate per-domain predictors.

Builders can deploy a single BiLSTM with a progressive memory bank that adapts sequentially across domains like Fic, Gov, Slate, Tel, and Travel while maintaining strong performance on all of them.


