ByteRover: Agent-Native Memory Through LLM-Curated Hierarchical Context

Authors: Andy Nguyen, Danh Doan, Hoang Pham et al.

2026

TL;DR

ByteRover uses an LLM-curated hierarchical Context Tree with a 5-tier retrieval pipeline to reach 96.1% accuracy on LoCoMo, +6.2 points over HonCho.

THE PROBLEM

External memory services cause semantic drift and fragile recovery

Existing Memory-Augmented Generation systems treat memory as an external service, so the system that stores knowledge does not understand it.

This separation leads to semantic drift, lost coordination context, and recovery fragility, where autonomous agents misremember tasks and cannot reliably reconstruct long-horizon state.

HOW IT WORKS

ByteRover: Agent-native Context Tree with Adaptive Lifecycle

ByteRover centers on an Agent Layer, Execution Layer, and Context Tree connected to MiniSearch and a query cache on the local filesystem.

You can think of ByteRover like a programmer who both writes and organizes code, instead of dumping snippets into a blind search index.

This design lets ByteRover maintain structured, evolving memory with explicit relations and lifecycle metadata that a plain context window or vector store cannot provide.
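
To make "structured, evolving memory with explicit relations and lifecycle metadata" concrete, here is a minimal Python sketch of what a Context Tree entry could look like. The field names (`importance`, `maturity`, `relations`) and the example values are assumptions for illustration, not ByteRover's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ContextNode:
    """One entry in a hierarchical context tree (illustrative schema only)."""
    title: str
    content: str
    importance: int = 50          # importance score in [0, 100]
    maturity: str = "draft"       # lifecycle tier, e.g. draft -> core
    relations: list = field(default_factory=list)  # explicit links to other nodes
    children: list = field(default_factory=list)   # hierarchical structure

# A flat vector store keeps none of this: the tree records hierarchy,
# lifecycle state, and relations alongside the raw text.
root = ContextNode("project", "Top-level project memory", maturity="core")
root.children.append(ContextNode("auth", "JWT refresh flow decided in January"))
```

The point of the sketch is that curation (setting `importance`, promoting `maturity`, linking `relations`) is something the same LLM that stores the memory can perform, rather than an external indexing pipeline.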

DIAGRAM

Five-tier retrieval pipeline in ByteRover

This diagram shows how ByteRover routes a query through cache checks, MiniSearch, and escalating LLM reasoning tiers.

DIAGRAM

Evaluation setup for ByteRover on LoCoMo and LongMemEval S

This diagram shows how ByteRover processes benchmark conversations and is scored with LLM-as-a-Judge against baselines.

PROCESS

How ByteRover Handles a Query with 5-Tier Progressive Retrieval

  1. Cache check via query fingerprint

    ByteRover receives a query and computes a fingerprint, checking the exact query cache and fuzzy cache before touching the Context Tree or MiniSearch.

  2. MiniSearch full-text search engine

    If the caches miss, ByteRover uses MiniSearch with BM25, fuzzy matching, and field boosts to rank Context Tree entries and detect out-of-domain queries.

  3. Optimized LLM call

    For medium confidence scores, ByteRover prefetches the top documents and issues a constrained LLM call with a 1,024-token budget and temperature 0.3.

  4. Full agentic loop

    For hard or novel queries, ByteRover runs a multi-turn agentic loop with tool calls over the Context Tree, capped at 50 iterations and 2,048 tokens.
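
The tiered dispatch above can be sketched in Python. The confidence thresholds (`HIGH_CONF`, `MID_CONF`), the fingerprint scheme, and the cache/search interfaces are all assumptions made for illustration; only the escalation order, the token/iteration caps, and the explicit out-of-domain signal come from the description above:

```python
import hashlib

# Hypothetical thresholds and interfaces; ByteRover's actual cut-offs
# and internal APIs are not specified in this summary.
HIGH_CONF, MID_CONF = 0.85, 0.5

def fingerprint(query: str) -> str:
    """Normalize and hash the query to use as an exact-cache key."""
    return hashlib.sha256(query.strip().lower().encode()).hexdigest()

def answer(query, exact_cache, fuzzy_cache, minisearch, llm, agent_loop):
    fp = fingerprint(query)
    if fp in exact_cache:                       # Tier 1: exact cache hit
        return exact_cache[fp]
    hit = fuzzy_cache.lookup(query)             # Tier 2: near-duplicate query
    if hit is not None:
        return hit
    docs, score = minisearch.rank(query)        # Tier 3: BM25 + fuzzy + boosts
    if not docs:
        return "out-of-domain"                  # explicit signal, no guessing
    if score >= HIGH_CONF:
        return docs[0]                          # resolved without any LLM call
    if score >= MID_CONF:                       # Tier 4: constrained LLM call
        return llm(query, docs, max_tokens=1024, temperature=0.3)
    return agent_loop(query, max_iters=50, max_tokens=2048)  # Tier 5
```

Because the common case exits at Tiers 1-3, most queries never pay for an LLM call at all, which is what keeps the typical path fast.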

KEY CONTRIBUTIONS

Key Contributions

  • Agent-native memory architecture

    ByteRover makes curate and search tools first-class in the Agent Layer, eliminating external pipelines and enabling a stateful feedback loop for memory operations.

  • Context Tree with Adaptive Knowledge Lifecycle

    ByteRover introduces the Context Tree with importance scores in [0, 100], maturity tiers from draft to core, and recency decay with τ = 30 days.

  • 5-tier progressive retrieval strategy

    ByteRover designs a 5-tier retrieval pipeline that resolves most queries in under 100 ms without LLM calls and signals out-of-domain queries explicitly.
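
The lifecycle parameters above (importance in [0, 100], decay constant τ = 30 days) suggest a simple scoring rule. A hedged sketch: the exponential form and the product combination are assumptions for illustration, since the summary does not state how ByteRover actually combines importance with recency:

```python
import math

TAU_DAYS = 30.0  # recency decay constant τ from the paper

def effective_score(importance: float, age_days: float) -> float:
    """Weight an entry's importance (0-100) by exponential recency decay.

    The product form is an illustrative assumption, not ByteRover's
    documented formula.
    """
    return importance * math.exp(-age_days / TAU_DAYS)
```

Under this rule, a 30-day-old entry retains e⁻¹ ≈ 36.8% of its weight; maturity tiers (draft through core) could then decide whether an entry is subject to decay at all, so that core knowledge never fades.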

RESULTS

By the Numbers

Overall accuracy LoCoMo

96.1%

+6.2 over HonCho

Multi-Hop LoCoMo

93.3%

+9.3 over HonCho

Overall accuracy LongMemEval-S

92.8%

+0.2 over Chronos† (low backbone)

Cold query latency p50

1.6 s

LongMemEval-S end to end without external infrastructure

On LoCoMo, which stresses long-range conversational memory, ByteRover reaches 96.1% overall accuracy versus 89.9% for HonCho. On LongMemEval-S, which spans 500 questions and 23,867 documents, ByteRover achieves 92.8% overall accuracy while maintaining low cold-query latency.

BENCHMARK

LLM-as-Judge accuracy on LoCoMo

Overall accuracy (%) on LoCoMo across memory systems.

BENCHMARK

LLM-as-Judge accuracy on LongMemEval-S

Overall accuracy (%) on LongMemEval-S across selected systems.

KEY INSIGHT

The Counterintuitive Finding

Routing all queries to the full agentic loop drops ByteRover from 92.8% to 63.4% on LongMemEval-S, a 29.4 point loss.

This is surprising because many developers assume that more agentic reasoning always helps, but ByteRover shows that structured, tiered retrieval is far more reliable.

WHY IT MATTERS

What this unlocks for the field

ByteRover unlocks agent-native memory, where the same LLM curates, structures, and retrieves long-term knowledge without external databases.

Builders can now ship autonomous agents with durable, inspectable markdown memories, strong benchmark results, and no vector or graph infrastructure to maintain.

Related papers

Memory Architecture

Breaking the KV Cache Bottleneck: Fan Duality Model Achieves O(1) Decode Memory with Superior Associative Recall

Yasong Fan

· 2026

Fan Duality Model (FDM) uses the Fan Operator, Local-Global Cache, Freeze-Scan Training, and Holographic Reference Beam Decoding to separate wave-like compression from particle-like associative recall. On WikiText-103, FDM reaches 64.9 perplexity with Freeze-Scan and 62.79 with holographic decoding, while achieving 0.966 MQAR accuracy compared to 0.606 for a Transformer.

Memory Architecture

Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP

Martin Vogel, Falk Meyer-Eschenbach et al.

· 2026

Codebase-Memory parses repositories with a multi-pass pipeline using the Parse stage, Build stage, Serve stage, FunctionRegistry, Louvain communities, and MCP tool interface to build a persistent SQLite knowledge graph. On a 31-language benchmark, Codebase-Memory reaches 0.83 quality versus 0.92 for an Explorer Agent while using ten times fewer tokens and 2.1 times fewer tool calls.
