ByteRover: Agent-Native Memory Through LLM-Curated Hierarchical Context

Authors: Andy Nguyen, Danh Doan, Hoang Pham et al.

2026

TL;DR

ByteRover uses an LLM-curated hierarchical Context Tree with a 5-tier retrieval pipeline to reach 96.1% accuracy on LoCoMo, +6.2 points over HonCho.

THE PROBLEM

External memory services cause semantic drift and fragile recovery

Existing Memory-Augmented Generation systems treat memory as an external service, so the system that stores knowledge does not understand it.

This separation leads to semantic drift, lost coordination context, and recovery fragility, where autonomous agents misremember tasks and cannot reliably reconstruct long-horizon state.

HOW IT WORKS

ByteRover: Agent-native Context Tree with Adaptive Lifecycle

ByteRover centers on an Agent Layer, Execution Layer, and Context Tree connected to MiniSearch and a query cache on the local filesystem.

You can think of ByteRover like a programmer who both writes and organizes code, instead of dumping snippets into a blind search index.

This design lets ByteRover maintain structured, evolving memory with explicit relations and lifecycle metadata that a plain context window or vector store cannot provide.
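
To make "structured, evolving memory with explicit relations and lifecycle metadata" concrete, here is a minimal Python sketch of what a Context Tree entry could look like. The field names (`importance`, `maturity`, `relations`) and the example values are assumptions for illustration, not ByteRover's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ContextNode:
    """One entry in a hierarchical context tree (illustrative schema only)."""
    title: str
    content: str
    importance: int = 50          # importance score in [0, 100]
    maturity: str = "draft"       # lifecycle tier, e.g. draft -> core
    relations: list = field(default_factory=list)  # explicit links to other nodes
    children: list = field(default_factory=list)   # hierarchical structure

# A flat vector store keeps none of this: the tree records hierarchy,
# lifecycle state, and relations alongside the raw text.
root = ContextNode("project", "Top-level project memory", maturity="core")
root.children.append(ContextNode("auth", "JWT refresh flow decided in January"))
```

The point of the sketch is that curation (setting `importance`, promoting `maturity`, linking `relations`) is something the same LLM that stores the memory can perform, rather than an external indexing pipeline.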

DIAGRAM

Five-tier retrieval pipeline in ByteRover

This diagram shows how ByteRover routes a query through cache checks, MiniSearch, and escalating LLM reasoning tiers.

DIAGRAM

Evaluation setup for ByteRover on LoCoMo and LongMemEval S

This diagram shows how ByteRover processes benchmark conversations and is scored with LLM-as-a-Judge against baselines.

PROCESS

How ByteRover Handles a Query with 5-Tier Progressive Retrieval

  1. Cache check via query fingerprint

    ByteRover receives a query and computes a fingerprint, checking the exact query cache and fuzzy cache before touching the Context Tree or MiniSearch.

  2. MiniSearch full-text search engine

    If the caches miss, ByteRover uses MiniSearch with BM25, fuzzy matching, and field boosts to rank Context Tree entries and detect out-of-domain queries.

  3. Optimized LLM call

    For medium confidence scores, ByteRover prefetches the top documents and issues a constrained LLM call with a 1,024-token budget and temperature 0.3.

  4. Full agentic loop

    For hard or novel queries, ByteRover runs a multi-turn agentic loop with tool calls over the Context Tree, capped at 50 iterations and 2,048 tokens.
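
The tiered dispatch above can be sketched in Python. The confidence thresholds (`HIGH_CONF`, `MID_CONF`), the fingerprint scheme, and the cache/search interfaces are all assumptions made for illustration; only the escalation order, the token/iteration caps, and the explicit out-of-domain signal come from the description above:

```python
import hashlib

# Hypothetical thresholds and interfaces; ByteRover's actual cut-offs
# and internal APIs are not specified in this summary.
HIGH_CONF, MID_CONF = 0.85, 0.5

def fingerprint(query: str) -> str:
    """Normalize and hash the query to use as an exact-cache key."""
    return hashlib.sha256(query.strip().lower().encode()).hexdigest()

def answer(query, exact_cache, fuzzy_cache, minisearch, llm, agent_loop):
    fp = fingerprint(query)
    if fp in exact_cache:                       # Tier 1: exact cache hit
        return exact_cache[fp]
    hit = fuzzy_cache.lookup(query)             # Tier 2: near-duplicate query
    if hit is not None:
        return hit
    docs, score = minisearch.rank(query)        # Tier 3: BM25 + fuzzy + boosts
    if not docs:
        return "out-of-domain"                  # explicit signal, no guessing
    if score >= HIGH_CONF:
        return docs[0]                          # resolved without any LLM call
    if score >= MID_CONF:                       # Tier 4: constrained LLM call
        return llm(query, docs, max_tokens=1024, temperature=0.3)
    return agent_loop(query, max_iters=50, max_tokens=2048)  # Tier 5
```

Because the common case exits at Tiers 1-3, most queries never pay for an LLM call at all, which is what keeps the typical path fast.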

KEY CONTRIBUTIONS

Key Contributions

  • Agent-native memory architecture

    ByteRover makes curate and search tools first-class in the Agent Layer, eliminating external pipelines and enabling a stateful feedback loop for memory operations.

  • Context Tree with Adaptive Knowledge Lifecycle

    ByteRover introduces the Context Tree with importance scores in [0, 100], maturity tiers from draft to core, and recency decay with τ = 30 days.

  • 5-tier progressive retrieval strategy

    ByteRover designs a 5-tier retrieval pipeline that resolves most queries in under 100 ms without LLM calls and signals out-of-domain queries explicitly.
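
The lifecycle parameters above (importance in [0, 100], decay constant τ = 30 days) suggest a simple scoring rule. A hedged sketch: the exponential form and the product combination are assumptions for illustration, since the summary does not state how ByteRover actually combines importance with recency:

```python
import math

TAU_DAYS = 30.0  # recency decay constant τ from the paper

def effective_score(importance: float, age_days: float) -> float:
    """Weight an entry's importance (0-100) by exponential recency decay.

    The product form is an illustrative assumption, not ByteRover's
    documented formula.
    """
    return importance * math.exp(-age_days / TAU_DAYS)
```

Under this rule, a 30-day-old entry retains e⁻¹ ≈ 36.8% of its weight; maturity tiers (draft through core) could then decide whether an entry is subject to decay at all, so that core knowledge never fades.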

RESULTS

By the Numbers

Overall accuracy LoCoMo

96.1%

+6.2 over HonCho

Multi-Hop LoCoMo

93.3%

+9.3 over HonCho

Overall accuracy LongMemEval-S

92.8%

+0.2 over Chronos† (low backbone)

Cold query latency p50

1.6 s

LongMemEval-S end to end without external infrastructure

On LoCoMo, which stresses long-range conversational memory, ByteRover reaches 96.1% overall accuracy versus 89.9% for HonCho. On LongMemEval-S, which spans 500 questions and 23,867 documents, ByteRover achieves 92.8% overall accuracy while maintaining low cold-query latency.

BENCHMARK

LLM-as-Judge accuracy on LoCoMo

Overall accuracy (%) on LoCoMo across memory systems.

BENCHMARK

LLM-as-Judge accuracy on LongMemEval-S

Overall accuracy (%) on LongMemEval-S across selected systems.

KEY INSIGHT

The Counterintuitive Finding

Routing all queries to the full agentic loop drops ByteRover from 92.8% to 63.4% on LongMemEval-S, a 29.4 point loss.

This is surprising because many developers assume that more agentic reasoning always helps, but ByteRover shows that structured, tiered retrieval is far more reliable.

WHY IT MATTERS

What this unlocks for the field

ByteRover unlocks agent-native memory, where the same LLM curates, structures, and retrieves long-term knowledge without external databases.

Builders can now ship autonomous agents with durable, inspectable markdown memories, strong benchmark results, and no vector or graph infrastructure to maintain.

Related papers

Memory Architecture

Breaking the KV Cache Bottleneck: Fan Duality Model Achieves O(1) Decode Memory with Superior Associative Recall

Yasong Fan

· 2026

Fan Duality Model (FDM) uses the Fan Operator, Local-Global Cache, Freeze-Scan Training, and Holographic Reference Beam Decoding to separate wave-like compression from particle-like associative recall. On WikiText-103, FDM reaches 64.9 perplexity with Freeze-Scan and 62.79 with holographic decoding, while achieving 0.966 MQAR accuracy compared to 0.606 for a Transformer.

Memory Architecture

Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP

Martin Vogel, Falk Meyer-Eschenbach et al.

· 2026

Codebase-Memory parses repositories with a multi-pass pipeline using the Parse stage, Build stage, Serve stage, FunctionRegistry, Louvain communities, and MCP tool interface to build a persistent SQLite knowledge graph. On a 31-language benchmark, Codebase-Memory reaches 0.83 quality versus 0.92 for an Explorer Agent while using ten times fewer tokens and 2.1 times fewer tool calls.
