How much does trovex save on tokens?

About 60% fewer tokens per doc lookup, measured on our own repo. Across 20 real queries against 145 docs we saved 59,668 of 89,585 tokens: a pooled 66.6%, a median of 71.1% per query, a mean of 67.1%. Per-query savings ranged from 45.5% to 79.1%. We publish 60% because it sits below all three averages, not because it is a floor. Your repo will land somewhere of its own.
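
The three averages answer slightly different questions, which is why they differ. As a minimal sketch (not the benchmark's own code; the function names and the toy per-query numbers are illustrative assumptions), the pooled figure divides total tokens saved by total tokens that would have been read, so large queries weigh more, while the mean and median treat every query equally. Only the totals 59,668 and 89,585 come from the benchmark.

```python
from statistics import mean, median

def pooled_pct(pairs):
    """Total saved divided by total would-read, as a percentage."""
    would_read = sum(w for w, _ in pairs)
    saved = sum(s for _, s in pairs)
    return 100 * saved / would_read

def per_query_pcts(pairs):
    """One percentage per (would_read, saved) pair, each weighted equally."""
    return [100 * s / w for w, s in pairs]

# Published totals: 59,668 tokens saved out of 89,585 that would have been read.
headline_pooled = pooled_pct([(89_585, 59_668)])
print(f"pooled: {headline_pooled:.1f}%")  # pooled: 66.6%

# Hypothetical toy data, only to show how the three averages can diverge.
toy = [(10_000, 5_000), (1_000, 900), (1_000, 800)]
print(f"toy pooled: {pooled_pct(toy):.1f}%")            # toy pooled: 55.8%
print(f"toy mean:   {mean(per_query_pcts(toy)):.1f}%")  # toy mean:   73.3%
print(f"toy median: {median(per_query_pcts(toy)):.1f}%")  # toy median: 80.0%
```

In the toy data one large, low-savings query drags the pooled figure well below the mean and median, which is the same effect that separates 66.6% from 71.1% in the real run.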

Is the benchmark reproducible?

Yes, that is the whole point. The script ships in the trovex repo at benchmarks/token-savings/run.py and runs the same code path as the live /savings dashboard. Run it on this repo, or point it at any folder of markdown with --repo. It uses a local embedder, needs no API key, is deterministic, and leaves no state behind. Do not trust the number; run it.

What embedder does the benchmark use?

BAAI/bge-small-en-v1.5, run locally through fastembed. The first run downloads a roughly 90MB model once, then everything stays on your machine. No cloud, no API key. The savings ratio is governed by token volume, not by which embedder ranks the candidates, so the headline holds across embedders even though the exact ranking shifts.
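
To make "which embedder ranks the candidates" concrete, here is a minimal sketch of ranking docs against a query by cosine similarity. It is not trovex's ranking code: `rank_docs`, `fastembed_embedder`, `keyword_embedder`, and the three toy docs are illustrative assumptions. The fastembed call shown (`TextEmbedding(model_name=...)` and `.embed(...)`) is only constructed if you call `fastembed_embedder()`, which needs the fastembed package and triggers the one-time model download.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Embedder = Callable[[List[str]], List[Sequence[float]]]

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_docs(query: str, docs: Dict[str, str], embed: Embedder) -> List[Tuple[str, float]]:
    """Embed the query and every doc, then sort docs by similarity, best first."""
    names = list(docs)
    vectors = embed([query] + [docs[name] for name in names])
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(name, cosine(query_vec, vec)) for name, vec in zip(names, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def fastembed_embedder(model_name: str = "BAAI/bge-small-en-v1.5") -> Embedder:
    """Local embedder via fastembed (third-party; downloads the model on first use)."""
    from fastembed import TextEmbedding
    model = TextEmbedding(model_name=model_name)
    return lambda texts: [[float(x) for x in vec] for vec in model.embed(texts)]

def keyword_embedder(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in embedder: counts three keywords, no download needed."""
    vocab = ["install", "embedding", "benchmark"]
    return [[float(text.lower().count(word)) for word in vocab] for text in texts]

# Toy docs, invented for illustration.
docs = {
    "install.md": "Install trovex and run the install script.",
    "embedder.md": "The embedding model is configurable.",
    "bench.md": "Run the benchmark.",
}
ranking = rank_docs("configure the embedding model", docs, keyword_embedder)
print(ranking[0][0])  # embedder.md
```

Swapping `keyword_embedder` for `fastembed_embedder()` changes the similarity scores and possibly the order, but not how many tokens each doc contains.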

Does this hold on my repo?

We do not know, and we will not pretend to. The 60% is a measurement on the trovex repo, not a universal guarantee. A repo with little overlapping markdown will save less. A repo thick with half-stale duplicates will save more. The benchmark exists so you can get your own number in about a minute instead of taking ours on faith.

Is this per-lookup or whole-bill savings?

Per-lookup, on doc reads. It measures the markdown-rereading slice of an agent's work, not the code it reads or the reasoning it runs. On a doc-heavy repo with a lot of agent traffic that slice is large. On a tiny repo it is small. The benchmark reports exactly that slice, so you can see whether it is worth fixing on your codebase.
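
A rough way to translate a per-lookup figure into a whole-bill figure is to multiply it by the share of your tokens that goes to doc reads. The sketch below assumes exactly that simple model; the 40% and 5% shares are hypothetical examples, not measurements, and `whole_bill_savings_pct` is an illustrative name rather than anything trovex provides.

```python
def whole_bill_savings_pct(doc_read_share: float, per_lookup_reduction: float) -> float:
    """Whole-bill savings (%) if only the doc-read slice shrinks.

    doc_read_share: fraction of all agent tokens spent reading docs (0..1).
    per_lookup_reduction: fraction of doc-read tokens saved per lookup (0..1).
    """
    if not 0 <= doc_read_share <= 1 or not 0 <= per_lookup_reduction <= 1:
        raise ValueError("both arguments must be fractions between 0 and 1")
    return 100 * doc_read_share * per_lookup_reduction

# Hypothetical doc-heavy repo: 40% of tokens are doc reads, 60% saved per lookup.
doc_heavy = whole_bill_savings_pct(0.40, 0.60)
# Hypothetical tiny repo: 5% of tokens are doc reads, same per-lookup reduction.
tiny = whole_bill_savings_pct(0.05, 0.60)

print(f"doc-heavy repo: {doc_heavy:.1f}% of the whole bill")  # 24.0%
print(f"tiny repo:      {tiny:.1f}% of the whole bill")       # 3.0%
```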

How much do agents save by not rereading docs?

tsukumo

Per-query tokens: triaging candidates vs resolving to one canonical doc

Query	Would read	With trovex	Saved	Reduction
quickstart getting started guide	6,609	1,317	5,228	79.1%
how is the token savings number calculated	3,813	1,074	2,673	70.1%
configure the embedding model	4,199	1,577	2,552	60.8%
how do I install and set up trovex	6,823	3,651	3,102	45.5%
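
The Reduction column can be checked from two of the others: it is Saved divided by Would read. A minimal sketch using only the four rows shown (the variable and function names are illustrative, not part of trovex):

```python
# (query, would_read_tokens, saved_tokens) copied from the table above.
rows = [
    ("quickstart getting started guide", 6609, 5228),
    ("how is the token savings number calculated", 3813, 2673),
    ("configure the embedding model", 4199, 2552),
    ("how do I install and set up trovex", 6823, 3102),
]

def reduction_pct(would_read: int, saved: int) -> float:
    """Share of would-read tokens that were saved, rounded to one decimal."""
    return round(100 * saved / would_read, 1)

reductions = [reduction_pct(would_read, saved) for _, would_read, saved in rows]

for (query, _, _), pct in zip(rows, reductions):
    print(f"{pct:5.1f}%  {query}")
# 79.1%  quickstart getting started guide
# 70.1%  how is the token savings number calculated
# 60.8%  configure the embedding model
# 45.5%  how do I install and set up trovex
```

These four rows include the best and worst queries of the run (79.1% and 45.5%), so they show the spread rather than the average.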

bash

python benchmarks/token-savings/run.py                  # this repo, local embedder
python benchmarks/token-savings/run.py --repo /path      # any repo of .md docs
python benchmarks/token-savings/run.py --json out.json   # machine-readable

One command on this repo, --repo for any folder of markdown, --json for machine-readable output

bash

trovex measure                                  # last 7 days vs the prior 7
trovex measure --baseline-days 14 --current-days 14

Before/after on your agent's own logged reads, not a synthetic benchmark

We benchmarked the token cost of rereading docs on our own repo

How much do AI agents waste rereading docs?

How did we measure it?

Can I reproduce this on my own repo?

Can I measure it on my own agent's logs?

Why does resolving to one canonical doc save tokens?

Does this hold on every repo?

How we think about it

