Do more tokens improve AI agent accuracy?

No. Past the few lines a task actually needs, more tokens usually lower accuracy rather than raise it. A long context buries the relevant lines ("lost in the middle"), so the agent wanders and you pay for more tokens to get a worse answer. A 2026 ETH Zurich study found that added repo context cut task success by about 3% at over 20% higher cost. The win is serving the right slice, not feeding more.

Updated 19 June 2026

More tokens, less signal

Attention is finite, and a model's usable attention is smaller than its context window. Pad the prompt with a whole project overview and the handful of lines that decide this task get buried under the ones that don't. This is the documented "lost in the middle" effect. The agent reads more, wanders more, and is more likely to anchor on the wrong detail.

What the research found

A 2026 ETH Zurich study (Evaluating AGENTS.md, arXiv 2602.11988) found that adding repository context files tended to reduce coding-agent task success by about 3% while raising inference cost by over 20%, compared with no repo context. Only short, human-written constraint files helped. Bigger context windows don't reverse this, because window size is not the same as usable attention.
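
To see how those two numbers compound, here is a rough worked calculation in Python. The baseline figures (a 50% success rate and a cost of 1.00 per attempt) are hypothetical placeholders, not values from the study, and the sketch assumes the "about 3%" drop means three percentage points, which the summary above does not specify.

```python
def cost_per_success(cost_per_attempt, success_rate):
    """Expected spend to get one solved task, assuming independent attempts."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate

# Hypothetical baseline: not taken from the study.
BASE_COST = 1.00
BASE_SUCCESS = 0.50

# Assumption: "about 3%" is read as 3 percentage points; cost rises 20%.
baseline = cost_per_success(BASE_COST, BASE_SUCCESS)
with_context = cost_per_success(BASE_COST * 1.20, BASE_SUCCESS - 0.03)

# Relative increase in what each solved task costs.
increase = with_context / baseline - 1
print(f"cost per solved task rises by {increase:.1%}")
```

Under these assumptions the cost of each solved task rises by roughly 28%, more than the 20% headline, because the higher bill is spread over fewer successes.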

Serve the slice, not the dump

The durable fix is retrieval: serve the currently-correct, scoped slice per task instead of stuffing the repo into every prompt. trovex does exactly this and cuts roughly 60% of the tokens per lookup in our own use: fewer tokens, and a better answer because the agent only reads what the task needs. tsukumo builds that context layer with your team.
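
As an illustration of serving a slice under a token budget, here is a minimal Python sketch. It is not how trovex works; the function names, the word-overlap scoring, and the whitespace token estimate are all assumptions chosen to keep the example short. A real system would likely use a proper tokenizer and a stronger relevance signal.

```python
import re

def words(text):
    """Lowercased word set; a crude stand-in for real tokenization."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))

def approx_tokens(text):
    """Rough token estimate: count whitespace-separated pieces (assumption)."""
    return len(text.split())

def select_slice(task, chunks, token_budget):
    """Return the chunks most relevant to the task that fit the budget."""
    task_words = words(task)
    scored = []
    for index, chunk in enumerate(chunks):
        overlap = len(task_words & words(chunk))
        if overlap > 0:  # chunks sharing no words with the task are never served
            scored.append((overlap, index, chunk))
    # Highest overlap first; ties keep original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))

    picked, used = [], 0
    for _, index, chunk in scored:
        cost = approx_tokens(chunk)
        if used + cost <= token_budget:
            picked.append((index, chunk))
            used += cost
    picked.sort()  # restore document order for readability
    return [chunk for _, chunk in picked]

# Hypothetical repo notes, invented for this example.
chunks = [
    "Project overview: history, mission, and team structure.",
    "Retry policy: the payment client retries failed requests three times.",
    "Style guide: use four spaces for indentation.",
    "Payment client timeout is 30 seconds per request.",
]
task = "Fix the payment client retry timeout"
for line in select_slice(task, chunks, token_budget=20):
    print(line)
```

With this task and budget, only the retry-policy and timeout notes are served; the overview and style guide never reach the prompt.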

Straight answers.

Does a bigger context window make an agent smarter?: Not by itself. A larger window lets you pass more tokens, but a model's usable attention is smaller than the window, so relevant facts get lost in a long prompt. Filling the window often lowers accuracy and always raises cost.
Why would more context make an agent worse?: Two reasons. The lines that decide the task get buried under the ones that don't ("lost in the middle"), so the agent anchors on the wrong detail; and every extra token is paid for, so you get a worse answer at a higher bill.
What should I do instead of adding more context?: Retrieve the right slice per task instead of stuffing the repo. Serving the current canonical doc on demand (what trovex does, roughly 60% fewer tokens per lookup in our use) gives the agent exactly what the task needs.