loading
Loading.loading
Loading.The model API bill is the cost you see; the one that surprises you is context. An agent re-reads and re-derives your codebase every session, so most of the spend is tokens paid to re-learn what it already knew. Control it by giving agents a durable source of truth instead of re-reading the repo, and by making cost observable per task — so you tune a number, not a guess.
Not the clever output — the context. Every agent, every session, re-reading the same files to rebuild understanding it had yesterday. At fleet scale that repetition is the bulk of the bill, and it's invisible until you measure per-task cost.
Serve agents the current, canonical answer about your codebase on demand (this is what trovex does — roughly 60% fewer tokens per lookup in our own use) instead of having them re-derive it. Cheaper models help at the margin; cutting re-derivation moves the number.
Observability per task (what each run cost and produced) turns spend into something you tune. Without it, you only learn the bill at the end of the month and can't attribute it.
or have us build it — same capability, the other door