Agency3 July 20264 min read
Your failing agents waste most of their tokens after the warning signs
When an agent run is going to fail, it usually shows signs early, then keeps spending. A 2026 study found that on warned failed runs, 58.1% of tokens were spent after the first warning. That's compute you paid for on a run you could have already known was doomed.
Short version: When an agent run is headed for failure, it rarely fails out of nowhere. It shows signs, loops, burns budget, stops making progress, and then keeps spending tokens until it gives up or returns something useless. A 2026 study put a number on the waste: on failed runs that had triggered a warning, 58.1% of the tokens were spent after that first warning, on average. More than half the cost of a doomed run is incurred after the point you could have known it was doomed. The fix is not a better final-answer eval. It is watching the run as it happens and acting on the warning.
You have paid this tax. An agent goes in circles, re-runs the same failing tool, drifts off the task, and your token meter keeps climbing the whole time. At the end you get a bad answer and a bill for the full run, including the long tail after it was clearly lost.

