Agency · 17 June 2026 · 5 min read
GPT isn't enough: we wrap deterministic state machines around the LLM
The reliable parts of a production AI agent aren't in the model. They're in the deterministic code wrapped around it. Stop trying to prompt your way to correctness and start constraining what the model is allowed to do.

Short version: the parts of a production AI agent you can trust are not in the model. They are in the deterministic code wrapped around it. Most teams try to reach reliability by prompting harder or upgrading the model. That moves the probability of a correct answer; it never makes a wrong answer impossible. For work where being wrong costs real money, you want the opposite approach: let the model do the fuzzy part, and put a state machine in charge of everything that has to be correct. We build accounting AI for a Swiss fiduciary, where "confidently wrong" is the worst outcome there is, and this is the pattern that holds up.