Why can't you just prompt an LLM to be reliable?

Because prompting changes the probability of a correct answer, never the guarantee. For a task where being wrong costs money (posting to the wrong account, picking the wrong client), a 98% correct model is still wrong one time in fifty, silently. Reliability has to come from deterministic code that makes the wrong action impossible, not less likely.

What is a deterministic state machine in an AI agent?

It's ordinary code that defines the legal states and transitions for a task, and refuses anything outside them. The LLM proposes an action; the state machine decides whether that action is allowed from the current state and executes it. The model never writes to your systems directly. It makes a suggestion the deterministic layer is free to reject.

Doesn't constraining the model defeat the point of using AI?

No. You constrain what it can do, not what it can understand. The model still does the thing only a model can do well: turn messy human input into a structured intent. You just stop trusting it with the parts that have a single correct answer. The result feels flexible to the user and behaves like software.

When should a human stay in the loop?

On anything irreversible or hard to undo: a payment, a filing, a destructive change. The deterministic layer is what makes human-in-the-loop practical, because it can pause exactly at the irreversible transitions and nowhere else, instead of asking a human to babysit every step.

Deterministic state machines over LLMs: the reliable-agent pattern

tsukumo

Short version: the parts of a production AI agent you can trust are not in the model. They are in the deterministic code wrapped around it. Most teams try to reach reliability by prompting harder or upgrading the model. That moves the probability of a correct answer; it never makes a wrong answer impossible. For work where being wrong costs real money, you want the opposite approach: let the model do the fuzzy part, and put a state machine in charge of everything that has to be correct. We build accounting AI for a Swiss fiduciary, where "confidently wrong" is the worst outcome there is, and this is the pattern that holds up.

Why doesn't a better LLM make your agent reliable?

Because a model gives you a probability, not a guarantee, and the two are not close enough for money work.

A model that is 98% correct on "which client does this email refer to" sounds excellent. In a fiduciary processing thousands of documents, it means roughly one in fifty gets attached to the wrong client, silently, with no exception thrown. You cannot prompt that away. A better prompt takes you from 98% to 99%, which is the same problem at one in a hundred instead of one in fifty.
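To put rough numbers on that, here is a small back-of-the-envelope sketch. The volume of 5,000 documents is a hypothetical figure chosen for illustration, and the calculation assumes each document is classified independently with the same accuracy, which real workloads will only approximate.

```python
def expected_silent_errors(documents: int, accuracy: float) -> float:
    """Expected number of wrong assignments, assuming independent documents
    that are each correct with probability `accuracy`."""
    return documents * (1.0 - accuracy)


# Hypothetical volume: 5,000 documents at three accuracy levels.
for accuracy in (0.98, 0.99, 0.999):
    errors = expected_silent_errors(5000, accuracy)
    print(f"{accuracy:.1%} accurate -> about {errors:.0f} documents misfiled")
```

Each extra "nine" shrinks the count, but none of them reaches zero.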

The mental shift is to stop asking "how do I make the model more accurate" and start asking "how do I make the wrong action impossible to execute." Those are different engineering problems, and only the second one ends.

The pattern: deterministic shell, probabilistic core

The model sits in the middle and does one job: turn messy human input into a structured, checkable intent. Around it, ordinary deterministic code owns everything with a single correct answer.

The model reads an email and proposes: "book this invoice to client X, account Y."
The deterministic layer checks: is X a real, currently-active client in this context? Is Y a legal account for that operation? Is this transition allowed from the current state?
If any check fails, the action is rejected before it touches a system. The model gets to suggest. It never gets to commit.

This inverts the usual trust model. Instead of trusting the model and adding guardrails as an afterthought, you distrust the model by default and let it earn each action through code that cannot be sweet-talked.
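As a minimal sketch of the propose-then-check flow, assuming a single `book_invoice` action and in-memory reference data: every name here (`State`, `Proposal`, `commit`, `ACTIVE_CLIENTS`, `LEGAL_ACCOUNTS`) is hypothetical and chosen for illustration, not taken from our production code.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    CLASSIFIED = "classified"  # document understood, nothing written yet
    BOOKED = "booked"          # invoice posted; no further booking allowed


# Legal transitions: (current state, action) -> next state. Anything else is refused.
TRANSITIONS = {(State.CLASSIFIED, "book_invoice"): State.BOOKED}

# Hypothetical reference data; a real system would query its own records.
ACTIVE_CLIENTS = {"client-x"}
LEGAL_ACCOUNTS = {"book_invoice": {"account-y"}}


@dataclass(frozen=True)
class Proposal:
    """What the model may produce: a suggestion, nothing more."""
    action: str
    client: str
    account: str


class Rejected(Exception):
    """Raised when a proposal fails a deterministic check."""


def commit(state: State, proposal: Proposal) -> State:
    """Run every check, then return the next state. The model never calls the write itself."""
    next_state = TRANSITIONS.get((state, proposal.action))
    if next_state is None:
        raise Rejected(f"{proposal.action!r} is not allowed from state {state.value!r}")
    if proposal.client not in ACTIVE_CLIENTS:
        raise Rejected(f"{proposal.client!r} is not an active client")
    if proposal.account not in LEGAL_ACCOUNTS.get(proposal.action, set()):
        raise Rejected(f"{proposal.account!r} is not a legal account for {proposal.action!r}")
    # The write to the real system would happen here, and only here.
    return next_state


proposal = Proposal(action="book_invoice", client="client-x", account="account-y")
print(commit(State.CLASSIFIED, proposal))  # passes all three checks
```

Calling `commit` a second time from `State.BOOKED` raises `Rejected`, regardless of how confident the model's proposal sounds.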

What the state machine owns

The discipline that makes this work is being strict about the boundary. Some things are the model's. Most things are not.

Who owns what

The model owns (fuzzy)	The state machine owns (must be correct)
Reading intent from messy text	Which client or entity the work applies to
Drafting a human-readable summary	Which IDs chain to which records
Proposing the next action	Whether a transition is legal from this state
Classifying a document type	Executing the write to your systems
Suggesting a category	Pausing at anything irreversible

If you find the model in the right-hand column, that is your bug. The most common failure we see in other teams' agents is letting the model hold a record ID across several turns and "remember" it. Models don't remember; they re-derive, and re-derivation is where the wrong ID sneaks in. The state machine holds the ID. The model is never asked to.

CapabilityIntercept: making the model's output safe to execute

Between the model's suggestion and the actual execution, we run an interception layer. Its whole job is to resolve the fuzzy references in the model's output into exact, verified identifiers before anything runs.

Three things it handles that break naive agents:

Sticky client. Once a conversation is about a specific client, that binding is held in deterministic state, not in the prompt. The model cannot accidentally drift to another client mid-task because it never held the client in the first place.
Coreference. When the user says "that invoice" or "the same account as last time," the deterministic layer resolves the reference against actual records. The model flags that a reference exists; the code decides what it points to.
ID chaining. Operations that depend on each other (this payment belongs to that invoice belongs to this client) are chained by the interceptor against real foreign keys, not by the model stringing IDs together in text.

The model produces intent. CapabilityIntercept turns intent into a verified, executable command, or refuses it. That refusal is a feature. It is the moment a silent error would have happened and didn't.
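The three behaviours above can be illustrated with a simplified sketch. This is not the actual CapabilityIntercept code; the names `Intent`, `Command`, `Session`, `intercept` and the in-memory `INVOICES` list are assumptions made for the example. Note that the model's `Intent` carries no record IDs at all, only what the user wrote and a flag that a back-reference exists.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical records. Two clients share an invoice number on purpose.
INVOICES = [
    {"id": "inv-1", "number": "2024-001", "client_id": "client-x"},
    {"id": "inv-2", "number": "2024-001", "client_id": "client-y"},
]


@dataclass(frozen=True)
class Intent:
    """Model output: fuzzy references only, never record IDs."""
    action: str
    invoice_number: Optional[str] = None  # as written in the message, if any
    refers_to_previous: bool = False      # the model flags "that invoice"


@dataclass(frozen=True)
class Command:
    """Verified, executable command with exact identifiers."""
    action: str
    client_id: str
    invoice_id: str


@dataclass
class Session:
    client_id: str                         # sticky client, held in code
    last_invoice_id: Optional[str] = None  # target for coreference


class Refused(Exception):
    """Raised when a reference cannot be resolved to exactly one record."""


def intercept(session: Session, intent: Intent) -> Command:
    # Coreference: the code, not the model, decides what "that invoice" points to.
    if intent.refers_to_previous:
        candidates = [r for r in INVOICES if r["id"] == session.last_invoice_id]
    else:
        candidates = [r for r in INVOICES if r["number"] == intent.invoice_number]
    # ID chaining: keep only records whose foreign key matches the sticky client.
    matches = [r for r in candidates if r["client_id"] == session.client_id]
    if len(matches) != 1:
        raise Refused("reference does not resolve to exactly one invoice for this client")
    session.last_invoice_id = matches[0]["id"]
    return Command(intent.action, session.client_id, matches[0]["id"])


session = Session(client_id="client-x")
first = intercept(session, Intent("record_payment", invoice_number="2024-001"))
second = intercept(session, Intent("send_reminder", refers_to_previous=True))
print(first.invoice_id, second.invoice_id)
```

Both commands resolve to client-x's invoice even though client-y has an invoice with the same number, because the client binding lives in `Session` rather than in anything the model said.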

Where humans stay in the loop

The deterministic layer is also what makes human-in-the-loop bearable. Without it, "keep a human in the loop" means a person babysitting every step, which no one sustains. With it, the state machine knows exactly which transitions are irreversible and pauses only there.

A payment, a tax filing, a destructive change: those stop and wait for a human. Everything reversible runs on its own. You get oversight where it matters and speed everywhere else, because the code knows the difference. A model asked to decide "is this risky enough to ask a human" would, again, only give you a probability.
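A minimal sketch of that pause, assuming the set of irreversible actions is a fixed list maintained in code. The action names and the `dispatch` function are hypothetical, and a real system would persist the pending action rather than just return a status.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    EXECUTED = "executed"
    AWAITING_APPROVAL = "awaiting_approval"


# Hypothetical list: which actions count as irreversible is decided in code,
# not by asking the model how risky something feels.
IRREVERSIBLE = frozenset({"send_payment", "submit_tax_filing", "delete_record"})


def dispatch(action: str, approved_by: Optional[str] = None) -> Outcome:
    """Run reversible actions immediately; hold irreversible ones until a human signs off."""
    if action in IRREVERSIBLE and approved_by is None:
        return Outcome.AWAITING_APPROVAL
    # The actual execution would happen here.
    return Outcome.EXECUTED


print(dispatch("draft_summary"))                        # reversible: runs on its own
print(dispatch("send_payment"))                         # irreversible: pauses
print(dispatch("send_payment", approved_by="reviewer")) # irreversible, approved: runs
```

The membership test is the whole decision, so it gives the same answer every time for the same action.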

Prompt harder, or constrain harder?

Two roads out of an unreliable agent. They lead to different places.

Two roads out

Prompt harder	Constrain harder
Reliability is a probability you keep nudging	Reliability is a property of the code
Each new edge case is a new prompt tweak	Each new rule is a new transition, tested once
Failures are silent and statistical	Failures are explicit refusals you can see
Hard to audit (why did it do that?)	Auditable (every action passed a known check)
Gets worse as the task space grows	Scales with ordinary software discipline

The second road is less exciting and far more durable. It is also, not incidentally, the road that lets a regulated business put an AI agent anywhere near its books.
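One way to read "each new rule is a new transition, tested once": when the legal transitions live in a small table, the refused cases can be enumerated and checked exhaustively. The states, actions, and function names below are hypothetical, picked only to illustrate the idea.

```python
from itertools import product

STATES = ("draft", "approved", "paid", "voided")
ACTIONS = ("approve", "pay", "void")

# The entire rule set. Adding a rule means adding one line here and one test.
ALLOWED = {
    ("draft", "approve"): "approved",
    ("approved", "pay"): "paid",
    ("draft", "void"): "voided",
}


def next_state(state: str, action: str):
    """Return the next state, or None if the transition is refused."""
    return ALLOWED.get((state, action))


def refused_pairs():
    """Every (state, action) combination the table does not allow."""
    return [(s, a) for s, a in product(STATES, ACTIONS) if next_state(s, a) is None]


# The refused set is finite and listable, so it can be reviewed and audited.
for state, action in refused_pairs():
    print(f"refused: {action} from {state}")
```

A prompt offers no equivalent enumeration: there is no list of the inputs on which it will go wrong.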

What this means for your stack

If you are building an agent for anything that touches money, records, or compliance, the model is the easy part. The work is the deterministic shell: the state machine, the interception layer, the explicit human-in-the-loop boundaries. That is where reliability is won, and it is ordinary, testable engineering rather than prompt alchemy.

We learned this building it for a fiduciary, where the cost of "confidently wrong" is measured in someone else's accounts. If your team is trying to get an AI agent from impressive-in-a-demo to trusted-in-production, the shell around the model is the work, and it's the work we do. Talk to us about your agent.
