loading
Loading.loading
Loading.Monitoring an AI agent isn't uptime monitoring. Agents are non-deterministic and multi-step, so you trace each run: every step, tool call, input/output, token cost, and failure mode — and you alert on cost spikes, error-rate spikes, and silent wrong-answers, not just crashes. Treat an agent run like a distributed trace. yoru is an open-source, self-host observability layer for agent fleets (public beta).
Updated
Traditional APM watches a deterministic service: latency, errors, throughput. An agent run is a sequence of decisions — it picks tools, retries, loops, and can return a confidently wrong answer with a 200 and no error. So the unit of observability is the run/trace, not the request.
Capture the full trace — every step, prompt, tool call, and intermediate output, so you can replay why the agent did what it did. Token cost, per run and per step, because cost spikes are the earliest signal something's looping or re-reading. Failure modes — wrong tool, infinite loop, truncated output, hallucinated result, stuck retries — which rarely throw, so you detect them by watching outcomes. Latency per step, to find the slow tool or the runaway chain. And the outcome itself: did the run actually accomplish the task? Pair the trace with whatever gate verifies the output.
Capture inputs, tool calls, and outputs for every run; keep a human at anything irreversible; alert on cost and failure-rate, not just exceptions. The goal is to answer “what is my fleet doing, what's it costing, and where is it quietly failing?”
yoru is an open-source, self-host observability project for agent fleets — you run it yourself (public beta, in active development). It's the observability pillar of a self-host suite; there's no hosted version. Use the concepts above with whatever tools you have; yoru is one OSS option you can run on your own infra.
or have us build it — same capability, the other door