OpenClaw’s Cost Reporter Learned the Difference Between Free and Unknown

The most dangerous number in an agent platform is not a large bill. It is a confident zero.

That is the bug OpenClaw PR #85882 is trying to fix. The patch changes openclaw gateway usage-cost so token-burning turns from models with unknown or all-zero pricing metadata are no longer reported as complete $0 spend. Instead, they are surfaced as missingCostEntries. That sounds like bookkeeping until you see the production sample behind it: an idle host using openai/gpt-5.5 burned 1,780,235 tokens in a day while the cost report said totalCost: 0 and missingCostEntries: 0.

That is not a minor UI bug. It is a governance failure wearing a friendly price tag.

Unknown pricing should fail suspicious, not optimistic

The PR is intentionally small: two files, 85 additions, and no catalog heroics. src/infra/session-cost-usage.ts now checks whether the resolved model pricing is actually known. If a turn has nonzero token usage but the pricing path has no positive per-token rate and no tiered pricing, the aggregator stops trusting the transport’s fabricated zero and counts the turn as missing-cost instead. The test coverage lands beside it in session-cost-usage.test.ts.
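To make that rule concrete, here is a minimal sketch of the check as the PR describes it, not the code in src/infra/session-cost-usage.ts. The type shapes and the names hasKnownPricing and isMissingCost are assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; OpenClaw's real types may differ.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
  cacheReadPerMillion: number;
  // Optional tiered pricing; any tier counts as "pricing is known".
  tiers?: { upToTokens: number; inputPerMillion: number; outputPerMillion: number }[];
}

interface TurnUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

// Pricing counts as known only if there is at least one positive
// per-token rate, or tiered pricing is present.
function hasKnownPricing(pricing: ModelPricing | undefined): boolean {
  if (!pricing) return false;
  if (pricing.tiers && pricing.tiers.length > 0) return true;
  const rates = [
    pricing.inputPerMillion,
    pricing.outputPerMillion,
    pricing.cacheReadPerMillion,
  ];
  return rates.some((rate) => Number.isFinite(rate) && rate > 0);
}

// A turn that burned tokens under unknown pricing is missing-cost,
// regardless of any zero the transport attached to it.
function isMissingCost(usage: TurnUsage, pricing: ModelPricing | undefined): boolean {
  const tokens = usage.inputTokens + usage.outputTokens + usage.cacheReadTokens;
  return tokens > 0 && !hasKnownPricing(pricing);
}

const allZero: ModelPricing = { inputPerMillion: 0, outputPerMillion: 0, cacheReadPerMillion: 0 };
const busyTurn: TurnUsage = { inputTokens: 1200, outputTokens: 300, cacheReadTokens: 50000 };
const flagged = isMissingCost(busyTurn, allZero); // true
```

The point of the sketch is the asymmetry: the check never tries to compute a price, it only refuses to accept a zero it cannot justify.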

That restraint is the right design choice. The runtime should not guess GPT-5.5 pricing, reverse-engineer provider bills, or pretend OpenRouter metadata is always fresh. It only has to preserve one invariant: $0 means free, not “we lost the price somewhere upstream.” PR #85882 gets that distinction back into the system.

The linked issue #85858 shows why this matters. One observed sample had 821,452 total tokens (including 813,056 cache-read tokens) with totalCost: 0 and missingCostEntries: 0. Another operator reported roughly US$11/day of idle spend hidden behind the same zero-cost path. The bug was not simply “Codex pricing is absent.” A community review traced the chain more precisely: toPricePerMillion() turns missing, null, or non-finite upstream pricing into 0; estimateUsageCost() then treats that as a defined zero; applyCostTotal() only treats undefined as missing. By the time the dashboard sees it, uncertainty has been normalized into certainty.
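The chain is easier to see in miniature. The sketch below reuses the three function names from the review, but the bodies and signatures are illustrative stand-ins written from that description, not copied from the OpenClaw repository. The Totals shape is also an assumption.

```typescript
// Step 1 (stand-in): missing, null, or non-finite prices collapse to 0.
function toPricePerMillion(raw: unknown): number {
  return typeof raw === "number" && Number.isFinite(raw) ? raw : 0;
}

// Step 2 (stand-in): a price of 0 is a defined number, so the estimate
// is a defined 0 rather than undefined.
function estimateUsageCost(tokens: number, pricePerMillion: number | undefined): number | undefined {
  if (pricePerMillion === undefined) return undefined;
  return (tokens / 1_000_000) * pricePerMillion;
}

// Hypothetical aggregate shape using the field names from the report.
interface Totals {
  totalCost: number;
  missingCostEntries: number;
}

// Step 3 (stand-in): only undefined is counted as missing.
function applyCostTotal(totals: Totals, cost: number | undefined): void {
  if (cost === undefined) {
    totals.missingCostEntries += 1;
    return;
  }
  totals.totalCost += cost;
}

// The upstream catalog has no price at all for this model.
const totals: Totals = { totalCost: 0, missingCostEntries: 0 };
applyCostTotal(totals, estimateUsageCost(821_452, toPricePerMillion(undefined)));
// totals is now { totalCost: 0, missingCostEntries: 0 }
```

No single step is unreasonable on its own. The failure is that step 1 erases the only signal step 3 knows how to read.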

That is exactly the sort of bug agent runtimes are going to keep producing as they become operating surfaces rather than chat boxes. The model catalog changes faster than governance code. Providers ship new SKUs. Compatibility layers route requests through aliases. Local and remote models mix in the same session. If the cost pipeline has one bucket for “zero,” it will eventually confuse free, unknown, promotional, missing, misconfigured, and unpriced. Those are different states. Budget controls depend on the difference.

Cost observability is safety infrastructure

There is a temptation to treat cost reporting as administrative tooling: useful for finance, maybe handy for power users, but not core runtime safety. That framing is obsolete for agents.

A coding agent can loop. A heartbeat can regress. A subagent can keep replaying context. A provider fallback can silently move traffic from a cheap local model to an expensive hosted one. A compaction path can turn every message into a hidden token tax. In that world, spend is not just a bill; it is one of the earliest signals that the system is doing something unexpected. If token volume rises while user-visible output does not, something is wrong. If cache-read tokens spike on an idle machine, something is wrong. If cost is zero but tokens are nonzero, the dashboard should look suspicious, not serene.

That is also why this patch belongs in the “best AI coding agent” conversation, even though nobody will put it on a landing page. Model quality, context window, IDE integration, and tool breadth are the obvious comparison points. Mature operators will ask harder questions: can I audit usage by model? Can I distinguish unknown spend from free usage? Do budget guards work when provider metadata is stale? Does the system fail closed, fail noisy, or fail pretty?

OpenClaw’s previous behavior failed pretty. It gave operators a complete-looking report with the most reassuring number possible. PR #85882 makes the report uglier and more useful. That is progress.

There is one caveat in the PR that deserves attention: intentionally free all-zero models may also be counted as missing until the catalog gets an explicit “free” marker distinct from “unknown.” That is not a regression so much as an overdue schema lesson. A free model is not the same thing as a model with absent pricing data. The runtime needs a first-class state for both, and ideally a third for “provider says zero under current terms.” Pricing is policy, not just arithmetic.
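One way to picture that schema is a tagged union where "free" and "unknown" cannot be confused because they are different variants. This is a sketch of the idea only: PricingState, CostResult, and costFor are hypothetical names, and nothing here reflects an actual or planned OpenClaw catalog format.

```typescript
// Hypothetical catalog states; "free" and "unknown" are distinct variants.
type PricingState =
  | { kind: "priced"; inputPerMillion: number; outputPerMillion: number }
  | { kind: "free" }
  | { kind: "unknown" };

// A cost is either a known number (possibly 0) or explicitly missing.
type CostResult = { kind: "known"; cost: number } | { kind: "missing" };

function costFor(state: PricingState, inputTokens: number, outputTokens: number): CostResult {
  switch (state.kind) {
    case "priced":
      return {
        kind: "known",
        cost:
          (inputTokens * state.inputPerMillion + outputTokens * state.outputPerMillion) /
          1_000_000,
      };
    case "free":
      // An explicit marker makes $0 a statement, not a default.
      return { kind: "known", cost: 0 };
    case "unknown":
      return { kind: "missing" };
  }
}

const freeResult = costFor({ kind: "free" }, 500_000, 20_000);
const unknownResult = costFor({ kind: "unknown" }, 500_000, 20_000);
// freeResult is { kind: "known", cost: 0 }; unknownResult is { kind: "missing" }
```

A third variant for "provider says zero under current terms" would slot in the same way, likely carrying an expiry or a source field so the zero can be re-checked later.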

Practitioners should not wait for this specific patch to merge before checking their own installs. If you run Codex, GPT-5.5, OpenRouter-routed models, beta providers, or any catalog entry that recently changed, inspect token totals alongside costs. Nonzero tokens with zero cost should trigger a manual audit. Look at idle periods, heartbeat windows, compaction runs, and background subagents. If you have budget automation that keys only off totalCost, add a guard for token volume and missing-cost entries. The bill is a lagging indicator; token movement is the runtime telling you what it is actually doing.
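A guard along those lines can be very small. The sketch below assumes a report object with totalCost and missingCostEntries (the field names from the issue) plus a totalTokens field. That field, the GuardLimits shape, the alert strings, and the thresholds are all hypothetical and would need adapting to whatever your install actually emits.

```typescript
// Hypothetical report shape; totalTokens is an assumed field name.
interface UsageReport {
  totalCost: number;
  totalTokens: number;
  missingCostEntries: number;
}

// Hypothetical operator-chosen limits for one reporting window.
interface GuardLimits {
  maxCost: number;
  maxTokens: number;
}

// Returns every reason the window looks suspicious, not just the first.
function budgetAlerts(report: UsageReport, limits: GuardLimits): string[] {
  const alerts: string[] = [];
  if (report.totalCost > limits.maxCost) alerts.push("cost-over-budget");
  if (report.totalTokens > limits.maxTokens) alerts.push("tokens-over-budget");
  if (report.missingCostEntries > 0) alerts.push("missing-cost-entries");
  // Nonzero tokens with zero cost is the confident-zero case.
  if (report.totalTokens > 0 && report.totalCost === 0) alerts.push("zero-cost-with-tokens");
  return alerts;
}

// The idle-host sample from the issue, against illustrative limits.
const alerts = budgetAlerts(
  { totalCost: 0, totalTokens: 1_780_235, missingCostEntries: 0 },
  { maxCost: 5, maxTokens: 1_000_000 },
);
// alerts is ["tokens-over-budget", "zero-cost-with-tokens"]
```

A guard keyed only on totalCost would have returned nothing for that sample, which is the whole problem.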

The broader editorial take is simple: agent platforms need accounting systems that tell the truth under uncertainty. A confident zero is worse than no number at all because it disables the human instinct to investigate. Cost observability should be designed like security observability: when evidence is incomplete, surface the gap loudly. OpenClaw is learning that lesson in public. Good. Everyone else building agent runtimes should copy the homework before their own “free” dashboard becomes an invoice.

Sources: OpenClaw PR #85882, OpenClaw issue #85858, OpenClaw issue #84675, OpenClaw issue #54157