OpenClaw Just Fixed a Hidden Token Tax on Every Multi-Turn Session

Some of the most important OpenClaw fixes are the ones that barely look like product news. PR #67447, merged early on April 16, changes the platform's default context-injection mode from always to continuation-skip. In plain English, follow-up turns stop re-injecting the same bootstrap files into every turn by default. That sounds like minor prompt plumbing. It is actually a meaningful correction to how the runtime thinks about session continuity, token economics, and the price of pretending every turn starts from zero.

The linked issue estimated that bootstrap reinjection was consuming roughly 20 to 30 percent of context at session start, then compounding across longer chats. If you keep injecting stable files like SOUL.md, AGENTS.md, or MEMORY.md into every continuation, you are not just wasting tokens. You are imposing a tax on every multi-turn workflow, every background thread, and every user who thought “persistent session” meant the runtime would act like it remembered what was already there.
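
The difference between the two modes can be made concrete with a minimal sketch. The file names and mode names come from the article; the function, its parameters, and the placeholder payloads are invented for illustration and are not OpenClaw's actual API.

```python
# Hypothetical sketch of the two context-injection modes. BOOTSTRAP_FILES and
# the mode names mirror the article; everything else is invented.
BOOTSTRAP_FILES = ["SOUL.md", "AGENTS.md", "MEMORY.md"]

def build_turn_context(mode: str, is_continuation: bool, user_turn: str) -> list[str]:
    """Assemble the prompt segments for a single turn."""
    parts: list[str] = []
    # "always": re-inject bootstrap files on every turn, continuation or not.
    # "continuation-skip": inject them only on the first turn of a session.
    if mode == "always" or not is_continuation:
        parts.extend(f"<contents of {name}>" for name in BOOTSTRAP_FILES)
    parts.append(user_turn)
    return parts

# Under "always", a follow-up turn still carries all three bootstrap payloads;
# under "continuation-skip", it carries only the live user turn.
assert len(build_turn_context("always", True, "turn 2")) == 4
assert len(build_turn_context("continuation-skip", True, "turn 2")) == 1
```

The first turn of a session is identical under both modes; the divergence, and the tax, shows up only on continuations.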

That waste matters more than it first appears because token burn is rarely just a cost story. It becomes a product-shape story. Repeated bootstrap payloads crowd out live user context, reduce room for tool output, and make long conversations more likely to compact or degrade earlier than necessary. The platform winds up spending expensive context on reasserting its own identity instead of carrying forward the work the user actually cares about.

The hidden tax was a semantics problem, not just a billing problem

It is tempting to read this fix as a straightforward optimization. Save tokens, move on. But the deeper issue was architectural. OpenClaw already had a continuation-skip path and tests for it. Yet the platform still defaulted to the more paranoid always behavior. That suggests the runtime had not fully committed to its own notion of continuity. It was acting like the safest way to preserve behavior was to repeat the bootstrap on every turn, because session identity and guardrails were not trusted enough to stand on their own.

That is a classic orchestration smell. Mature systems do not re-announce the same invariant state every time they process a new event. They persist it, reference it, and rehydrate it only when needed. If your assistant needs to be reminded of its entire bootstrap on every continuation, the problem is bigger than prompt size. It means your session semantics are carrying uncertainty the expensive way.
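
The "persist it, reference it, rehydrate it only when needed" pattern can be sketched in a few lines. The marker mechanism below is invented for illustration (it is not OpenClaw's implementation); it shows the shape of the idea: inject bootstrap only when its trace is genuinely absent from live session state, such as after a restart or an aggressive compaction.

```python
# Hypothetical sketch of "persist, reference, rehydrate only when needed".
# The marker mechanism is invented; it is not OpenClaw's actual mechanism.
class Session:
    def __init__(self) -> None:
        self.history: list[str] = []

    def ensure_bootstrap(self) -> bool:
        """Inject bootstrap only if its marker is absent from live history
        (first turn, or the marker was lost to compaction/restart).
        Returns True when an injection actually happened."""
        if "<bootstrap-marker>" in self.history:
            return False  # invariant state already present; just reference it
        self.history.append("<bootstrap-marker>")
        self.history.append("<bootstrap payload>")
        return True

s = Session()
assert s.ensure_bootstrap() is True    # first turn: rehydrate
assert s.ensure_bootstrap() is False   # continuation: skip, state is trusted
s.history.clear()                      # simulate compaction dropping everything
assert s.ensure_bootstrap() is True    # marker gone: rehydrate again
```

The key property is that reinjection becomes conditional on observed state rather than ritual: the system pays the bootstrap cost exactly when its invariants are actually missing.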

To be fair, there is a real tradeoff here. The review on this PR raised exactly the right concern: if you skip reinjection too aggressively, edge cases around restarts, compaction, or bootstrap markers can leave the system without the guardrails operators expect. That is why this change is interesting. It is not reckless cost-cutting. It is OpenClaw deciding that the default should finally match the platform's intended session model, while still leaving room for operators to opt back into always when their environment needs belt-and-suspenders behavior.

Why builders should care

Every agent platform hits this wall eventually. Early on, stuffing the full bootstrap into every turn feels safe. It helps normalize behavior and masks weak persistence semantics. But once the system grows into multi-turn, multi-agent, and background-heavy workflows, repetition becomes its own failure mode. Costs climb. Latency drifts upward. Long-context quality gets worse because static scaffolding keeps displacing live signal. The platform ends up paying a permanent prompt tariff because it never finished the job of making sessions feel real.

That is why this little default change has outsized significance. It tells you OpenClaw is starting to behave more like an operating environment and less like a stateless wrapper around repeated prompts. Persistent systems should preserve stable context by default and replay it selectively, not ritualistically. The continuation-skip default is a vote for that model.

There is another interesting angle here too. Token efficiency is becoming a competitive feature for agent platforms, but not in the shallow “save money” sense. Efficient platforms can allocate more room to useful work: tool traces, retrieval results, longer user instructions, model self-checking, or simply more conversational runway before compaction kicks in. If a 10-message conversation no longer reinjects bootstrap files nine extra times, the benefit is not just lower cost per run. It is a larger effective working memory for the tasks users actually want done.
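
The arithmetic behind that 10-message example is worth spelling out. The token figure below is invented, sized to the midpoint of the linked issue's 20 to 30 percent estimate against an assumed context budget; only the shape of the calculation matters.

```python
# Back-of-envelope arithmetic for the 10-message example. The bootstrap size
# is an assumption (~25% of an assumed 100k-token context); counts are illustrative.
bootstrap_tokens = 25_000
turns = 10

always_cost = bootstrap_tokens * turns   # bootstrap re-injected on all 10 turns
skip_cost = bootstrap_tokens * 1         # injected once, on the first turn only

saved = always_cost - skip_cost
assert saved == 225_000  # nine redundant reinjections avoided
```

Under these assumptions, the nine skipped reinjections free up more than two full context windows' worth of tokens across the conversation, which is exactly the "larger effective working memory" framing above.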

What operators should actually do

If you run OpenClaw in production or near-production settings, this is a release-note item worth validating deliberately. Measure token usage before and after. Check whether your workflows rely on implicit bootstrap reinjection more than you realized. Pay special attention to sessions that compact, restart, or hand off between surfaces. The new default is probably the right one, but it should come with a sanity check on your most stateful flows.
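
One simple way to do that before/after check is to compare the cost of follow-up turns, since that is where reinjection shows up. The sketch below assumes you can pull per-turn prompt token counts from your own usage logs or billing export; nothing in it is an OpenClaw API, and the numbers are invented.

```python
# Hypothetical before/after check. per_turn_tokens would come from your own
# usage logs or billing export; the figures below are invented.
def continuation_cost(per_turn_tokens: list[int]) -> float:
    """Mean prompt tokens on follow-up turns (turn 1 excluded), the place
    where bootstrap reinjection shows up."""
    follow_ups = per_turn_tokens[1:]
    return sum(follow_ups) / len(follow_ups)

before = [30_000, 29_500, 30_200, 29_800]  # every turn carries bootstrap
after = [30_000, 5_200, 5_400, 5_100]      # only turn 1 carries it

# Turn 1 should cost roughly the same either way; continuations should drop sharply.
assert continuation_cost(after) < continuation_cost(before)
```

If continuation turns do not get cheaper after the default flips, that is a signal your workflows were depending on reinjection in ways worth investigating before trusting the new behavior.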

Second, do not miss the meta-lesson. If your own agent stack is spending large chunks of context re-sending static instructions on every continuation, you probably have a session-design problem disguised as a prompt-engineering habit. Better persistence semantics usually beat larger repeated prompts.

Third, keep an eye on the surrounding review concerns. The Aisle bot's warning about restart or compaction gaps is not nonsense. Any time you reduce redundant safety scaffolding, you are betting that the underlying state model is sound enough to carry the load. That is usually the right bet for a mature runtime, but it is still a bet worth instrumenting.

OpenClaw did not just trim some tokens here. It corrected an assumption. The platform is finally acting like follow-up turns belong to an ongoing session, not a constant re-bootstrap ceremony. That is the kind of boring improvement that makes everything above it feel more competent: lower cost, longer runway, clearer semantics, less wasted context. Nobody will brag about it in a keynote. Everyone running long conversations should still care.

Sources: OpenClaw PR #67447, issue #67419, OpenClaw docs