openclaw

OpenClaw's Prompt-Clobber Bug Shows Why System Prompt Ownership Is a Runtime Boundary

Anatoliy Kolodkin

28 May 2026 • 4 min read

System prompts are still too often discussed like clever copywriting. In an agent runtime, they are closer to control-plane configuration: identity, policy, tool expectations, channel behavior, and the line between operator instruction and user-controlled context. That is why OpenClaw PR #87812 matters. It fixes a regression where active tool selection could clobber the intended OpenClaw assistant system prompt and replace it with the generic embedded coding-agent harness prompt.

The visible symptom was not subtle if you knew where to look. A live channel that should have started with “You are a personal assistant running inside OpenClaw” instead showed the model a prompt beginning “You are an expert coding assistant operating inside OpenClaw's embedded coding agent harness...” The bad prompt also said “Available tools: (none).” Same environment, same model, same workspace context, byte-identical skills block, different base prompt. That is not a styling regression. That is the runtime changing who the model thinks it is.

The issue, #87807, was filed May 28 and the PR followed minutes later. The reported environment included stable runtime 2026.5.26, dev/current runtime 2026.5.28, and github-copilot/claude-opus-4.7. Skills were explicitly ruled out; the visible <available_skills> section was byte-identical between good and bad runs. The difference was the base prompt path after tool activation.

The private-field shim finally met the runtime it could not control

The root cause is a migration bug with teeth. OpenClaw had internalized more of the embedded agent runtime. The old override helper still wrote fields like _baseSystemPrompt and _rebuildSystemPrompt, which looked plausible if you remembered the previous implementation. But the internalized AgentSession now used non-underscored fields. The shim was still doing work. It just was not doing work that controlled the actual session state.

The order of operations made the failure particularly sharp. attempt.ts applied the intended system prompt, then called session.setActiveToolsByName(...). That method rebuilt and wrote the session system prompt from the embedded coding-agent template. Tool activation became the moment the runtime forgot the surrounding OpenClaw identity and substituted the generic harness identity. The model still answered, which is exactly why this class of bug is dangerous. A hard crash would have been easier to notice.

PR #87812 adds a narrow exact base-prompt path. It introduces AgentSession.setBaseSystemPrompt(...), preserves exact base prompts across active tool changes, routes embedded attempts and embedded compaction through that exact path, and removes the old private-field shim. The patch spans 10 files with +186/-84 lines. Regression coverage checks that exact base prompts survive active tool changes, embedded-attempt prompts remain intact after tool activation, and existing embedded prompt construction still behaves as expected.

Maintainer verification reported 245 focused tests passing across 7 files, tsgo:core passing, git diff --check clean, and autoreview finding no accepted actionable findings. ClawSweeper still flagged a real P1 design risk: AgentSession is re-exported through the plugin SDK, so the new setter may become public plugin API unless maintainers explicitly contain or bless it. That concern is not pedantry. Prompt ownership is sensitive enough that the mutation path should be intentional, documented, and tested as a boundary.

Prompt ownership is runtime governance

The practitioner takeaway is not “be careful with prompts.” That advice is too vague to survive contact with a real system. The takeaway is: define who owns the system prompt after every runtime transition. Tool selection, compaction, replay, model switching, embedded harness setup, channel adaptation, and skill loading can all become prompt mutation points. If the intended prompt must survive those transitions, the invariant belongs in code and tests, not in a helper that pokes private fields and hopes upstream internals stay friendly.

This is especially important because system prompts now carry trust segmentation. Recent OpenClaw work has moved untrusted group prompt metadata out of system prompts and into structured untrusted context. That is the right direction, but it only works if the base prompt itself remains under the expected owner. If an internal runtime template can overwrite the assistant prompt during tool activation, then a safety boundary has quietly become order-dependent.

Operators should add prompt visibility to upgrade checks. Do not assume that because your config file, persona file, or skill list looks correct, the model received the intended base prompt. Capture the model-visible system prompt in staging when upgrading releases that touch embedded runtimes, tool activation, compaction, or session internals. Compare the actual prompt, not just the input files. The issue report’s strongest evidence was the boring kind: same inputs, byte-identical skills, changed runtime output.

Platform builders should avoid two traps. The first is treating system prompts as immutable prose when the runtime can rebuild them dynamically. The second is treating internal field writes as acceptable integration strategy. If prompt mutation is legitimate, expose a narrow method with semantics and tests. If it is not legitimate, make mutation impossible outside the owning layer. Half-private shims are where invariants go to become release notes.

The public API question around setBaseSystemPrompt deserves careful handling. Plugins probably should not be handed casual authority to replace the base prompt. But the internal architecture needs a first-class path because the alternative is worse: invisible mutations, stale shims, and order-of-operations bugs that change agent identity after tools are selected.

The opinionated read: #87812 is not a prompt-engineering anecdote. It is a governance fix. In an agent platform, “who owns the system prompt after tools change?” is a reliability and security question. If the answer is “whatever method most recently rebuilt the session,” you do not have prompt ownership. You have prompt drift with a friendly interface.

Sources: OpenClaw PR #87812, issue #87807, OpenClaw PR #85341, OpenClaw v2026.5.27 release notes

The private-field shim finally met the runtime it could not control

Prompt ownership is runtime governance

Sign up for more like this.