OpenClaw’s Group-Context Hash Bug Is Agent Amnesia Caused by Volatile Identity

Agent memory does not start with vector databases. It starts with the runtime knowing whether two turns belong to the same session.

OpenClaw issue #82812, filed on May 17 and labeled P1, is a sharp example of what happens when that foundation is wrong. The report says Feishu group messages, and messages from any other channel that injects fresh per-turn metadata, can force a new Claude/tmux CLI session on every consecutive turn because volatile context is included in extraSystemPromptHash. That hash is used to decide whether a CLI-backed session can be reused. If the hash changes, the runtime assumes the session contract changed and starts fresh.

The problem is that group chat context is supposed to change. Timestamps change. Message IDs change. Sender metadata changes. Recent room activity changes. Thread fragments change. A first-turn groupIntro may appear once and then disappear. Treating those values as part of session identity is like using the current weather as your database primary key. It will be unique. It will also ruin your day.

The reported path is concrete. get-reply-run.ts assembles extraSystemPromptParts, concatenates them into extraSystemPrompt, and passes that into prepareCliRunContext. The CLI path computes extraSystemPromptHash = hashCliSessionText(extraSystemPrompt). Then resolveCliSessionReuse invalidates reuse when the stored and current hashes differ. The dynamic inputs named in the issue include inboundMetaPrompt, groupChatContext, and groupIntro. The stable input is groupSystemPrompt.
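To make the failure concrete, here is a minimal TypeScript sketch of that path. It is not OpenClaw's code: the function bodies, the PromptParts shape, the SHA-256 choice, and the helper names buildExtraSystemPrompt and canReuseSession are assumptions for illustration. Only the identifiers hashCliSessionText, groupSystemPrompt, inboundMetaPrompt, groupChatContext, and groupIntro come from the issue.

```typescript
import { createHash } from "node:crypto";

// Assumed stand-in for the helper named in the issue; the real hashing scheme may differ.
function hashCliSessionText(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical shape for the prompt parts the issue names.
interface PromptParts {
  groupSystemPrompt: string; // stable
  inboundMetaPrompt: string; // volatile: per-turn metadata
  groupChatContext: string;  // volatile: recent room activity
  groupIntro?: string;       // volatile: present on the first turn only
}

// Concatenates every part, stable or not, into one string (the problematic step).
function buildExtraSystemPrompt(parts: PromptParts): string {
  return [
    parts.groupSystemPrompt,
    parts.inboundMetaPrompt,
    parts.groupChatContext,
    parts.groupIntro,
  ]
    .filter((p): p is string => typeof p === "string" && p.length > 0)
    .join("\n\n");
}

// Hypothetical reuse check: any hash difference forces a fresh session.
function canReuseSession(storedHash: string | undefined, currentHash: string): boolean {
  return storedHash !== undefined && storedHash === currentHash;
}

const sharedParts = {
  groupSystemPrompt: "You are the team assistant for this room.",
  inboundMetaPrompt: "channel=feishu",
  groupChatContext: "alice: can someone check the deploy?",
};

// Turn 1 carries the one-time intro; turn 2 is otherwise identical.
const firstTurn: PromptParts = { ...sharedParts, groupIntro: "You were just added to this group." };
const secondTurn: PromptParts = { ...sharedParts };

const firstTurnHash = hashCliSessionText(buildExtraSystemPrompt(firstTurn));
const secondTurnHash = hashCliSessionText(buildExtraSystemPrompt(secondTurn));

// The intro disappeared, so the hash changed and reuse is rejected.
const reusedOnSecondTurn = canReuseSession(firstTurnHash, secondTurnHash);
console.log({ reusedOnSecondTurn });
```

Nothing about the conversation's contract changed between the two turns, yet the second turn cannot reuse the session.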

The model looks forgetful because the runtime keeps moving its chair

This is easy to misdiagnose as an LLM memory problem. A user sends a follow-up in a group chat and the agent behaves as if it has lost continuity. Maybe it ignores the prior turn. Maybe the Claude/tmux bridge starts cold again. Maybe the failure gets worse and produces empty output with no useful transcript. The instinct is to blame the model or add more memory retrieval. But the root cause is lower in the stack: the session reuse mechanism is invalidating on context that was never meant to define identity.

Group agents need volatile context. That is the whole point. A good channel agent should know who spoke, whether it was mentioned, which thread it is in, what the recent room activity looked like, and which message it is replying to. But that information belongs in the current turn, not in the stable hash that decides whether the external CLI process represents the same conversation.

The shape of the fix proposed in the issue is the right one: split stable extra prompt material from dynamic turn context. Stable prompt material should include policy and configuration: system prompt, durable channel rules, auth profile or auth epoch, MCP configuration, and runtime settings that change the contract under which the CLI session operates. Dynamic prompt material should include inbound metadata, recent group context, message IDs, thread bodies, room snippets, and turn-specific introductions. The former can invalidate session reuse. The latter should be delivered to the running session as fresh context.
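One way to picture that split is the sketch below. Every name in it (StableSessionContract, DynamicTurnContext, canonicalJson, stableContractHash, renderTurnMessage) is hypothetical, and the field lists are a simplified reading of the categories above rather than OpenClaw's actual types. The point is only that the hash sees the stable contract while the volatile material travels with the turn.

```typescript
import { createHash } from "node:crypto";

// Hypothetical: material that defines the contract the CLI session runs under.
interface StableSessionContract {
  systemPrompt: string;
  channelRules: string;
  authEpoch: string;
  mcpConfig: Record<string, unknown>;
}

// Hypothetical: material that is expected to change on every turn.
interface DynamicTurnContext {
  inboundMeta: string;
  recentGroupContext: string;
  messageId: string;
  groupIntro?: string;
}

// Serializes with sorted object keys so key order cannot change the hash.
function canonicalJson(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .filter(([, v]) => v !== undefined)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalJson(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Only the stable contract feeds the reuse fingerprint.
function stableContractHash(contract: StableSessionContract): string {
  return createHash("sha256").update(canonicalJson(contract), "utf8").digest("hex");
}

// Dynamic context is rendered into the turn itself, not into session identity.
function renderTurnMessage(userText: string, ctx: DynamicTurnContext): string {
  const lines = [
    `[meta] ${ctx.inboundMeta} message_id=${ctx.messageId}`,
    `[recent] ${ctx.recentGroupContext}`,
  ];
  if (ctx.groupIntro) lines.push(`[intro] ${ctx.groupIntro}`);
  lines.push(userText);
  return lines.join("\n");
}

const contract: StableSessionContract = {
  systemPrompt: "You are the team assistant for this room.",
  channelRules: "Reply only when mentioned.",
  authEpoch: "epoch-1",
  mcpConfig: { servers: ["docs"], timeoutMs: 30000 },
};

const turnOneMessage = renderTurnMessage("what broke?", {
  inboundMeta: "sender=alice",
  recentGroupContext: "bob: deploy failed",
  messageId: "m-1",
  groupIntro: "You were just added to this group.",
});
const turnTwoMessage = renderTurnMessage("and the fix?", {
  inboundMeta: "sender=alice",
  recentGroupContext: "bob: rolling back",
  messageId: "m-2",
});

const hashBeforeTurns = stableContractHash(contract);
const hashAfterTurns = stableContractHash({ ...contract });
const hashAfterAuthChange = stableContractHash({ ...contract, authEpoch: "epoch-2" });
console.log(hashBeforeTurns === hashAfterTurns, hashBeforeTurns === hashAfterAuthChange);
```

The two turn messages differ, but the fingerprint stays put until something like the auth epoch actually changes. The canonical serialization is a design choice worth keeping: hashing a plain JSON.stringify of a config object makes identity depend on key insertion order, which is a quieter version of the same bug.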

That distinction is not OpenClaw-specific. Any platform bridging chat channels into CLI tools needs at least three conceptual layers: stable identity, dynamic turn context, and historical transcript. Hash the wrong layer and one of two bad things happens. You either reuse a session after a real policy/auth/tooling change, which is unsafe, or you reset a session every time the room breathes, which makes the agent unreliable. The OpenClaw bug is the second failure mode.

Empty output should be treated as failure, not silence

The ClawSweeper review on the issue is useful because it separates what is source-backed from what still needs live proof. It confirmed that consecutive group turns can differ only by groupIntro while that text still feeds the reuse hash. It also kept the issue open because the empty Claude CLI output path needs explicit handling. That is the right evidence posture.

The empty-output part deserves attention on its own. An agent runtime should not treat “no transcript, no recovered pane output, no payload” as a successful response. Silence is not a valid final state unless the system deliberately chose not to answer and recorded why. Operators need diagnostics: run ID, session name, launch mode, invalidation reason, transcript path, pane tail size, retry attempt, and whether the bridge saw process exit, parser failure, timeout, or stale tmux residue.
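As a rough illustration of that rule, the sketch below classifies a finished CLI run into reply, deliberate decline, or failure. The types and the classifyCliResult function are hypothetical, and the diagnostic fields simply mirror the list above. None of this is taken from OpenClaw's source.

```typescript
// Hypothetical failure causes, mirroring the ones listed in the text.
type BridgeFailureCause = "process_exit" | "parser_failure" | "timeout" | "stale_tmux" | "unknown";

// Hypothetical diagnostics record an operator would want on every failed run.
interface CliRunDiagnostics {
  runId: string;
  sessionName: string;
  launchMode: "fresh" | "reused";
  invalidationReason?: string;
  transcriptPath?: string;
  paneTailBytes: number;
  retryAttempt: number;
  cause: BridgeFailureCause;
}

// Hypothetical raw result: any of the three output sources may be missing.
interface RawCliResult {
  payload?: string;
  transcriptText?: string;
  recoveredPaneOutput?: string;
}

type CliRunOutcome =
  | { kind: "reply"; text: string }
  | { kind: "declined"; reason: string }
  | { kind: "failed"; diagnostics: CliRunDiagnostics };

// Empty output is a failure unless the system recorded a deliberate decision not to answer.
function classifyCliResult(
  raw: RawCliResult,
  diagnostics: CliRunDiagnostics,
  declinedReason?: string,
): CliRunOutcome {
  const text = [raw.payload, raw.transcriptText, raw.recoveredPaneOutput]
    .map((s) => s?.trim() ?? "")
    .find((s) => s.length > 0);

  if (text !== undefined) return { kind: "reply", text };
  if (declinedReason) return { kind: "declined", reason: declinedReason };
  return { kind: "failed", diagnostics };
}

const emptyRunOutcome = classifyCliResult(
  { payload: "   ", transcriptText: undefined, recoveredPaneOutput: "" },
  {
    runId: "run-123",
    sessionName: "claude-room-a",
    launchMode: "fresh",
    invalidationReason: "extra-system-prompt-hash-changed",
    paneTailBytes: 0,
    retryAttempt: 1,
    cause: "timeout",
  },
);
console.log(emptyRunOutcome.kind);
```

Making "failed" a distinct variant forces callers to handle it, so a whitespace-only payload cannot slip through as a successful empty reply.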

This is where observability and memory meet. People talk about agent memory as retrieval quality, summarization, embeddings, and long-term profile storage. Those matter, but they sit above a more basic guarantee: the runtime must preserve continuity when the user expects continuity and reset only when the contract actually changes. If a Feishu room’s recent activity invalidates the CLI session hash, the memory layer is being sabotaged before it has a chance to help.

For practitioners building or operating channel agents, the action item is direct. Inspect what goes into your session identity hashes, cache keys, and reuse fingerprints. Strip out timestamps, message IDs, recent chat snippets, sender display names, thread excerpts, and other volatile context. Keep auth, policy, model runtime, tool configuration, and durable system-prompt changes. Then add tests that run two consecutive group turns where only dynamic context changes and assert the underlying CLI/session identity is reused.
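A regression check for that last step might look roughly like this. FakeCliSessionRegistry, TurnInput, and expectEqual are made-up names, and the in-memory registry stands in for whatever actually launches and tracks CLI sessions in your runtime. The structure is the useful part: two turns with identical stable inputs and different dynamic inputs must land on the same session.

```typescript
// Hypothetical turn input, separated into stable and dynamic halves.
interface TurnInput {
  stable: { systemPrompt: string; authEpoch: string };
  dynamic: { messageId: string; timestamp: string; recentSnippets: string[] };
}

// In-memory stand-in for a session manager; keys sessions on stable input only.
class FakeCliSessionRegistry {
  private readonly sessions = new Map<string, string>();
  private launches = 0;

  acquire(turn: TurnInput): string {
    const key = JSON.stringify([turn.stable.systemPrompt, turn.stable.authEpoch]);
    const existing = this.sessions.get(key);
    if (existing !== undefined) return existing;

    this.launches += 1;
    const sessionId = `session-${this.launches}`;
    this.sessions.set(key, sessionId);
    return sessionId;
  }

  get launchCount(): number {
    return this.launches;
  }
}

// Tiny assertion helper so the sketch needs no test framework.
function expectEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

const registry = new FakeCliSessionRegistry();
const stableInput = { systemPrompt: "You are the team assistant.", authEpoch: "epoch-1" };

const sessionForTurnOne = registry.acquire({
  stable: stableInput,
  dynamic: { messageId: "m-1", timestamp: "2025-01-01T10:00:00Z", recentSnippets: ["bob: hi"] },
});
const sessionForTurnTwo = registry.acquire({
  stable: stableInput,
  dynamic: { messageId: "m-2", timestamp: "2025-01-01T10:00:05Z", recentSnippets: ["bob: hi", "alice: hello"] },
});

// Only dynamic context changed: same session, one launch.
expectEqual(sessionForTurnTwo, sessionForTurnOne, "dynamic-only change must reuse the session");
expectEqual(registry.launchCount, 1, "no extra launch for a dynamic-only change");

// A real contract change (auth epoch) must still start a fresh session.
const sessionAfterAuthChange = registry.acquire({
  stable: { ...stableInput, authEpoch: "epoch-2" },
  dynamic: { messageId: "m-3", timestamp: "2025-01-01T10:00:10Z", recentSnippets: [] },
});
expectEqual(registry.launchCount, 2, "auth change must launch a new session");
```

Pairing the reuse assertion with the auth-change assertion matters, because a test that only checks reuse would also pass against a runtime that never invalidates anything.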

The broader take is that agent amnesia often has a boring cause. Before adding another memory store, make sure your runtime is not invalidating the conversation every time someone speaks.

Sources: OpenClaw issue #82812, OpenClaw issue #69118, OpenClaw issue #81041, OpenClaw issue #82803