OpenClaw’s Session-Key Race Is the Multi-Agent Memory Bug Every Chat API Eventually Has to Face

A session key in a stateful agent system is not metadata. It is a lock, a routing contract, and a promise that the next turn belongs to the same conversation. Issue #84575 is a clean example of what happens when that promise gets even slightly fuzzy: two concurrent OpenAI-compatible chat requests using the same x-openclaw-session-key can produce a second response that behaves like a fresh isolated session. The user experience is brutally simple: “the assistant forgot everything when I double-tapped send.”

That is the kind of bug that sounds small to API designers and catastrophic to anyone building a real assistant. A stateless chat endpoint can shrug at concurrency. A stateful agent cannot. Agents carry pinned facts, tool results, child work, delivery routes, memory summaries, approval state, and half-finished plans. If two turns with the same key are allowed to run as separate worlds, the model may still answer fluently. It is just answering from the wrong universe.

OpenAI-compatible JSON is not enough

The reported failing path is OpenClaw’s OpenAI-compatible /v1/chat/completions endpoint. The external chat surface, called finn, derives a stable key from (agentId, channelId) in the form agent:<agentId>:finn:<channelId>. That matters because the reporter did the first thing maintainers usually ask for: they verified the upstream router’s key derivation is deterministic. The same logical conversation is sending the same key. The suspicion therefore moves back into OpenClaw’s request lane handling.
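
As a minimal sketch of what a deterministic derivation in that form could look like: the function name and the empty-value check are assumptions for illustration, not finn's actual code, which the issue does not show.

```python
def derive_session_key(agent_id: str, channel_id: str) -> str:
    """Build a key in the agent:<agentId>:finn:<channelId> form from the report.

    Hypothetical helper: the real router's implementation is not shown in the issue.
    """
    if not agent_id or not channel_id:
        raise ValueError("agent_id and channel_id must be non-empty")
    # No timestamps, counters, or random parts: same inputs, same key.
    return f"agent:{agent_id}:finn:{channel_id}"


# Two requests from the same logical conversation produce the same key.
key_a = derive_session_key("a1", "c9")
key_b = derive_session_key("a1", "c9")
```

A key built this way is only as stable as its inputs, which is why the reporter's check of the upstream derivation matters before suspecting the runtime.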

The observed failure shape is exactly the one serious chat products must defend against. The first prompt starts a long, multi-tool turn. While it is still streaming or processing, the user sends a follow-up. The second request receives a reply, but the agent inside that run appears to have no memory of prior context. One observation showed session_status reporting Context: 0/1.0m (0%) despite roughly 15 prior turns normally loading on the first request. That is not a cosmetic counter problem. It is a sign that the session scope may have split under concurrency.

ClawSweeper’s initial triage was appropriately cautious. The issue is open and labeled P2, impact:session-state, and needs-info; it is not being treated as a confirmed platform-wide meltdown. The bot noted that current main already serializes common embedded and ACP runtime paths by session key, so the remaining question is whether the OpenAI-compatible API/session-key lane bypasses that serialization. That is the right posture: the user-level symptom is convincing, but the fix needs a narrowed reproduction proving which ingress path is missing the lane discipline.

Rapid-send UX turns into memory semantics

The practical fix is conceptually boring: serialize per session key. If a request arrives while another turn for that key is active, the runtime should queue it, reject it with a clear retryable conflict, or explicitly coalesce it into the active turn if the product supports mid-turn steering. What it must not do is silently create an isolated run that happens to share a key-shaped string. Silent isolation is worse than refusal because it produces a plausible answer while violating the conversation contract.
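
The queue and reject options can be sketched with one lock per session key. This is an illustration in Python's asyncio, not OpenClaw's implementation: SessionLanes, SessionBusy, and the demo turn are hypothetical names, and a single-process lock says nothing about how a multi-process deployment would coordinate.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class SessionBusy(Exception):
    """Raised when a turn is already active for the key (a retryable conflict)."""


class SessionLanes:
    """One asyncio.Lock per session key. Hypothetical, not OpenClaw internals."""

    def __init__(self) -> None:
        self._locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def run_queued(self, key: str, turn: Callable[[], Awaitable[T]]) -> T:
        # Option 1: wait behind the active turn. asyncio.Lock wakes waiters in
        # the order they started waiting, so turns run in arrival order.
        async with self._locks[key]:
            return await turn()

    async def run_or_reject(self, key: str, turn: Callable[[], Awaitable[T]]) -> T:
        # Option 2: refuse instead of waiting, so the caller can retry later.
        lock = self._locks[key]
        if lock.locked():
            raise SessionBusy(key)
        async with lock:
            return await turn()


async def demo() -> tuple[list[str], bool]:
    lanes = SessionLanes()
    transcript: list[str] = []
    key = "agent:a1:finn:c9"  # example key in the format from the report

    async def turn(name: str, delay: float) -> str:
        transcript.append(f"{name}:start")
        await asyncio.sleep(delay)  # stands in for a long multi-tool turn
        transcript.append(f"{name}:end")
        return name

    first = asyncio.create_task(lanes.run_queued(key, lambda: turn("first", 0.05)))
    second = asyncio.create_task(lanes.run_queued(key, lambda: turn("second", 0.0)))
    await asyncio.sleep(0)  # let both tasks reach the lane

    rejected = False
    try:
        await lanes.run_or_reject(key, lambda: turn("third", 0.0))
    except SessionBusy:
        rejected = True

    await asyncio.gather(first, second)
    return transcript, rejected
```

Running asyncio.run(demo()) shows the second turn starting only after the first has ended, and the third attempt being refused rather than run in isolation. Coalescing into the active turn is left out because it depends on product-specific steering support.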

This is where “OpenAI-compatible” gets more complicated than matching request and response schemas. The OpenAI API shape was designed around HTTP requests that can be retried, streamed, and scaled behind infrastructure that often treats calls independently. An agent platform exposing an OpenAI-shaped endpoint is offering something stronger: a stateful conversation surface with memory continuity. That means compatibility includes session-key semantics, active-turn behavior, ordering, idempotency, and observability. If those are absent, the endpoint is JSON-compatible but product-incompatible.

For builders integrating OpenClaw behind custom routers, the checklist is immediate. Send deterministic session keys, but do not stop there. Fire a long-running prompt, send a second message 30 to 60 seconds later while the first turn is still running, and verify the second turn sees the same pinned context, prior tool results, and memory scope. Log the external channel id, OpenClaw session id, request id, and session key together. You should be able to answer whether the second request was queued, rejected, coalesced, or accidentally isolated. If your UI allows rapid-send, add an “agent is still working” affordance instead of letting users stack messages into a race and hoping the backend sorts it out.
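
One way to make that logging actionable is a single correlated record per turn plus a rough check for the reported symptom. The sketch below is an assumption-heavy illustration: TurnRecord, its field names, and the context_tokens heuristic are hypothetical, loosely modeled on the zero-context observation in the report, and how a router would actually obtain these values is not covered here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TurnRecord:
    channel_id: str      # external channel id from the router
    session_key: str     # value sent as x-openclaw-session-key
    session_id: str      # hypothetical: the session id the runtime reports
    request_id: str
    started_at: float    # seconds on any consistent clock
    finished_at: float
    context_tokens: int  # hypothetical: context loaded for this turn


def log_line(record: TurnRecord) -> str:
    """Emit all correlation fields together as one JSON line."""
    return json.dumps(asdict(record), sort_keys=True)


def looks_isolated(first: TurnRecord, second: TurnRecord) -> bool:
    """Heuristic only: same key, overlapping turns, second turn loaded no context."""
    same_key = first.session_key == second.session_key
    overlapped = first.started_at <= second.started_at < first.finished_at
    lost_context = first.context_tokens > 0 and second.context_tokens == 0
    return same_key and overlapped and lost_context


# Made-up example values for illustration.
first = TurnRecord("c9", "agent:a1:finn:c9", "s-1", "req-1", 0.0, 90.0, 42000)
second = TurnRecord("c9", "agent:a1:finn:c9", "s-2", "req-2", 45.0, 50.0, 0)
suspicious = looks_isolated(first, second)
```

A positive result is a prompt to inspect the turn, not proof of a split session; a conversation that is legitimately new would also start with no context.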

For platform maintainers, the design question is not only “add a mutex.” It is where the mutex lives and how it reports. Native channels, ACP sessions, OpenAI-compatible API, cron, subagents, webhooks, and future router layers should not each invent their own active-turn semantics. A session key should map to one authoritative lane. The lane should expose state: active, queued, rejected, interrupted, completed, failed. It should also preserve ordering across delivery and transcript writes. Otherwise the platform will fix one ingress path while another quietly reintroduces split-brain memory.
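
The lane states named above can be written down as an explicit state machine so every ingress path shares one definition. The state names come from the paragraph; the allowed transitions are an assumption made for this sketch, not a description of OpenClaw's runtime.

```python
from enum import Enum


class LaneState(Enum):
    QUEUED = "queued"
    ACTIVE = "active"
    REJECTED = "rejected"
    INTERRUPTED = "interrupted"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table: a turn waits, runs, then ends exactly once.
ALLOWED: dict[LaneState, frozenset[LaneState]] = {
    LaneState.QUEUED: frozenset({LaneState.ACTIVE, LaneState.REJECTED}),
    LaneState.ACTIVE: frozenset(
        {LaneState.COMPLETED, LaneState.FAILED, LaneState.INTERRUPTED}
    ),
    LaneState.REJECTED: frozenset(),
    LaneState.INTERRUPTED: frozenset(),
    LaneState.COMPLETED: frozenset(),
    LaneState.FAILED: frozenset(),
}


def can_advance(current: LaneState, nxt: LaneState) -> bool:
    return nxt in ALLOWED[current]


def advance(current: LaneState, nxt: LaneState) -> LaneState:
    """Return the new state, or raise if the transition is not allowed."""
    if not can_advance(current, nxt):
        raise ValueError(f"illegal lane transition: {current.value} -> {nxt.value}")
    return nxt
```

With a shared table like this, a turn that is not queued or active has a terminal state that can be reported back to the caller, which is the observability the paragraph asks for.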

This issue also connects to the nearby delivery problems in #84053, #84489, and PR #84371. Subagents can finish work but fail to notify parents. Generated media can complete without reaching the requester. OpenAI-compatible turns may share a key yet lose context. Different surfaces, same underlying theme: agent systems need strong lifecycle contracts. Work is not merely “accepted.” It belongs to a session, a principal, a channel, and an ordered state machine.

The editorial takeaway is that memory is a concurrency primitive. We talk about agent memory as retrieval, embeddings, summaries, and long context windows. Those matter. But before any of that helps, the runtime has to guarantee that two messages addressed to the same conversation actually enter the same conversation. If double-tapping send creates a new universe, your memory layer is not failing because it forgot. It is failing because the platform let the conversation fork without telling anyone.

Sources: OpenClaw issue #84575, issue #84489, issue #84053, PR #84371