OpenClaw’s Discord Message-Loss Bug Shows Tool Failure Is a UI State Problem

The most corrosive agent bug is not the one where nothing happens. It is the one where the answer appears, then disappears. OpenClaw issue #83831 is in that category: a Discord partial-streaming failure where an assistant’s final text renders briefly, then vanishes after a tool-call failure warning lands. The model did produce the answer. The runtime did put it in front of the user. Then the channel lifecycle appears to have treated a later warning as more authoritative than the final response.

That makes this a better story than “Discord adapter bug.” It is a compact example of why tool failure in agent systems is a UI-state problem, not just a backend event. Agents do not merely return strings. They stream drafts, call tools, surface warnings, edit previews, emit progress cards, commit transcript entries, and deliver final messages through transports with their own edit/delete rules. If those surfaces share mutable state without a hard ownership model, a recoverable tool error can become message loss.

A failed command should not erase a successful final answer

The reported environment is specific: OpenClaw 2026.5.18 (50a2481), macOS 26.3.1 on arm64, Node 25.8.1, Discord guild text channel, Pi runtime, and openai-codex/gpt-5.5. The gateway log shape included provider=openai-codex/gpt-5.5 harness=pi configured=unspecified. Discord was configured for partial streaming with channels.discord.streaming.mode = partial.

The reproduction is deliberately boring. Ask the assistant to run an invalid command (openclaw definitely-not-a-real-subcommand) and still send a normal final text response. The shell failure is unsurprising: zsh:1: command not found: openclaw. The assistant’s final text was delivered and appeared momentarily. Then it vanished. The only visible result left in Discord was the warning: ⚠️ 🛠️ run openclaw definitely-not-a-real-subcommand (agent) failed.

That distinction matters. If the command failed and no answer was generated, users know what happened. If the assistant refused or timed out, users can retry. But when a final answer appears and then disappears, the user cannot tell whether generation failed, Discord deleted it, a tool handler overwrote it, a partial preview was cleaned up incorrectly, or final delivery was never committed. It breaks the one thing an operator needs during debugging: a stable observation of what the system did.

The issue is labeled P1, impact:message-loss, clawsweeper:source-repro, clawsweeper:fix-shape-clear, clawsweeper:queueable-fix, and issue-rating: 🦞 diamond lobster. The useful triage signal is that ClawSweeper did not collapse this into “the tool failed.” It kept the issue open and identified a source-supported path where a finalized Discord partial-stream preview can be deleted when a later tool-error warning payload is delivered. That is the right diagnosis category.

The invariant should be boring

The invariant OpenClaw needs here is simple: once a final answer has been delivered durably, later tool-status cleanup should not erase it. Tool warnings can be adjacent messages, progress cards, attached diagnostics, or transcript entries. They should not supersede a final assistant text unless the runtime explicitly marks that final answer invalid. If cleanup is necessary, clean up the preview artifact, not the durable final response.
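One way to make that invariant concrete is to model the channel view as a small reducer. The sketch below is illustrative only: the event names, the ConversationView shape, and applyEvent are assumptions made for this example, not OpenClaw's actual types. The point it shows is that a tool warning only appends, and the final answer is dropped by exactly one explicit event.

```typescript
// Hypothetical event and state shapes; not taken from the OpenClaw codebase.
type ChannelEvent =
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string }
  | { kind: "tool_warning"; text: string }
  | { kind: "invalidate_final" }; // the only event allowed to drop a final answer

interface ConversationView {
  preview: string | null; // transient draft, safe to clean up
  final: string | null;   // durable once set
  warnings: string[];     // appended alongside, never replaces other fields
}

const emptyView: ConversationView = { preview: null, final: null, warnings: [] };

function applyEvent(view: ConversationView, event: ChannelEvent): ConversationView {
  switch (event.kind) {
    case "partial":
      // Late partials are ignored once a final answer has been committed.
      return view.final !== null ? view : { ...view, preview: event.text };
    case "final":
      // Committing the final answer is the moment the preview gets cleaned up.
      return { ...view, preview: null, final: event.text };
    case "tool_warning":
      // Warnings are adjacent context; they do not touch preview or final.
      return { ...view, warnings: [...view.warnings, event.text] };
    case "invalidate_final":
      return { ...view, final: null };
  }
}

// Usage: the reported sequence, replayed against the reducer.
const events: ChannelEvent[] = [
  { kind: "partial", text: "Running the command" },
  { kind: "final", text: "The command failed; here is what I can tell you." },
  { kind: "tool_warning", text: "run openclaw definitely-not-a-real-subcommand (agent) failed" },
];
const view = events.reduce(applyEvent, emptyView);
// view.final still holds the answer text and view.warnings has one entry.
```

Keeping the transition pure makes the rule easy to test without a live channel: any event ordering can be replayed and the final field inspected afterwards.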

This is where “vibe debugging” becomes operations. The visible symptom is weird, but the underlying class is common: multiple producers believe they own the same UI artifact. The assistant final answer, the partial-stream preview, and the tool-failure warning are all valid events. The bug is that the channel adapter appears to have a lifecycle path where a later warning can win over final text. That is not a model-quality issue. GPT-5.5 did not forget the response. The delivery system lost track of which artifact was allowed to be mutable.

Nearby OpenClaw work reinforces the pattern. PR #83734 fixed Control UI live tool cards for externally started runs by routing session.tool Gateway frames through the existing tool-stream handler. Different surface, same product expectation: users should see tool progress accurately without needing a history reload or losing final state. Issue #83832, by contrast, was closed as already implemented after hashed shipped runtime chunks showed the alleged missing Mattermost progress path existed. That contrast is useful. Some progress reports are artifact-inspection mistakes; #83831 looks like a real channel lifecycle bug.

For practitioners, the action item is not to memorize this Discord edge case. It is to add regression tests for the exact shape: tool fails, assistant final text exists, channel uses partial streaming, warning is delivered, final reply survives. Run that across Discord, Slack, Telegram, Mattermost, and WebChat, because they are not the same transport with different logos. Each has different edit/delete semantics, message-length limits, preview behavior, and retry windows.
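A regression test for that shape can run against an in-memory transport. Everything in this sketch is hypothetical: FakeTransport, StreamingAdapter, and createAdapter are stand-ins written for illustration, and the "leaky" variant is only a guess at the failure class described in the issue, not a copy of OpenClaw's Discord adapter. A real suite would supply one adapter factory per channel and run the same scenario against each.

```typescript
interface SentMessage { id: number; text: string }

// In-memory stand-in for a channel transport (hypothetical).
class FakeTransport {
  private nextId = 1;
  private messages: SentMessage[] = [];
  send(text: string): number {
    const id = this.nextId++;
    this.messages.push({ id, text });
    return id;
  }
  edit(id: number, text: string): void {
    this.messages.forEach((m) => { if (m.id === id) m.text = text; });
  }
  remove(id: number): void {
    this.messages = this.messages.filter((m) => m.id !== id);
  }
  visibleTexts(): string[] {
    return this.messages.map((m) => m.text);
  }
}

// Minimal adapter surface that the scenario drives (hypothetical).
interface StreamingAdapter {
  onPartial(text: string): void;
  onFinal(text: string): void;
  onToolWarning(text: string): void;
}

// With releasePreviewOnFinal = false the finalized preview is still tracked
// as a draft, so warning delivery deletes it. With true, finalizing gives up
// ownership of the message and later cleanup cannot reach it.
function createAdapter(transport: FakeTransport, releasePreviewOnFinal: boolean): StreamingAdapter {
  let draftId: number | null = null;
  const writePreview = (text: string): void => {
    if (draftId === null) draftId = transport.send(text);
    else transport.edit(draftId, text);
  };
  return {
    onPartial: writePreview,
    onFinal(text) {
      writePreview(text); // final text lands in the preview message
      if (releasePreviewOnFinal) draftId = null;
    },
    onToolWarning(text) {
      if (draftId !== null) transport.remove(draftId); // "clean up the draft"
      draftId = null;
      transport.send(text);
    },
  };
}

// The exact shape: tool fails, final text exists, warning lands, final survives.
function finalSurvivesToolWarning(makeAdapter: (t: FakeTransport) => StreamingAdapter): boolean {
  const finalText = "The command failed, but here is the answer.";
  const transport = new FakeTransport();
  const adapter = makeAdapter(transport);
  adapter.onPartial("Running the command");
  adapter.onFinal(finalText);
  adapter.onToolWarning("run openclaw definitely-not-a-real-subcommand (agent) failed");
  return transport.visibleTexts().indexOf(finalText) !== -1;
}
```

The scenario function takes a factory rather than an adapter instance so that each channel gets a fresh transport, which keeps per-channel edit and delete behavior from leaking between runs.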

For platform implementers, separate durable responses from transient status surfaces. Treat progress messages as progress messages. Treat final assistant output as final assistant output. Store enough IDs and state to know which Discord message is a draft, which is a final, and which is a diagnostic. If a later warning has to be displayed, it should append context, not rewrite history.
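At the message-ID level, that separation might look like a small ledger that records the role of every message the adapter has sent. This is a minimal sketch under stated assumptions: MessageLedger, MessageRole, and planWarning are hypothetical names, and the plan object is an illustration rather than a real Discord API call. It shows deletion gated on role, and a warning planned as a new message that references the final instead of editing or removing it.

```typescript
type MessageRole = "draft" | "final" | "diagnostic";

// Hypothetical per-conversation record of which transport message plays which role.
class MessageLedger {
  private roles: { [messageId: string]: MessageRole } = {};
  private latestFinal: string | undefined = undefined;

  record(messageId: string, role: MessageRole): void {
    this.roles[messageId] = role;
    if (role === "final") this.latestFinal = messageId;
  }

  // Promote a streamed preview in place when the final text lands in the same message.
  finalize(messageId: string): void {
    if (this.roles[messageId] !== "draft") {
      throw new Error("only a draft can be finalized: " + messageId);
    }
    this.record(messageId, "final");
  }

  // Only drafts are disposable; finals, diagnostics, and unknown IDs are not.
  canDelete(messageId: string): boolean {
    return this.roles[messageId] === "draft";
  }

  finalId(): string | undefined {
    return this.latestFinal;
  }
}

// A warning is always a new message. It may point at the final answer,
// but it never edits or deletes it.
interface WarningPlan { action: "send"; text: string; replyTo?: string }

function planWarning(ledger: MessageLedger, text: string): WarningPlan {
  const target = ledger.finalId();
  return target === undefined
    ? { action: "send", text }
    : { action: "send", text, replyTo: target };
}

// Usage: a preview is streamed, finalized in place, then a tool warning arrives.
const ledger = new MessageLedger();
ledger.record("msg-1", "draft");
ledger.finalize("msg-1");
const plan = planWarning(ledger, "tool run failed");
// ledger.canDelete("msg-1") is now false, and plan.replyTo is "msg-1".
```

Refusing deletion for unknown IDs is a deliberate choice in this sketch: if the adapter has lost track of what a message is, leaving it in place is the safer failure.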

The broader lesson is that agent UX needs idempotency and ownership at the conversation layer. Tool failure is recoverable. A UI that erases the answer after showing it turns recovery into archaeology. That is the wrong trade.

Sources: GitHub issue #83831, OpenClaw v2026.5.19-beta.1, PR #83734, issue #83832