OpenClaw’s Idle Background Task Drop Shows Why Agent Notifications Need a Durable Wake Path

Agent reliability failures tend to look trivial from the outside because the visible symptom is so small: a missing Telegram message, a user asking “what happened?”, a completed job that apparently vanished into the machinery. That is exactly why OpenClaw issue #89641 is worth taking seriously. The bug report is not about whether a background task ran. It ran. The problem is that the runtime knew it ran, wrote the completion into session state, and still failed to wake the channel that the human was actually using.

That distinction is the line between a chatbot and an operations system. A chatbot can answer when spoken to. An agent that accepts background work has a stronger contract: it must report completion, failure, partial progress, and blocked states without forcing the user to poll it like a flaky CI job. If the only way to discover a completed task is to ask the agent later, the runtime has turned automation into suspense.

The report was filed on June 3, 2026 against OpenClaw 2026.4.24 stable, running on macOS 25.5.0 arm64 with a Telegram direct-message channel and the claude-cli / claude-sonnet-4-6 runtime. The failing path is specific enough to be useful: a Bash tool call runs with run_in_background: true, the user turn ends, the session becomes idle, the task later completes, a task-notification event fires, and OpenClaw processes that notification internally. What does not happen is the part users care about: no outbound Telegram reply is recorded until the user manually asks for status.

The reporter captured two reproductions in session b307c9f1. In the first, the agent answered at 08:17 KST and the session went idle. At 08:19 KST, task bk7n9jql1 completed and fired a task notification. No Telegram message appeared. At 08:28 KST, the user had to ask “what happened?” In the second incident, the agent answered at 09:02 KST, translation and encode tasks completed around 09:06 KST, no Telegram message arrived, and the user again had to ask for status at 09:09 KST.

The active-turn path is not a notification system

The most important detail in the report is that active-turn behavior works differently. If the task notification arrives while a user turn is still active, the agent can fold the completion update into its reply. That is convenient, but it is not a durable notification architecture. It works because the model loop already has a response path open. Once the session is idle, the runtime needs a separate wake-and-deliver path that does not depend on an active conversational turn.

That path has several jobs, and each one should be observable. It has to receive the event, identify the parent session, recover the requester and channel target, decide whether the completion can be sent directly or needs a model turn, deliver externally, mirror the delivery into local transcript state, and record enough metadata that operators can prove what happened later. A JSONL line saying “notification processed” is not sufficient if the human never received the message. Local state is bookkeeping. External delivery is product behavior.
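The stages above can be sketched as a small pipeline that records the outcome of every step. This is a minimal illustration, not OpenClaw's implementation: CompletionEvent, SessionRoute, WakeTrace, wake_and_deliver, and the send and mirror callables are all hypothetical names, and the decision about whether a model turn is needed is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CompletionEvent:
    task_id: str
    session_id: str
    summary: str


@dataclass
class SessionRoute:
    channel: str  # hypothetical, e.g. "telegram"
    target: str   # hypothetical, e.g. a chat identifier


@dataclass
class WakeTrace:
    """One entry per stage, so an operator can see where a wake stopped."""
    task_id: str
    stages: List[Tuple[str, bool, str]] = field(default_factory=list)

    def mark(self, stage: str, ok: bool, detail: str = "") -> bool:
        self.stages.append((stage, ok, detail))
        return ok

    @property
    def user_notified(self) -> bool:
        # Only a successful external send counts; local bookkeeping does not.
        return any(stage == "deliver" and ok for stage, ok, _ in self.stages)


def wake_and_deliver(
    event: CompletionEvent,
    routes: Dict[str, SessionRoute],
    send: Callable[[SessionRoute, str], None],
    mirror: Callable[[str, str], None],
) -> WakeTrace:
    trace = WakeTrace(event.task_id)
    trace.mark("receive", True)

    # Recover the requester and channel target from the parent session.
    route = routes.get(event.session_id)
    if not trace.mark("resolve_route", route is not None,
                      "" if route else "no route for session"):
        return trace

    # External delivery: the step the user actually experiences.
    try:
        send(route, event.summary)
        trace.mark("deliver", True)
    except Exception as exc:
        trace.mark("deliver", False, repr(exc))
        return trace

    # Transcript mirror: recorded separately, and its failure does not
    # undo the fact that the user was notified.
    try:
        mirror(event.session_id, event.summary)
        trace.mark("mirror", True)
    except Exception as exc:
        trace.mark("mirror", False, repr(exc))
    return trace
```

The point of the trace is that "notification processed" and "user notified" become different, queryable facts: a run that stops at resolve_route or deliver is visibly incomplete, even if the event itself was handled without error.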

This is why the issue belongs in the same family as OpenClaw’s recent delivery and sub-agent notification bugs, but should not be collapsed into them. Related issues #44925, #83430, and #84053 cover sub-agent completion loss, media-generation wake failures, and silently dropped background sub-agent notifications. Issue #89641 is narrower: it is about idle Bash task notifications in a Telegram DM with Claude ACP. Narrow scope does not make it a Telegram-only wart, though, and patching it as one would be the wrong fix. The abstraction should work for Slack, Discord, CLI sessions, cron, spawned sub-agents, media generation, shell tasks, and any future runtime that can finish work after the user turn closes.

Practitioners should instrument completion, not just execution

Engineers building on OpenClaw should take a concrete lesson from this: a task is not done until its notification delivery is accounted for. Track the task ID, tool name, run mode, parent session, requester channel, start time, completion time, delivery attempt time, delivery result, transcript mirror result, and whether the user was notified. If a background task completes and no outbound notification is recorded within a short threshold, that should be alertable. “The work succeeded but nobody found out” is an incident, not a harmless logging gap.
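One way to make that alertable is to keep a record per task with the fields listed above and periodically scan for completions that have no successful delivery. The sketch below is an assumption about how such a check could look; TaskCompletionRecord, silent_completions, and the 60-second default threshold are hypothetical and do not come from OpenClaw.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskCompletionRecord:
    task_id: str
    tool_name: str
    run_mode: str                      # e.g. "background"
    parent_session: str
    requester_channel: str
    started_at: datetime
    completed_at: Optional[datetime] = None
    delivery_attempted_at: Optional[datetime] = None
    delivery_ok: Optional[bool] = None  # None means no attempt recorded
    mirror_ok: Optional[bool] = None

    @property
    def user_notified(self) -> bool:
        # Derived from external delivery only, never from the mirror result.
        return self.delivery_ok is True


def silent_completions(
    records: List[TaskCompletionRecord],
    now: datetime,
    threshold: timedelta = timedelta(seconds=60),  # assumed default
) -> List[TaskCompletionRecord]:
    """Return tasks that finished more than `threshold` ago with no delivery."""
    return [
        r for r in records
        if r.completed_at is not None
        and not r.user_notified
        and now - r.completed_at > threshold
    ]
```

With a threshold of about a minute, the first incident in the report (completion at 08:19 KST, no message before the user asked at 08:28 KST) would have surfaced as an alert several minutes before the user had to ask.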

There is also a product lesson. Users delegate background work because they want to stop babysitting the agent. If they must ask for updates manually, they will adapt by not trusting the feature. They will keep terminal windows open, set their own timers, or avoid background mode entirely. The cost is not just one missed Telegram message. It is the slow erosion of confidence that the system can be left alone.

For platform maintainers, the likely shape of the fix is a durable wake path with delivery semantics, not another special case in a channel adapter. The runtime should persist a completion event, enqueue or otherwise schedule delivery to the original requester, use idempotency keys where external side effects are possible, and record both external send and transcript mirror outcomes separately. If delivery fails, retry the delivery. If mirroring fails after delivery, do not pretend the user was not notified. The two states are related, but they are not the same state.
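A minimal sketch of those delivery semantics follows, using an in-memory outbox as a stand-in for a persistent store. Everything here is hypothetical: CompletionOutbox, OutboxEntry, the key format, and the send and mirror signatures are illustrative assumptions, and a real implementation would need storage that survives restarts plus a channel API that honors the idempotency key.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class OutboxEntry:
    idempotency_key: str
    target: str
    text: str
    attempts: int = 0
    delivered: bool = False  # external send outcome
    mirrored: bool = False   # transcript mirror outcome, tracked separately


class CompletionOutbox:
    """In-memory stand-in for a durable completion queue."""

    def __init__(self) -> None:
        self._entries: Dict[str, OutboxEntry] = {}

    def enqueue(self, session_id: str, task_id: str,
                target: str, text: str) -> OutboxEntry:
        # One key per completion, so a duplicate event maps to the same entry.
        key = f"{session_id}:{task_id}:completion"
        return self._entries.setdefault(key, OutboxEntry(key, target, text))

    def drain(self,
              send: Callable[[str, str, str], None],
              mirror: Callable[[str, str], None],
              max_attempts: int = 3) -> None:
        for entry in self._entries.values():
            if not entry.delivered and entry.attempts < max_attempts:
                entry.attempts += 1
                try:
                    send(entry.target, entry.text, entry.idempotency_key)
                    entry.delivered = True
                except Exception:
                    continue  # stays pending; the next drain retries delivery
            if entry.delivered and not entry.mirrored:
                try:
                    mirror(entry.idempotency_key, entry.text)
                    entry.mirrored = True
                except Exception:
                    pass  # mirror can be retried; delivered stays True
```

Keeping delivered and mirrored as separate flags is the design choice that matters: a failed send is retried on the next drain, while a failed mirror never causes a second message to the user or a false "not notified" state.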

ClawSweeper kept #89641 open as a P1 issue with impact:message-loss and impact:session-state, which is the right severity framing. Message loss is not cosmetic in an agent runtime. The message is often the only visible proof that autonomous work completed. Without it, a successful background task becomes indistinguishable from a stalled one until the user interrogates the system.

The editorial read is simple: starting background work is easy. Proving that completion reaches the right human after the session goes idle is the product. If OpenClaw wants background agents to feel dependable, idle notification delivery needs to be a first-class runtime contract, not a fortunate side effect of an active chat turn.

Sources: GitHub issue #89641, GitHub issue #44925, GitHub issue #83430, GitHub issue #84053, GitHub PR #89640