OpenClaw's Update Runner Can Succeed While the Gateway Keeps Running Old Code

An updater that installs the new package but leaves the old gateway running has not finished updating. It has created version skew with a success message.

That is the uncomfortable lesson in OpenClaw issue #79577, which reports an attended gateway.update.run from 2026.5.6 to 2026.5.7. The global package upgrade succeeded, doctor completed, and runtime status claimed the update was okay, yet the live gateway process stayed on gateway.self.version=2026.5.6. The CLI and installed package said one thing. The long-running agent process serving requests said another.

This is not a cosmetic discrepancy. OpenClaw gateways hold credentials, channel bindings, memory state, plugins, cron jobs, approval paths, and user-facing delivery surfaces. If the gateway keeps running old code after an update, the operator does not know which security fixes are active, which migration logic actually ran, or whether the process they are talking to matches the package they just installed.

The restart state machine is the story

The reported environment is specific: macOS Darwin 25.4.0 on arm64, Node v25.9.0, a global npm install under /opt/homebrew/lib/node_modules/openclaw, and gateway bound to 127.0.0.1:18789. The upgrade path was from 2026.5.6 to 2026.5.7. Preflight looked healthy: openclaw update status --json saw the latest release, openclaw config validate passed, and openclaw update --dry-run planned package update, plugin sync, completion refresh, gateway restart, and doctor checks.

The update runner then did the package work. The issue says global update exitCode=0, global install swap exitCode=0, stdout included “replaced openclaw,” and openclaw doctor --non-interactive --fix exited cleanly. But the restart path went sideways. Gateway logs recorded update.run completed ... restartReason=update.run status=ok, then signal SIGUSR1 received, then an error: SIGUSR1 restart ignored (not authorized; commands.restart=false or use gateway tool).

That would be bad enough if it simply failed the restart. The worse part is what happened next. A later first-class gateway.restart request was apparently coalesced as “already in-flight,” leaving the process alive on the old version. The post-update verification showed mixed state: runtimeVersion: "2026.5.7", gateway_reachable: true, gateway_self_version: "2026.5.6", and the same active gateway PID. ClawSweeper’s source review reportedly kept the issue open and described the state machine as source-reproducible: an emitted restart token can remain in-flight if the SIGUSR1 handler rejects authorization, causing later valid restarts to coalesce instead of retrying.
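The failure mode described above can be modeled in a few lines. This is a hypothetical sketch, not OpenClaw's source: the class name, the in_flight flag, and the return strings are all assumptions made for illustration. It only shows how a token that is emitted before authorization is checked, and never cleared on rejection, causes a later valid restart to coalesce.

```python
class NaiveRestartCoordinator:
    """Hypothetical model of the reported failure mode, not OpenClaw's code."""

    def __init__(self):
        self.in_flight = False  # set when a restart token is emitted
        self.restarts = 0       # restarts that actually happened

    def request_restart(self, authorized: bool) -> str:
        if self.in_flight:
            # Coalesce: assume an earlier request is already handling this.
            return "coalesced"
        self.in_flight = True   # token emitted before authorization is checked
        if not authorized:
            # Bug being modeled: the rejection never clears the token.
            return "rejected"
        self.restarts += 1
        self.in_flight = False
        return "restarted"


coordinator = NaiveRestartCoordinator()
first = coordinator.request_restart(authorized=False)  # signal path, rejected
second = coordinator.request_restart(authorized=True)  # later valid request
print(first, second, coordinator.restarts)
```

In this model the second, fully authorized request never restarts anything, which matches the shape of the report: same PID, old version, and a restart that looks handled.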

Package managers are not deployment systems

The operational mistake here is subtle but common. Installing a new package is not the same as deploying a new service instance. For CLI-only tools, those two states are close enough. For agent gateways, they are miles apart. A gateway is a daemon with live sessions, active channels, cached runtime state, and in-flight work. Its version is whatever the process is executing, not whatever npm put on disk.

That distinction matters even more for OpenClaw because its releases frequently land permission, tool-call, delivery, and routing fixes. Version 2026.5.7 tightened Active Memory admin scope, inline skill tool dispatch, native command owner enforcement, Codex approvals, and delivery truthfulness. If a system installs that package but keeps serving 2026.5.6, patch velocity has not translated into patch adoption. The operator may believe they are protected by fixes that are not actually running.

For practitioners, the immediate action is to make live version verification part of every update checklist. Do not stop at package version. Compare CLI version, installed package version, gateway self version, process PID, process start time, and restart outcome. If those diverge, treat the update as incomplete. A useful health check should fail loudly when the live gateway version does not match the expected runtime version after an update.
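A minimal sketch of such a check follows, assuming the operator has already collected the values from wherever their setup exposes them. The GatewaySnapshot type, the verify_update function, and the PID and start-time values are hypothetical; only the version strings come from the issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewaySnapshot:
    self_version: str   # version the live gateway process reports
    pid: int
    started_at: float   # process start time, seconds since epoch


def verify_update(expected, cli_version, package_version, before, after):
    """Return a list of problems; an empty list means the update is complete."""
    problems = []
    if cli_version != expected:
        problems.append(f"CLI reports {cli_version}, expected {expected}")
    if package_version != expected:
        problems.append(f"installed package is {package_version}, expected {expected}")
    if after.self_version != expected:
        problems.append(f"live gateway reports {after.self_version}, expected {expected}")
    # Same PID and same start time means the old process is still serving.
    if (after.pid, after.started_at) == (before.pid, before.started_at):
        problems.append(f"gateway process {after.pid} was not restarted")
    return problems


# PID and start time are made-up values for illustration.
before = GatewaySnapshot(self_version="2026.5.6", pid=4242, started_at=1000.0)
after = GatewaySnapshot(self_version="2026.5.6", pid=4242, started_at=1000.0)
problems = verify_update("2026.5.7", "2026.5.7", "2026.5.7", before, after)
for problem in problems:
    print("FAIL:", problem)
```

With the mixed state from the issue, the CLI and package checks pass while the live-version and restart checks both fail, which is exactly the divergence a package-only check would miss.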

The updater should also be pessimistic in its own reporting. If package replacement succeeds but restart authorization fails, the update result should not be “ok” with a footnote buried in logs. It should say: package installed, gateway restart failed, live gateway still old. That is the state the operator needs. Anything softer trains people to trust the wrong layer.
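As an illustration of pessimistic reporting, the overall status can be derived from the weakest step rather than from the package step alone. The UpdateOutcome type, its field names, and the status strings below are assumptions for this sketch and do not describe OpenClaw's actual result format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateOutcome:
    package_installed: bool
    restart_completed: bool
    live_version: str
    expected_version: str

    @property
    def status(self) -> str:
        if not self.package_installed:
            return "failed"
        # A new package on disk with an old process serving is not "ok".
        if not self.restart_completed or self.live_version != self.expected_version:
            return "incomplete"
        return "ok"

    def summary(self) -> str:
        parts = [
            "package installed" if self.package_installed else "package install failed",
            "gateway restarted" if self.restart_completed else "gateway restart failed",
        ]
        if self.live_version != self.expected_version:
            parts.append(
                f"live gateway still on {self.live_version} "
                f"(expected {self.expected_version})"
            )
        return f"status={self.status}: " + ", ".join(parts)


outcome = UpdateOutcome(
    package_installed=True,
    restart_completed=False,
    live_version="2026.5.6",
    expected_version="2026.5.7",
)
print(outcome.summary())
```

The design choice is that the headline status reflects the live runtime, so a rejected restart cannot hide behind a successful install.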

For maintainers, the bug suggests a concrete fix shape. Restart tokens need lifecycle semantics: emitted, consumed, rejected, cleared, retried, or failed. If a signal-based restart path is rejected for authorization, it should not leave an in-flight token that blocks a later authorized gateway-tool restart. Coalescing is useful for avoiding duplicate restarts; it is dangerous when it turns a rejected restart into a permanent “already handled” state.
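One way to read that lifecycle requirement is sketched below, under the assumption that a token has one non-terminal state and that every terminal outcome frees the in-flight slot. The TokenState and RestartCoordinator names are hypothetical and only a subset of the states is modeled; this is not a patch against OpenClaw.

```python
from enum import Enum
from typing import Optional


class TokenState(Enum):
    EMITTED = "emitted"     # in flight, later requests coalesce onto it
    CONSUMED = "consumed"   # terminal: restart performed
    REJECTED = "rejected"   # terminal: authorization refused


class RestartCoordinator:
    def __init__(self):
        self.active: Optional[int] = None  # token currently in flight, if any
        self.states = {}                   # token id -> TokenState
        self.restarts = 0
        self._next_id = 0

    def emit(self) -> Optional[int]:
        """Emit a restart token, or return None if one is already in flight."""
        if self.active is not None:
            return None  # coalesce onto the in-flight token
        self._next_id += 1
        self.active = self._next_id
        self.states[self.active] = TokenState.EMITTED
        return self.active

    def resolve(self, token: int, authorized: bool) -> TokenState:
        """Finish a token; every outcome clears the in-flight slot."""
        if token != self.active:
            raise ValueError("token is not in flight")
        if authorized:
            self.restarts += 1
            state = TokenState.CONSUMED
        else:
            state = TokenState.REJECTED
        self.states[token] = state
        self.active = None
        return state


fixed = RestartCoordinator()
t1 = fixed.emit()
dup = fixed.emit()                          # coalesced while t1 is in flight
r1 = fixed.resolve(t1, authorized=False)    # rejected, slot is cleared
t2 = fixed.emit()                           # a later request gets a new token
r2 = fixed.resolve(t2, authorized=True)
print(dup, r1.value, r2.value, fixed.restarts)
```

Coalescing still suppresses duplicates while a token is genuinely in flight, but a rejection is recorded as a terminal state instead of lingering as "already handled."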

The bigger category lesson is that agent platforms are becoming deployment systems whether they planned to or not. Once a gateway owns tools, memory, scheduled work, and channels, update semantics are part of the product. Users need to know not just what version is installed, but what version is serving, whether migrations ran against the live runtime, and whether a restart completed. Anything less is wishful ops.

The editorial take: this is not a macOS updater oddity. It is a deployment-contract bug. In an agent platform, “installed” and “running” are different facts. If the gateway did not restart, the update did not finish.

Sources: OpenClaw issue #79577, OpenClaw v2026.5.7 release, OpenClaw gateway docs, OpenClaw issue #68327