openclaw

OpenClaw’s OpenAI-Codex Auth Bug Shows Why Token Is Not a Credential Strategy

Anatoliy Kolodkin

21 May 2026 • 5 min read

There are few phrases in agent infrastructure more overloaded than “paste token.” In OpenClaw’s latest OpenAI-Codex auth bug, that ambiguity is the whole incident. A headless operator has a valid OpenAI API key, a configured Codex-backed model, and a normal hosted deployment where browser-based OAuth is not available. They run the obvious command, `openclaw models auth --agent main paste-token --provider openai-codex`, and the runtime dutifully stores something that the downstream Codex harness then rejects as an invalid OAuth ID token.

That is not a dramatic exploit. It is more useful than that. It is a small, sharp example of how agent platforms break when credential vocabulary gets treated as UI copy instead of runtime contract.

Issue #85145 was opened on May 22 against OpenClaw 2026.5.19 in a Docker deployment on Zeabur. The reported setup is exactly the kind production teams actually run: a single-pod headless OpenClaw instance, no browser available for OAuth, and a bring-your-own OpenAI API key intended to power an `openai/gpt-5.5` agent model through the `openai-codex` provider. The reproduction is almost painfully reasonable: pipe `OPENAI_API_KEY` into `openclaw models auth --agent main paste-token --provider openai-codex`.

The failure is equally precise. The CLI writes a credential profile with `type:"token"`. The Codex harness receives that profile through the external-auth path and interprets it as an OAuth ID token, not an API key. The result is `CodexAppServerRpcError: failed to set external auth: invalid ID token format`. The credential value is fine. The deployment assumption is fine. The operator’s intent is fine. The type system is not.

The bug is not the key. It is the noun.

The issue report does useful debugging work. Four credential shapes were tested. An `api_key` profile with an `api_key` field failed because the resolver reads `cred.key`. A `token` profile with a `token` field failed because the Codex harness treated it like an ID token. An `api_key` profile under provider `openai` failed because the `openai-codex` lookup uses exact provider matching before any compatibility bridge can help. The shape that worked was explicit: `{ "type": "api_key", "provider": "openai-codex", "key": "sk-…" }`, paired with OpenClaw config using `mode:"api_key"`.

That detail matters. API keys, OAuth access tokens, refresh tokens, ID tokens, and opaque provider-issued bearer strings are not interchangeable blobs. They have different audiences, expiration rules, validation paths, revocation semantics, and storage expectations. An ID token proves identity to a relying party. An API key authorizes use of an API. A refresh token can mint new access. A provider-specific session token might be valid only inside one harness. Collapsing all of that into “paste-token” is a convenience until the runtime has to decide what to do with the pasted bytes.

In an ordinary CLI tool, this would be a usability bug. In an agent runtime, it becomes governance debt. OpenClaw is not just saving a credential for a one-off command; it is wiring that credential into model selection, provider routing, channel replies, background sessions, and embedded Codex app-server calls. If the profile type is wrong, the failure can surface far away from the auth command that created it. That is how teams end up with the classic agent-platform bug report: “works in direct CLI inference, fails in Telegram,” or “works after manual OAuth, fails in Docker.”

Provider aliases are where auth bugs go to multiply

The OpenAI versus OpenAI-Codex provider split is the second lesson. The report notes that `harnessCanForwardProfile` can forward an `openai` API-key profile to the Codex harness, but only after `listProfilesForProvider("openai-codex")` returns a candidate. That means the compatibility logic exists, but too late in the lookup pipeline to save the headless case. Provider normalization is happening after provider selection has already discarded the credential that could have worked.

This is exactly the kind of seam agent frameworks accumulate as they grow. “OpenAI” can mean the public API. “Codex” can mean the embedded harness. A model ID like `openai/gpt-5.5` may be routed through a provider plugin, a local app-server, or a policy layer. Add OpenRouter, enterprise gateways, local proxies, Copilot-style providers, and remote/headless auth flows, and credential lookup becomes a graph, not a map. If that graph is not normalized early and tested across every runtime entry point, operators get brittle behavior that looks arbitrary from the outside.

The proposed fixes in the issue are sensible: make `paste-token --provider openai-codex` write the API-key profile when the pasted value is an API key, and/or add an explicit `paste-api-key --provider

` command distinct from OAuth token import. The latter is cleaner. It forces the operator and runtime to agree on what class of secret is being stored. It also gives documentation, validation, and error messages a better noun to hang onto.

There is a security angle here too, even though the immediate issue is reliability. Typed credentials make it easier to apply different handling rules. API keys should be redacted and rotated differently from OAuth refresh tokens. ID tokens should be validated for issuer and audience, not blindly accepted as bearer secrets. Headless deployments should be able to disable interactive OAuth expectations entirely. Audit logs should say “API-key profile selected for provider openai-codex” rather than “token present,” because the latter tells an operator almost nothing.

The practitioner takeaway is straightforward: verify the credential mode, not just the credential value. If you are running OpenClaw with Codex-backed models in Docker, on Zeabur, or anywhere browser OAuth is unavailable, inspect the auth profile. API keys should be stored as `type:"api_key"`, under the provider the runtime actually queries, with the field name the resolver actually reads. Do not assume that a command accepting “token” will preserve the semantic type of the secret you pasted.

The bigger product lesson is that agent runtimes need credential contracts as much as they need tool contracts. A model call failing because an `sk-…` key was parsed like a JWT is not the model being flaky. It is the platform asking a string to carry too much meaning. “Token” is a storage convenience. It is not an auth model.

OpenClaw’s maintainers have been cleaning up the Codex path quickly — the previous release window already included fixes for embedded and channel OpenAI-Codex OAuth resolution after direct CLI inference worked but channel replies failed. That context makes #85145 feel less like an isolated papercut and more like the next hardening step in a provider stack that is becoming operationally important. The right end state is boring: explicit credential modes, early provider normalization, headless-first commands, and tests that cover CLI, embedded, and channel execution paths together.

Until then, the safe rule is simple. Never let “paste token” be the whole design. Ask what kind of credential it is, who it is for, and which runtime path is allowed to consume it. Agent platforms that cannot answer those questions should not be trusted with production keys.

Sources: OpenClaw issue #85145, OpenClaw PR #85152, OpenClaw PR #84752

The bug is not the key. It is the noun.

Provider aliases are where auth bugs go to multiply

Sign up for more like this.