OpenClaw’s Codex Secrets-Audit False Positive Is a Small Bug With a Big Governance Lesson

Anatoliy Kolodkin

19 May 2026 • 4 min read

A noisy secrets scanner is not a harmless annoyance. It is a training program for ignoring alarms. Issue #84376 is small in direct severity: OpenClaw flags a non-secret Codex sentinel as plaintext. The governance lesson is much larger. Security tooling only works if operators can afford to keep it enabled.

The bug: openclaw secrets audit reports the Codex provider’s codex-app-server marker as PLAINTEXT_FOUND in every Codex-configured agent. That string is not an API key. It is @openclaw/codex’s CODEX_APP_SERVER_AUTH_MARKER, used after the v5.18 Codex provider migration. The issue reports that a 25-agent BOS produced 19 false Codex findings; the audit summary showed plaintext=33, unresolved=0, shadowed=0, and legacy=2. The remaining 14 findings were attributed to a separate $env.<NAME> false-positive class tracked elsewhere.

That is the kind of output that breaks CI gates, pre-commit hooks, and human trust at the same time. If an operator sees 33 plaintext findings and already knows most are false positives, the scanner stops being a control and becomes a chore. The next real leaked key now has to compete with alert fatigue created by the tool that was supposed to catch it.

Sentinels are not secrets, but suppressions can become vulnerabilities

The obvious fix is to teach isNonSecretApiKeyMarker that codex-app-server is a marker. That is probably necessary. It is not sufficient as a design principle. Security scanners should not globally ignore arbitrary strings just because one official plugin uses them as sentinels. If an untrusted plugin can cause the audit system to suppress a value by choosing a familiar-looking marker, the fix creates a new blind spot.

ClawSweeper’s review makes the better distinction: trust the marker through the official Codex plugin boundary or an equivalently narrow Codex-owned contract. That is how audit semantics should work in agent platforms. The scanner needs to classify values by source and meaning, not by regex anxiety. A real token, a SecretRef, an environment indirection, a bundled provider sentinel, an external plugin marker, and a stale backup secret are different things. Treating all of them as “strings in apiKey fields” is how you either miss leaks or drown operators in fake ones.
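One way to picture that distinction is a check that accepts a marker only when the value and its owner both match. The sketch below is illustrative, not OpenClaw's implementation: the marker string and the @openclaw/codex package name come from the issue, while TRUSTED_MARKERS, ApiKeyValue, the plugin_is_official flag, and classify are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    NON_SECRET_MARKER = "non_secret_marker"
    PLAINTEXT = "plaintext"


# Hypothetical registry: marker string -> the only plugin allowed to vouch for it.
# The marker and package name are taken from issue #84376; the registry is assumed.
TRUSTED_MARKERS = {"codex-app-server": "@openclaw/codex"}


@dataclass(frozen=True)
class ApiKeyValue:
    value: str                # raw string found in the apiKey field
    provider_plugin: str      # package that owns this provider entry (assumed field)
    plugin_is_official: bool  # True only for bundled, verified plugins (assumed field)


def classify(entry: ApiKeyValue) -> Verdict:
    """Treat a value as a marker only if the owning official plugin declares it."""
    owner = TRUSTED_MARKERS.get(entry.value)
    if (
        owner is not None
        and entry.plugin_is_official
        and entry.provider_plugin == owner
    ):
        return Verdict.NON_SECRET_MARKER
    # Anything else, including a familiar-looking marker from another plugin,
    # is still reported as plaintext.
    return Verdict.PLAINTEXT
```

The point of the design is the last branch: a third-party plugin that copies the string codex-app-server gets no suppression, because the exemption is keyed on who owns the field as well as what the field contains.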

The sample finding in #84376 shows the current failure clearly: code PLAINTEXT_FOUND, severity warn, JSON path providers.codex.apiKey, provider codex, and message “models.json provider apiKey is stored as plaintext.” That warning is semantically wrong for the Codex app-server marker. The field name says apiKey, but the value’s meaning is “use the Codex app-server auth path,” not “spend this credential.” Agent runtimes need that distinction baked into their credential model, not patched one exception at a time forever.

The reason this matters more in agent systems than ordinary web apps is scale and replication. OpenClaw stores provider state across agents. A single migration bug can multiply across a fleet, as the issue author’s 19 false Codex findings show. Personal agents, channel agents, subagents, cron agents, and local coding agents may all inherit the same provider catalog shape. One bad audit classification becomes fleet-wide noise.

Audit breadth and audit precision have to ship together

The timing is useful because related PR #84380 pushes the audit the other way. Where #84376 asks the scanner to flag less, the PR asks it to look in more places: it expands scanning to openclaw.json.* and openclaw.json.bak snapshots for plaintext secret targets. That is the right security direction. Active config is not the only place secrets live. Backup files, crash snapshots, timestamped copies, agent-local models.json, and migration leftovers are where old credentials go to avoid being noticed.
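For operators who want to check their own coverage in the meantime, the file set described above can be enumerated with a few lines of Python. This is a rough sketch under assumptions: the file name patterns come from the PR description as quoted here, but the directory layout (one config directory with agent-local models.json files somewhere beneath it) and the audit_targets helper are hypothetical.

```python
from pathlib import Path


def audit_targets(config_dir: Path) -> list[Path]:
    """Collect the active config, its backup copies, and agent-local models.json files."""
    found: set[Path] = set()

    active = config_dir / "openclaw.json"
    if active.is_file():
        found.add(active)

    # "openclaw.json.*" covers openclaw.json.bak and timestamped snapshots alike.
    found.update(p for p in config_dir.glob("openclaw.json.*") if p.is_file())

    # Assumed layout: agent-local catalogs live in subdirectories of config_dir.
    found.update(p for p in config_dir.rglob("models.json") if p.is_file())

    return sorted(found)
```

Comparing this list with the files the audit actually reports on is a cheap way to spot snapshots that are not yet being scanned.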

But broader scanning only works if precision improves at the same time. If OpenClaw expands from active config to backups while still misclassifying official sentinels, the number of warnings goes up and the percentage of useful warnings goes down. That is how security tooling becomes theater: louder, wider, more impressive in a dashboard, and less likely to change behavior.

For practitioners, the next step is not to disable secrets audits. It is to make them structured. Run openclaw secrets audit --json. Group findings by code, jsonPath, provider, and file path. Separate known sentinels from real credentials. Do not globally suppress PLAINTEXT_FOUND. If you must suppress, suppress documented non-secret markers narrowly and keep the suppression file versioned with comments. Scan backup configs and agent-local model files, not just the active root config. In CI, preserve the full JSON output so truncation does not hide the one finding that mattered.
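The triage steps above can be sketched in a short script. Treat it as a starting point, not a description of the real output schema: the field names code, provider, jsonPath, and file are assumptions based on the sample finding in the issue, and SUPPRESSIONS, value_at, and triage are hypothetical helpers. The suppression is deliberately narrow: it applies only when the value actually stored at the reported path is the documented marker.

```python
import json
from collections import Counter
from pathlib import Path

# Versioned, commented suppressions: (code, jsonPath, exact value) -> reason.
SUPPRESSIONS = {
    ("PLAINTEXT_FOUND", "providers.codex.apiKey", "codex-app-server"):
        "Codex app-server auth marker, not a credential (issue #84376)",
}


def value_at(doc, json_path: str):
    """Follow a dotted path through nested dicts; naive about keys containing dots."""
    node = doc
    for part in json_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def triage(findings: list[dict]) -> tuple[Counter, list[dict]]:
    """Group findings and return those not covered by a narrow suppression."""
    groups: Counter = Counter()
    actionable: list[dict] = []
    for f in findings:
        groups[(f["code"], f["provider"], f["jsonPath"])] += 1
        doc = json.loads(Path(f["file"]).read_text(encoding="utf-8"))
        value = value_at(doc, f["jsonPath"])
        # Suppress only when the stored value is a string that exactly matches.
        key = (f["code"], f["jsonPath"], value)
        if not isinstance(value, str) or key not in SUPPRESSIONS:
            actionable.append(f)
    return groups, actionable
```

Because the suppression matches on the exact value, a real key written into providers.codex.apiKey still surfaces as actionable even though the code and path are identical to the sentinel findings.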

The platform-level improvement should be more ambitious than “ignore this string.” Provider auth state needs explicit types. A field should be able to say: this is a SecretRef; this is an env var marker; this is a non-secret official plugin sentinel; this is plaintext and should fail; this is legacy and must be migrated. Once that contract exists, scanners become validators of typed credential state instead of pattern matchers guessing from field names.
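To make that contract concrete, here is one possible shape for it, written as a small tagged union. Every name in the sketch is hypothetical (SecretRef is borrowed from the article's vocabulary, but its fields are invented), and the three outcomes "ok", "migrate", and "fail" are placeholders, not OpenClaw audit codes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SecretRef:
    ref: str          # pointer into a secret store, never the secret itself


@dataclass(frozen=True)
class EnvMarker:
    name: str         # environment variable to resolve at runtime


@dataclass(frozen=True)
class PluginSentinel:
    marker: str       # e.g. "codex-app-server"
    plugin: str       # official plugin that owns the marker


@dataclass(frozen=True)
class Plaintext:
    value: str        # literal credential stored in config


@dataclass(frozen=True)
class Legacy:
    value: str        # old-format entry awaiting migration


Credential = Union[SecretRef, EnvMarker, PluginSentinel, Plaintext, Legacy]


def audit(cred: Credential) -> str:
    """Validate by declared type instead of guessing from the field name."""
    if isinstance(cred, (SecretRef, EnvMarker, PluginSentinel)):
        return "ok"
    if isinstance(cred, Legacy):
        return "migrate"
    if isinstance(cred, Plaintext):
        return "fail"
    raise TypeError(f"unknown credential state: {type(cred).__name__}")
```

With state typed like this, the Codex case stops being an exception to a pattern match: a PluginSentinel passes because of what it is declared to be, and a Plaintext value fails no matter which field holds it.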

This also belongs in every coding-agent security checklist. The industry likes the dramatic parts of agent governance: sandboxing, tool permissions, prompt injection, MCP trust, browser isolation. Fine. But credentials leak through boring lifecycle paths too: auth-provider migrations, backup files, local catalogs, stale generated configs, and false-positive suppression wrappers that nobody reviews. If an agent can read a config, a transcript, or a backup, those files are part of the attack surface.

The editorial take: a secrets scanner that cries wolf across 19 agents is weakening the security posture it was supposed to protect. OpenClaw’s fix should be narrow, source-aware, and official-plugin-specific. The larger lesson is not narrow at all. Agent runtimes need precise credential semantics, because regex-shaped security does not survive contact with fleets of autonomous tools.

Sources: OpenClaw issue #84376, OpenClaw PR #84380, OpenClaw issue #53998, OpenClaw PR #83603
