A One-Line DeepSeek Schema Bug Explains Why Tool Calling Still Breaks in Production

A One-Line DeepSeek Schema Bug Explains Why Tool Calling Still Breaks in Production

Tool calling is sold as a clean contract: define a schema, send it to the model, validate the result. OpenClaw issue #86468 is a useful reminder that production rarely gets the clean version. Between your plugin’s source schema and the model’s actual prompt-visible schema sits an adapter layer. When that adapter is wrong, the model is not “bad at tools.” It is being handed a broken map.

The bug is small enough to fit in a sentence. OpenClaw’s normalizeDeepSeekSchema() collapses JSON Schema anyOf unions of string const literals to only the first variant. A tool parameter with seven legal modes becomes a parameter with exactly one legal value: overwrite. The other six options exist in the plugin source, but disappear before DeepSeek sees them.

That is how multi-model support breaks in production: not with dramatic model failure, but with one compatibility transform quietly erasing most of a tool’s interface.

The plugin was correct; the provider schema was not

The reported environment is concrete: OpenClaw 2026.5.20, deepseek-v4-flash through an OpenAI-completions-compatible API, plugin openclaw-lark, tool feishu_update_doc. The tool defines a required mode parameter using TypeBox:

Type.Union([
  Type.Literal("overwrite"),
  Type.Literal("append"),
  Type.Literal("replace_range"),
  Type.Literal("replace_all"),
  Type.Literal("insert_before"),
  Type.Literal("insert_after"),
  Type.Literal("delete_range")
])

TypeBox correctly emits JSON Schema shaped like anyOf with multiple const string variants. That is a normal way to express enum-like parameters. The problem is OpenClaw’s DeepSeek normalizer. According to the issue, it skips copying anyOf, selects nonNullVariants[0], and merges only that selected variant into the normalized schema. The model-visible schema becomes essentially {"const":"overwrite","type":"string"}. Everything else is invisible.

The impact is predictable: valid modes fail validation with messages like mode: must be equal to constant, or the model keeps choosing the only mode it can see. That is not a model-reasoning problem. That is a contract-transmission problem.

PR #86474, created minutes later, proposes the obvious and correct transform: convert anyOf: [{const:"a"}, {const:"b"}, {const:"c"}] into {type:"string", enum:["a","b","c"]}. The proof says local Linux x86_64 on Node 22.22 preserved ['a','b','c'], enum length 3, and kept nullable-union tests unaffected. ClawSweeper initially requested preserving nullable: true for multi-const-plus-null unions and adding a regression test; later labels show proof: supplied and proof: sufficient, with 296 of 297 test files passing and one unrelated flaky timeout.

Provider routing is not just price and latency

This bug matters beyond DeepSeek. The industry keeps treating model routing as a spreadsheet: price, latency, context window, benchmark score, maybe tool-use quality. But tool-use quality is not only a property of the model. It is also a property of the adapter that translates your platform’s schema language into the provider’s accepted schema language. If that adapter mutates the tool surface, the comparison is invalid before the model receives the prompt.

A cheaper model that receives a broken schema will look unreliable. A more expensive model that happens to accept the original schema will look smarter. The platform may then conclude that DeepSeek is bad at this workflow, when the real bug is that DeepSeek saw one operation where the plugin defined seven. That is how adapter bugs become model reputation bugs.

This is especially important for OpenAI-compatible providers. Compatibility at the HTTP layer does not mean full compatibility at the tool-schema layer. Providers differ in how they support anyOf, oneOf, nullable values, enums, nested objects, arrays, descriptions, strictness, and validation feedback. A platform like OpenClaw has to normalize. Normalization is not bad. But once a normalizer exists, it must be treated like a compiler pass: input shape, output shape, edge cases, and regression tests. “Seems fine” is not enough when the transform changes what the agent believes tools can do.

How builders should debug this class of failure

If you run OpenClaw with DeepSeek or another OpenAI-compatible provider, audit tools that use Type.Union([Type.Literal(...)] ), especially parameters named mode, action, operation, or strategy. If a model keeps selecting the first option, ignores valid actions, or validation fails for legal values, inspect the normalized schema sent to the provider. Do not stop at the plugin source. The source schema is not the contract the model saw.

For plugin authors, the defensive workaround is to prefer explicit enums where the provider path supports them, or at least test your tools against the normalized provider schema for each model family you claim to support. This is annoying, but it is currently the price of portability. A tool that works perfectly with Claude or OpenAI may fail under DeepSeek because the adapter simplified a construct too aggressively.

For framework maintainers, provider certification needs schema fixtures. Every provider adapter should have tests for const unions, nullable const unions, plain enums, nested object enums, arrays of enums, discriminated unions, optional fields, and rejection behavior. The tests should assert not only that the provider accepts the schema, but that the semantic option set is preserved. In this case, the regression test should fail if seven literals become one constant. That is the invariant users actually care about.

There is also a code-review lesson here. The most valuable review pressure on #86474 was not a broad “make tool calling better” demand. It was narrow and specific: preserve nullable behavior for multi-const-plus-null unions. That is exactly how adapter code should be reviewed. Compatibility transforms fail at edges. Good review names the edge and asks for proof.

The broader take is blunt: agent frameworks are now compilers for tool intent. They translate plugin schemas, policies, memory, channel context, and runtime state into provider-specific prompts and tool declarations. Compiler bugs do not feel like compiler bugs to users. They feel like dumb agents. That is why this one-line DeepSeek schema issue is worth more attention than its size suggests.

Model routing without schema fidelity is not routing. It is roulette with nicer dashboards.

Sources: OpenClaw issue #86468, OpenClaw PR #86474, TypeBox, JSON Schema combining reference