openclaw

OpenClaw 5.12’s Xiaomi MiMo Regression Is a Compatibility Warning for Every OpenAI-Compatible Agent Stack

Anatoliy Kolodkin

14 May 2026 • 5 min read

“OpenAI-compatible” is one of those labels that sounds like an engineering contract and behaves more like a weather forecast. Useful, directional, and very capable of ruining your day if you treat it as precise.

OpenClaw issue #81969, opened on May 15 at 00:47 UTC and closed ten minutes later as already fixed on current main, is a small compatibility regression with a large warning label. A user running OpenClaw 2026.5.12 against Xiaomi MiMo models through an OpenAI-compatible completions endpoint found that requests with tools started failing with 400 Param Incorrect. The same payload worked when two fields were removed from the tool definition: strict: true and additionalProperties: false.

That is the whole bug in one sentence. It is also the whole portability problem in one sentence. OpenAI-compatible providers may implement enough of /v1/chat/completions to satisfy ordinary chat clients, but agent runtimes are not ordinary chat clients. They push function schemas, streaming usage, long histories, reasoning fields, fallback chains, replay behavior, and tool-choice semantics. Compatibility layers rot at those edges first.

Strict tools are good. Global strict tools are brittle.

The reported environment is specific: OpenClaw 2026.5.12 at commit f066dd2, provider base URL https://token-plan-cn.xiaomimimo.com/v1, API type openai-completions, and MiMo models mimo-v2.5 and mimo-v2.5-pro. The user’s custom config listed large context windows — 262144 tokens for mimo-v2.5 and 1048576 for mimo-v2.5-pro — with maximum tokens of 8192 and 32000, respectively. Both were marked as reasoning-capable.

The reporter isolated the failure well. stream: true worked. stream_options.include_usage worked. store: false, reasoning_effort: "medium", tool_choice: "auto", content-array format, system prompts, tools, thinking, and streaming all worked in isolation or combinations. What broke the endpoint was OpenClaw’s tool conversion path adding OpenAI-style strictness to function definitions. A curl request with strict: true failed with an empty response, connection termination, or 400 Param Incorrect. The same request without strict: true and without additionalProperties: false succeeded.

None of this means strict tool schemas are bad. They are often the right move. Strict schemas reduce ambiguity, constrain tool-call arguments, make validation failures earlier and clearer, and give model outputs a narrower lane. In agent systems, narrower lanes are usually security wins. A tool that accepts only the fields it expects is easier to audit than a tool that accepts arbitrary JSON and hopes the model behaves.

The mistake is applying a native OpenAI contract globally just because a provider speaks an OpenAI-shaped API. Native OpenAI can get native OpenAI semantics. A custom base URL should not inherit every new OpenAI-specific schema convention unless it has declared support for that convention or the operator has opted in. The correct boundary is provider capability, not API family branding.

The fix should be capability-based, not Xiaomi-shaped

ClawSweeper closed #81969 as already implemented on current main, saying OpenClaw now classifies non-native custom base URLs as proxy/custom routes and omits function.strict for those tool payloads. The recommendation was to keep route-based strict-tool gating rather than add a Xiaomi-specific special case.

That is the right instinct. Vendor-specific hacks are tempting because they make the current bug go away. They also turn the provider layer into a cabinet of handwritten exceptions that nobody trusts six months later. A route- and capability-based model is more honest: OpenAI-native routes can use OpenAI-native strict tool behavior; custom/proxy routes stay on the compatibility path unless they explicitly prove support for stricter semantics. The platform should know the difference between “this endpoint has the same URL shape” and “this endpoint accepts the same evolving function-calling contract.”

This matters beyond MiMo. The same class of bug will show up across Qwen, DeepSeek, Ollama-fronted OpenAI-compatible servers, vLLM, LM Studio, private inference gateways, and enterprise model routers. Some will accept strict tools but reject a reasoning field. Some will stream token usage differently. Some will preserve reasoning traces under one key and hang if the runtime strips them. Some will claim tool support and then validate schemas with a subtly different JSON Schema subset. Agent runtimes hit all of those paths because they are orchestration systems, not chat boxes.

OpenClaw already has another MiMo-adjacent signal here. Related PR #81951, opened May 14, preserves reasoning_content for Xiaomi MiMo models during OpenAI-compatible replay because MiMo multi-turn conversations can hang with LLM idle timeout (120s) when reasoning content is stripped. Related issue #81923 describes a broader 5.12 custom-provider breakage pattern: schema rejection on a token-plan MiMo provider, followed by fallback failure with an Ollama auth regression, producing a misleading “missing API key” user-facing error.

That last detail is the part operators will feel. The primary failure was provider schema compatibility. The fallback then hit a separate auth-resolution problem. The final visible error pointed at a missing API key. That is a debugging trap. Once a runtime supports provider fallback, it needs to preserve the causal chain: primary failed because strict tool schema was rejected; fallback failed because auth could not resolve; final response should tell the operator both. Collapsing the cascade into an auth complaint sends people fixing the wrong layer.

What practitioners should do before the next upgrade

If you run custom OpenAI-compatible providers, the action item is not “avoid strict tools forever.” It is to build compatibility fixtures. Test the exact behaviors your agent runtime depends on: tool calls with strict and non-strict schemas, nested object parameters, additionalProperties behavior, streaming usage chunks, multi-turn replay with reasoning fields preserved and stripped, fallback ordering, and final error reporting. Do it before upgrading the agent platform, not after your assistant starts blaming API keys.

For local and private coding-agent stacks, this should become part of release qualification. A provider that works for “hello world” chat may still fail when the agent needs to call a tool, recover from a timeout, replay a reasoning trace, or fall back to a secondary backend. The more heterogeneous the model fleet, the more important these tests become. Local/private agent stacks are attractive precisely because they let teams mix providers, models, and gateways. That flexibility is real value. It also means compatibility cannot be assumed from the presence of an OpenAI-shaped endpoint.

For platform builders, the design lesson is capability discovery plus conservative defaults. Provider manifests should describe support for strict function schemas, reasoning replay fields, streaming usage, content-array formats, tool-choice modes, and error shapes. When that information is missing, the runtime should choose the least surprising compatibility mode and make the tradeoff observable. If an operator opts into strict tools on a custom route, log it. If the runtime downgrades strictness for compatibility, log that too. Silent mutation is how portability bugs become support tickets.

The editorial take: OpenAI-compatible provider support is not a boolean. It is a matrix of tool schema, streaming, reasoning replay, auth, fallback, and diagnostics behavior. OpenClaw’s MiMo regression is small and apparently already fixed on main, but the warning is not small: agent runtimes sit where API compatibility claims meet the weird parts of real execution. That is exactly where “compatible” stops being enough.

Sources: OpenClaw issue #81969, OpenClaw issue #81923, OpenClaw PR #81951, OpenClaw v2026.5.12, OpenAI function calling docs

Strict tools are good. Global strict tools are brittle.

The fix should be capability-based, not Xiaomi-shaped

What practitioners should do before the next upgrade

Sign up for more like this.