agentic-coding

Cline v3.85.0 Turns Model Support Into Runtime Routing Work

Anatoliy Kolodkin

25 May 2026 • 6 min read

Cline v3.85.0 looks like a routine provider-update release if you read it as a changelog. New model IDs, a couple of gateway fixes, a webhook URI, dependency bumps. Fine. Ship it.

But that is the wrong read. The useful story is that coding agents are quietly becoming model routers, and model routing is no longer a dropdown problem. Once an agent is expected to edit files, call tools, preserve reasoning state, price a task, run through enterprise gateways, and keep a multi-turn session coherent, every provider-specific quirk becomes part of the runtime contract. Cline’s May 25 release is interesting because it is almost entirely about that contract.

The release adds GPT-5.5 support through SAP AI Core, DeepSeek V4 Flash and DeepSeek V4 Pro through the direct DeepSeek provider, Gemini 3.5 Flash through Gemini and Vertex, an /lg-task URI webhook integration for LG dashboard flows, a Vertex AI global-endpoint fix for Claude models, next-generation prompts and native tool calling for Poolside Laguna models, and updated diff and protobufjs dependencies. The GitHub Releases API puts v3.85.0 at May 25, 2026, 18:12:51 UTC, with the repo pushed again at 19:42:31 UTC. During research, Cline had roughly 62,301 stars, 6,519 forks, and 946 open issues — not a toy project, and not a place where provider integrations can be hand-waved.

Adding a model is easy. Preserving its semantics is the work.

The DeepSeek change is the cleanest example. PR #11027 adds deepseek-v4-flash and deepseek-v4-pro to Cline’s static DeepSeek registry with context-window, max-output, prompt-cache, reasoning-support, and token-pricing metadata. That sounds like configuration until you hit the reason the PR routes the V4 models through the existing “thinking-message” path used for deepseek-reasoner: DeepSeek reasoning responses can include reasoning_content, and that content may need to be replayed on later tool-call turns.

That one detail is the whole ballgame. A coding agent is not asking a model for one answer and walking away. It is managing a loop: inspect files, reason, call a tool, receive output, update state, edit, test, retry, explain. If the runtime drops provider-specific reasoning metadata between those steps, the next turn may be less coherent or fail in ways that look like “the model got dumb.” The model did not necessarily get dumb. The runtime forgot part of the conversation contract.

PR #11027 also keeps deepseek-chat as the default instead of migrating existing users. That is the right boring decision. Model defaults are production behavior. Changing them because the new option is shinier is how teams get surprise latency, cost, or tool-calling regressions. The PR says validation covered npx biome check, root npx tsc --noEmit, and webview-ui npx tsc --noEmit, while noting that no live DeepSeek smoke test ran because no API key was available. That caveat matters. Static integration and live provider behavior are not the same test.

Enterprise gateways make routing harder, not easier.

The SAP AI Core work is the enterprise version of the same problem. PR #11032 adds gpt-5.5 and nova-core to SAP AI Core. Its review summary registers GPT-5.5 with a 1.05M-token context window and 128K max output, plus image and prompt-cache support. nova-core is listed with 8,192 max output and 128K context.

Those numbers are not trivia. They are routing policy inputs. A team choosing between a cheap fast model, a long-context reasoning model, and an enterprise-routed model needs the agent UI and scheduler to know more than “this model exists.” It needs context size, output ceiling, cache behavior, image support, reasoning behavior, and price. Otherwise the agent will happily send a repo-sized task to the wrong backend or hide a cost cliff behind an innocent model selector.

Cline also strips max_tokens and temperature for reasoning-model payloads in the SAP AI Core path, matching existing GPT-5-series handling. Again: not glamour work, but correctness work. Provider-compatible endpoints are often compatible in the way hotel power adapters are compatible: the plug fits, but you still need to know the voltage. OpenAI-compatible routes, SAP’s gateway conventions, and reasoning-model payload rules all have to line up. If they do not, users experience it as a flaky agent.

The Vertex fix for Claude models points in the same direction. Many teams route model traffic through Vertex not because developers enjoy extra indirection, but because procurement, billing, auditability, region policy, or platform governance already live there. A broken global endpoint is not a cosmetic issue for those teams. It means the approved control plane cannot run the model the team selected. In enterprise agent deployments, provider routing is part technical integration, part compliance boundary.

The “best coding agent” comparison is becoming an operations review.

This is why the usual benchmark framing around coding agents is getting stale. Benchmarks can tell you something about raw model capability. They do not tell you whether the runtime preserves reasoning metadata across tool calls, whether it exposes token pricing accurately, whether cloud-hosted provider variants behave like direct APIs, whether OAuth and regional endpoint choices survive real enterprise networks, or whether the error message tells an engineer what actually broke.

The Reddit-adjacent signal from the research brief is small but useful: a practitioner complaint in r/openclaw described an “unknown model” error for DeepSeek V4 Flash even though the model appeared in a list and V3.2 worked. That was not a Cline thread, but it captures the failure mode perfectly. Model support is not a list. A model has to exist in the UI, map to the provider’s exact ID, use the right request path, preserve the right response fields, apply the right tool-calling format, and fail with an error that names the broken boundary.

Poolside Laguna moving to next-generation prompts and native tool calling is another tell. Agent runtimes are migrating away from generic prompt wrappers wherever providers expose native capabilities. That can improve reliability: native tool schemas are usually easier to validate than “please call this tool in JSON” prompt folklore. But it also multiplies the integration surface. Each provider-native path needs its own prompt template, schema handling, continuation behavior, retry logic, and logging. This is where agent products either grow up or become a pile of special cases nobody wants to debug.

The /lg-task URI webhook integration is less central, but it fits the operating-surface pattern. Coding agents are being wired into dashboards, task systems, local clients, hosted runners, and review workflows. The agent is no longer just a chat pane in an editor. It is a participant in a broader workflow graph. Once that happens, the runtime has to be explicit about handoffs: which task launched the session, which model route handled it, which tools ran, what it cost, and what state must be replayed if the session resumes elsewhere.

What engineers should do before flipping the switch

If you use Cline, v3.85.0 is worth testing, but do not treat a provider bump like a theme update. Pick a small repo and run a controlled task that requires at least one tool call, one file edit, one retry, and one follow-up question. Watch whether DeepSeek V4 reasoning/tool-call state behaves correctly across turns. If you have access to SAP AI Core, verify GPT-5.5 through that route with a long-context task and confirm the UI reflects the expected context, output, cache, and image-support metadata. If your team routes Claude through Vertex, test the fixed global endpoint in the region and project configuration you actually use.

Also inspect cost and failure behavior. Does the model selector show enough information for developers to choose intelligently? Do errors distinguish an unknown model ID from a provider route failure, an auth failure, a payload-format problem, or an unsupported tool-call mode? Can you capture the provider ID, route, payload family, and session ID in a bug report without scraping console logs? These questions sound pedestrian. They are the difference between an agent stack you can operate and an agent stack you merely hope works.

The practical policy is simple: separate “model is available” from “model is approved for autonomous coding work.” A new model can be allowed for chat or low-risk edits before it is trusted for repo-wide refactors, dependency upgrades, or production incident fixes. For higher-autonomy use, require a smoke test suite that includes tool calls, large-context input, retry/resume behavior, pricing display, and at least one intentional provider error. If your team is serious about agentic coding, model onboarding should look more like adding a new payment processor than adding a new font.

Cline v3.85.0 is incremental, but it is incremental in the direction that matters. The coding-agent race is not just about who lists the newest frontier model first. It is about who routes models correctly, preserves their semantics, exposes their costs and limits, respects enterprise gateways, and gives engineers enough surface area to debug the inevitable weirdness. A dropdown can advertise support. A runtime has to earn it.

Sources: Cline v3.85.0 release, PR #11027 — DeepSeek V4 models, PR #11032 — GPT-5.5 / SAP AI Core provider support, Cline repository.

Adding a model is easy. Preserving its semantics is the work.

Enterprise gateways make routing harder, not easier.

The “best coding agent” comparison is becoming an operations review.

What engineers should do before flipping the switch

Sign up for more like this.