
Coding Agents Do Not Need More Speed. They Need a Verification Loop

Anatoliy Kolodkin

26 May 2026 • 4 min read

The uncomfortable truth about AI coding agents is that speed stopped being the impressive part. The demos already proved they can generate code faster than a human can review it. The unresolved question is whether a team can merge that code without turning software delivery into a trust fall with stack traces.

That is why Sonar’s Agent Centric Development Cycle, covered today by The New Stack, is more interesting than its acronym deserves. AC/DC (Guide, Generate, Verify, Solve) is not a new model, a new editor, or another “your developers will be 10x by Q3” pitch. It is a workflow argument: once agents can produce large change sets asynchronously, the engineering system around them has to produce evidence just as quickly.

The framework puts generation in the middle, but the useful work happens around it. “Guide” means agents need architecture rules, quality profiles, project conventions, security constraints, and business context before they touch the repo. “Verify” means deterministic checks run both while the agent is working and after it thinks it is done. “Solve” means findings feed remediation instead of becoming yet another dashboard nobody opens after sprint planning.
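To make the shape of that loop concrete, here is a minimal sketch in Python. It is not Sonar's implementation: `run_cycle`, `Finding`, the `no_todo` check, and the stub `fake_agent` are all hypothetical names invented for illustration, and the "agent" is a plain function standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Finding:
    check: str      # name of the deterministic check that fired
    message: str    # what it found


# Assumption: a check takes the candidate patch as text and returns findings.
# An empty list means the check passed.
Check = Callable[[str], List[Finding]]


def run_cycle(
    guidance: str,
    generate: Callable[[str, List[Finding]], str],
    checks: List[Check],
    max_rounds: int = 3,
) -> Tuple[str, int, List[Finding]]:
    """Guide -> Generate -> Verify -> Solve, until checks pass or rounds run out."""
    findings: List[Finding] = []
    patch = ""
    for round_no in range(1, max_rounds + 1):
        # Generate: the agent sees the guidance plus last round's findings (Solve).
        patch = generate(guidance, findings)
        # Verify: run every deterministic check against the new patch.
        findings = [f for check in checks for f in check(patch)]
        if not findings:
            return patch, round_no, []
    return patch, max_rounds, findings


# Hypothetical check: reject patches that still contain a TODO marker.
def no_todo(patch: str) -> List[Finding]:
    return [Finding("no_todo", "TODO left in patch")] if "TODO" in patch else []


# Stub agent: the first attempt leaves a TODO, the second fixes it once told.
def fake_agent(guidance: str, findings: List[Finding]) -> str:
    return "return x + 1" if findings else "return x  # TODO"


patch, rounds, remaining = run_cycle("follow project rules", fake_agent, [no_todo])
print(patch, rounds, remaining)
```

The design choice worth noticing is that findings go back into the generator instead of into a report. That is the difference between "Solve" and a dashboard.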

The bottleneck moved from writing code to proving code

The numbers behind the pitch are the real story. Sonar’s 2026 survey says 72% of developers who have tried AI use it daily, AI already accounts for 42% of committed code, and developers expect that share to reach 65% by 2027. The same survey reports that 64% of developers have started using autonomous AI agents and that the average team now juggles four AI coding tools.

That sounds like adoption momentum until you read the verification side of the ledger: 96% of developers do not fully trust AI-generated code, only 48% always check AI-assisted code before committing, and 38% say reviewing AI-generated code takes more effort than reviewing human-written code. That is not a productivity curve. That is a review crisis with a better marketing department.

Traditional CI was designed around human-shaped increments. A developer writes a patch, usually with some local knowledge, and automation checks the obvious failure modes. Agentic coding often flips the shape: a model reasons for a while, edits across multiple files, produces a patch that may be larger than a human would have attempted in one sitting, and leaves the reviewer to reconstruct intent after the fact. The failure mode is not simply “bad code.” It is code that arrives without enough provenance to review efficiently.

AC/DC is valuable because it treats that provenance as part of the delivery system. A serious agent-produced change should carry a plan, assumptions, files touched, tests run, static-analysis results, security findings, known uncertainty, and links to the original issue or production signal. If the agent cannot produce that evidence, it has not finished the job. It has merely created a diff.
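One way to picture that evidence is as a record a merge gate can inspect. The sketch below is an assumption, not a published AC/DC schema: the `ChangeEvidence` class, its field names, and the choice of which fields count as required are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChangeEvidence:
    """Evidence an agent-produced change carries alongside its diff."""
    plan: str = ""
    assumptions: List[str] = field(default_factory=list)
    files_touched: List[str] = field(default_factory=list)
    tests_run: List[str] = field(default_factory=list)
    static_analysis: str = ""      # e.g. a summary or a link to the report
    security_findings: str = ""    # "none found" is still an answer
    known_uncertainty: str = ""
    issue_link: str = ""


# Assumption: these fields must be non-empty before a change counts as finished.
REQUIRED = (
    "plan",
    "files_touched",
    "tests_run",
    "static_analysis",
    "security_findings",
    "issue_link",
)


def missing_evidence(evidence: ChangeEvidence) -> List[str]:
    """Return the names of required fields that are still empty."""
    return [name for name in REQUIRED if not getattr(evidence, name)]


# A change that is "merely a diff": files were touched, nothing else is recorded.
bare_diff = ChangeEvidence(files_touched=["billing/invoice.py"])
print(missing_evidence(bare_diff))
```

A gate built on something like `missing_evidence` does not judge whether the code is good. It only refuses to start the review until the agent has said what it did and how it checked.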

Clean code is now an agent infrastructure optimization

The sleeper detail in the research is Sonar’s controlled study of matched repository pairs. Agents operating in higher-quality codebases used about 7% fewer input tokens, 8% fewer output tokens, 11% less reasoning effort, and re-read files 34% less often. That is the sort of statistic engineering leaders should care about because it turns “maintainability” from a moral preference into an operating cost.
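A back-of-the-envelope calculation shows how those percentages translate into spend. Only the 7% and 8% reductions come from the study as quoted above; the token volumes and per-token prices below are hypothetical placeholders, not real vendor pricing, and the sketch ignores the reasoning-effort and re-read figures.

```python
# Reductions reported for higher-quality codebases (approximate, from the study).
INPUT_REDUCTION = 0.07
OUTPUT_REDUCTION = 0.08


def run_cost(input_tokens: float, output_tokens: float,
             usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Cost in USD for a given token volume at per-million-token prices."""
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000


def clean_repo_cost(input_tokens: float, output_tokens: float,
                    usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Same workload, with the reported token reductions applied."""
    return run_cost(
        input_tokens * (1 - INPUT_REDUCTION),
        output_tokens * (1 - OUTPUT_REDUCTION),
        usd_per_m_input,
        usd_per_m_output,
    )


# Hypothetical workload and prices, chosen only to make the arithmetic visible.
workload = dict(input_tokens=1_000_000, output_tokens=100_000,
                usd_per_m_input=3.0, usd_per_m_output=15.0)

baseline = run_cost(**workload)
clean = clean_repo_cost(**workload)
print(f"baseline ${baseline:.2f}, clean repo ${clean:.2f}")
# prints: baseline $4.50, clean repo $4.17
```

Under these made-up numbers the saving is roughly 7% per unit of work. The percentage is modest, but it applies to every agent run against the repo.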

Clean code was already easier for humans to change. Now it is also cheaper and less confusing for agents to navigate. A repo with consistent naming, clear boundaries, low duplication, and useful tests is not just nicer. It reduces context churn. It gives the model fewer branches to hallucinate through. It makes retrieval more precise. It lowers the probability that a generated fix lands three directories away from the real abstraction.

The reverse is also true. If teams let AI-generated code accumulate complexity, every future agent run starts from a worse map. The model spends more tokens re-reading files, more reasoning effort inferring intent, and more tool calls exploring dead ends. Technical debt becomes agent tax. The invoice arrives as latency, spend, flaky patches, and reviewer fatigue.

This is an under-discussed consequence of agent adoption. Teams are not only deciding whether to use AI coding tools. They are deciding whether their codebase is legible enough for those tools to operate safely. If the answer is no, buying more agent seats may simply automate confusion.

Vibe coding is fine. Vibe merging is not.

The current argument around “vibe coding” is usually framed as taste: real engineers versus prompt jockeys, craft versus chaos, discipline versus speed. That framing is too cute. The problem is not using AI by feel during exploration. Good developers have always sketched, spiked, and thrown code away. The problem is when exploratory output crosses into production without a verification loop strong enough to compensate for how it was created.

For individual engineers, the practical workflow is straightforward. Do not ask an agent to “fix the bug” and then review a cold diff. Ask it to explain the failure, propose a plan, list the files it expects to touch, identify the smallest meaningful tests, and state what evidence would convince it the fix worked. Then make the agent run those checks before it opens the patch. Review the result against the plan, not against vibes.
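"Review the result against the plan" can be partly mechanical. The following sketch assumes the agent's plan and its actual run are available as simple lists; the function name `compare_to_plan` and the example paths and test names are hypothetical.

```python
from typing import Dict, Iterable, List


def compare_to_plan(
    planned_files: Iterable[str],
    touched_files: Iterable[str],
    planned_tests: Iterable[str],
    tests_run: Iterable[str],
) -> Dict[str, List[str]]:
    """Report where the delivered patch drifted from the stated plan."""
    planned, touched = set(planned_files), set(touched_files)
    return {
        # Files edited that the plan never mentioned: review these first.
        "unplanned_files": sorted(touched - planned),
        # Files the plan promised to change but the patch left alone.
        "untouched_planned_files": sorted(planned - touched),
        # Tests the agent said would prove the fix but did not run.
        "tests_not_run": sorted(set(planned_tests) - set(tests_run)),
    }


drift = compare_to_plan(
    planned_files=["cart/totals.py", "tests/test_totals.py"],
    touched_files=["cart/totals.py", "cart/discounts.py"],
    planned_tests=["test_rounding", "test_empty_cart"],
    tests_run=["test_rounding"],
)
print(drift)
```

An empty report does not prove the fix is right. A non-empty one tells the reviewer exactly where the agent's account and its diff disagree.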

For teams, the bar should be higher. Approved agent contexts should be documented. Generated code should run through the same or stricter gates as human code. Agent runs should be attributable by tool, model, repo, user, and outcome. Prompt traces and tool logs should survive long enough to debug incidents. Repeated verification failures should update the guidance given to agents, not become folklore in code review comments.
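As a rough illustration of attributable runs, the sketch below records each run with the five attributes named above and surfaces checks that keep failing, which is the signal that guidance needs updating. The `AgentRun` class, the `failed_checks` field, the threshold, and all example values are assumptions made for this sketch.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AgentRun:
    tool: str
    model: str
    repo: str
    user: str
    outcome: str                                   # e.g. "merged", "rejected"
    failed_checks: List[str] = field(default_factory=list)


def to_log_line(run: AgentRun) -> str:
    """Serialize one run as a JSON line suitable for an append-only log."""
    return json.dumps(asdict(run), sort_keys=True)


def recurring_failures(runs: List[AgentRun], threshold: int = 2) -> List[str]:
    """Checks that failed in at least `threshold` separate runs."""
    counts = Counter(check for run in runs for check in set(run.failed_checks))
    return sorted(check for check, n in counts.items() if n >= threshold)


# Placeholder values; none of these name a real tool, model, or repository.
runs = [
    AgentRun("tool-a", "model-x", "payments", "dana", "rejected",
             ["sql_injection", "missing_tests"]),
    AgentRun("tool-b", "model-y", "payments", "lee", "merged"),
    AgentRun("tool-a", "model-x", "payments", "dana", "rejected",
             ["sql_injection"]),
]

print(to_log_line(runs[1]))
print(recurring_failures(runs))
```

Here `sql_injection` fails in two runs, so it is a candidate for a new rule in the agent's guidance instead of a third code review comment.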

The most important shift is cultural: agents should not be measured only by lines generated or tickets closed. They should be measured by mergeable work with evidence. A patch that takes ten seconds to write and two hours to understand is not productivity. It is deferred labor with syntax highlighting.

AC/DC is not perfect, and the name will make at least one architect put their head in their hands. But the underlying point is correct. The teams that win with coding agents will not be the ones producing the most code. They will be the ones that make generated code boring to approve: guided by project rules, checked by deterministic systems, explained in reviewable artifacts, and remediated in a closed loop.

That is less glamorous than a model launch. It is also what separates agentic development from automated technical debt.

Sources: The New Stack, Sonar, Sonar verification-gap survey, Sonar LLM code-quality leaderboard
