Agentic Coding: Best Practices

Agentic coding without guardrails is like a race car without brakes: fast, dangerous, and eventually catastrophic. Here's how to move up the autonomy ladder without lowering your quality bar.

Last Updated: April 5, 2026

The Core Principle

As we move toward more autonomous AI, more asynchronous agent loops, more "come back later and review the result" styles of working, we need a proportional increase in guardrails. Not vibes. Not hopeful prompting. Actual checks. Deterministic constraints. Tight feedback loops.

The higher you climb on the agentic ladder, the harder you can crash. Guardrails are what let you go fast without going off the track.

Velocity, Not Just Speed

Speed lacks direction. Velocity is a vector.

Today's LLMs in engineering work:

  • Can produce a lot of code quickly ✓
  • Can't reliably maintain direction over time ✗
  • Will confidently drift, invent nonexistent methods, and misinterpret constraints ✗
  • Don't feel the cost of bad decisions ✗

Our job is steering: providing intent, constraints, architecture, and feedback. But when scaling autonomy, steering isn't enough. We need guardrails: boundaries the agent hits by design when it starts drifting, forcing correction without us babysitting every slip-up.

Guardrail 1: Real Continuous Integration

This isn't "we run unit tests on feature branches." Real CI means continuously integrating work from multiple developers (or agents!) working in parallel. Merging to main/trunk frequently so divergence stays small.

Before AI-assisted workflows, "feature branches should live 2-3 days max" was common advice. With agentic tooling, that's way too slow: the volume of change is higher, and so is the parallelism.

Today that means:

  • Git worktrees and branches measured in hours, not days
  • "Integrate first" mindset
  • Slicing work much smaller than the average "spec" or "PRD"
  • Using trunk-based development, parallel change, feature toggles deliberately

If you don't have this, the rest of the guardrails are duct tape on a cracked foundation.

Guardrail 2: Static Typing as First Defense

In a world where imperfect code is available for free, a compiler is your first line of defense. But static typing isn't enough — you need to actually leverage the type system.

The Trap: Programming in Primitives

A typical enterprise codebase is full of string, int, bool. Technically types. Practically, they don't encode your domain.

Classic failure mode:

upload(blobUri: string, documentName: string, ...)

An agent generates the call. It compiles. It looks fine. But it swaps documentName and blobUri. The compiler won't save you because both are strings.

The Fix: Domain Types

upload(blobUri: BlobUri, documentName: DocumentName, ...)

Now if the agent swaps them, the code won't compile. That's cheaper than writing a unit test.

Benefits:

  • Silly mistakes become compile errors
  • Reviews become easier — no mental simulation of what a string represents
  • Codebase becomes self-documenting
  • Fewer tests needed for structural correctness
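In TypeScript, which is structurally typed, domain types like these can be approximated with "branded" types. A minimal sketch, following the `BlobUri`/`DocumentName` example above (the `Brand` helper and the validation rules are illustrative assumptions, not a standard API):

```typescript
// Branded types: structurally distinct wrappers around string,
// so the compiler rejects an accidental argument swap.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

type BlobUri = Brand<string, "BlobUri">;
type DocumentName = Brand<string, "DocumentName">;

// Smart constructors are the only way to obtain a branded value,
// so validation happens exactly once, at the boundary.
function blobUri(raw: string): BlobUri {
  if (!raw.startsWith("https://")) throw new Error(`not a blob URI: ${raw}`);
  return raw as BlobUri;
}

function documentName(raw: string): DocumentName {
  if (raw.length === 0) throw new Error("empty document name");
  return raw as DocumentName;
}

function upload(uri: BlobUri, docName: DocumentName): string {
  return `uploading ${docName} to ${uri}`;
}

const uri = blobUri("https://storage.example.com/reports/q3.pdf");
const docName = documentName("q3.pdf");

upload(uri, docName);    // compiles
// upload(docName, uri); // compile error: the types are not interchangeable
```

The runtime cost is zero — the brand exists only in the type system — but the swapped-argument failure mode from above is now impossible.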

Guardrail 3: Linters and Deterministic Tools

Key principle: If there's a deterministic tool for the job, don't "prompt" the model to do the tool's work.

We have formatters, linters, static analyzers, vulnerability scanners, dependency checkers. They're better at their domains than an LLM approximating them in text.

Make Guardrails Cheap: Run on the Diff

Your agent loop should focus on validating changes, not re-validating the entire world:

  • Determine staged/changed files
  • Run lints/tests/analyzers only on those files
  • Fail fast with precise feedback

The loop needs to be tight: agent changes → tools validate → agent fixes → repeat until green → then you review.
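A diff-scoped check can be a small script. A sketch in Node/TypeScript, assuming a repo that uses ESLint (the extension list and the `npx eslint` invocation are illustrative; substitute your own tools):

```typescript
import { execSync } from "node:child_process";

// Pure helper: pick out the files a given checker cares about.
// Taking the file list as input keeps it trivially testable.
function filesToCheck(changed: string[], extensions: string[]): string[] {
  return changed.filter((f) => extensions.some((ext) => f.endsWith(ext)));
}

// List files staged for commit: added, copied, or modified only.
function stagedFiles(): string[] {
  const out = execSync("git diff --cached --name-only --diff-filter=ACM", {
    encoding: "utf8",
  });
  return out.split("\n").filter((line) => line.length > 0);
}

// Run the linter only on relevant staged files; fail fast.
function lintStaged(): void {
  const targets = filesToCheck(stagedFiles(), [".ts", ".tsx"]);
  if (targets.length === 0) return; // nothing to validate
  // eslint exits non-zero on findings, which fails the agent's step.
  execSync(`npx eslint ${targets.join(" ")}`, { stdio: "inherit" });
}
```

The same pattern works for tests: map changed files to the test files that cover them, and run only those in the inner loop, keeping the full suite for CI.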

Reduce Token Burn: Filter Tool Output

Tool output is optimized for humans. Agents ingest everything. Wrap your tools:

  • If tests pass: output "OK", exit code 0
  • If tests fail: output error summary and stack traces
  • Strip verbose logs unless they matter

This makes agent loops cheaper and corrections sharper.
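A wrapper can be as simple as a function that condenses raw runner output. A sketch — the line-matching patterns are heuristic and illustrative, not tied to any particular test runner:

```typescript
// Condense raw test-runner output into an agent-friendly summary:
// pass -> "OK"; fail -> only the failure lines and stack frames.
function summarizeTestOutput(raw: string, exitCode: number): string {
  if (exitCode === 0) return "OK";
  const interesting = raw
    .split("\n")
    .filter(
      (line) =>
        /FAIL|Error|expect|assert/i.test(line) || /^\s+at /.test(line)
    );
  return interesting.join("\n") || raw; // fall back to full output if nothing matched
}
```

Wire this between the test command and the agent (capture stdout/stderr and the exit code, emit only the summary), and the model stops burning tokens on startup banners and verbose logs.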

Shift Left: Pull Feedback into the Agent Loop

Put guardrails inside the agentic loop, not after PR submission. Everything that can be automated before human eyeballs meet the artifact should be automated.

Guardrail 4: Automated Architecture Checks

Don't rely on diagrams and review to enforce constraints that can be checked automatically.

If you have rules like:

  • "UI must not depend on data layer"
  • "Domain core must not reference database adapters"
  • "No cross-module access except through defined interfaces"
  • "We enforce hexagonal layering"

Encode them. Tools like ArchUnit (and equivalents) let you express architectural constraints as readable rules that run as part of your test suite.

This yields: checks running inside the agentic loop, violations becoming immediate blocking failures, less time policing structure in review.
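ArchUnit is a JVM tool, but the idea ports anywhere: extract the import edges from your codebase and check them against rules in an ordinary test. A hand-rolled sketch (the `Rule` shape, the paths, and the sample rule are all illustrative):

```typescript
// Architecture rules as data: files matching `from` must not import
// paths matching `to`. Run in the test suite; violations fail the build.
interface Rule {
  description: string;
  from: RegExp;
  to: RegExp;
}

interface Violation { file: string; importPath: string; rule: string }

function checkArchitecture(
  edges: Array<{ file: string; importPath: string }>,
  rules: Rule[]
): Violation[] {
  const violations: Violation[] = [];
  for (const { file, importPath } of edges) {
    for (const rule of rules) {
      if (rule.from.test(file) && rule.to.test(importPath)) {
        violations.push({ file, importPath, rule: rule.description });
      }
    }
  }
  return violations;
}

// "UI must not depend on data layer", expressed as a rule.
const archRules: Rule[] = [
  { description: "UI must not depend on data layer", from: /^src\/ui\//, to: /\/data\// },
];

checkArchitecture(
  [{ file: "src/ui/Page.ts", importPath: "src/data/repo" }],
  archRules
); // → one violation, which a test turns into a blocking failure
```

The edge list itself can come from your module graph (e.g., by parsing import statements in changed files), which keeps the check cheap enough to run inside the agent loop.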

Guardrail 5: High-Quality Automated Tests

A common trap: "Generate unit tests. Aim for 100% coverage. We can always regenerate them."

That's wrong. If tests change every time you refactor, they're not guarding behavior — they're guarding your current implementation. They become maintenance tax, not safety net.

The Rule: Tests Should Survive Refactoring

Tests should change when behavior changes, not when code structure changes. Prefer tests that encode scenarios/use cases over tests that mirror internals.

Think in Scenarios, Not Methods

❌ "Should I test this class?"
❌ "Should I test this method?"

✅ "What are the distinct scenarios?"
✅ "What different acceptance criteria matter?"
✅ "What behaviors must remain true?"

Write (or have the agent write) a small number of high-quality tests validating behaviors through public APIs.
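As a sketch of the difference: a scenario-style test talks only to the public API and asserts an observable outcome. The `Cart` class here is illustrative, not from any real codebase:

```typescript
// A small domain class with a public API and private internals.
class Cart {
  private items = new Map<string, number>();

  add(sku: string, quantity = 1): void {
    this.items.set(sku, (this.items.get(sku) ?? 0) + quantity);
  }

  totalQuantity(): number {
    let total = 0;
    for (const quantity of this.items.values()) total += quantity;
    return total;
  }
}

// Scenario: "adding the same product twice accumulates quantity."
// This survives refactoring (say, replacing the Map with an array)
// because it never touches internals — only observable behavior.
function scenario_addingSameProductAccumulates(): boolean {
  const cart = new Cart();
  cart.add("sku-1", 2);
  cart.add("sku-1", 3);
  return cart.totalQuantity() === 5;
}
```

Contrast with a test that asserts the `Map` contains a specific entry: it pins the implementation, so every refactor breaks it and the agent "fixes" the test instead of the code.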

Guardrail 6: Vulnerability Scanners

Enterprise teams have used tools like SonarQube and CodeScene for years. In a world where AI generates insecure patterns quickly, they move from "nice-to-have" to "you really want this."

MCP Servers and Agentic Access

It gets interesting when these tools expose MCP servers that let agents query them directly. Now an agentic loop can:

  1. Change code
  2. Run tests
  3. Run code health analysis
  4. Read reported issues
  5. Refactor
  6. Re-run analysis until improved

You're turning "quality tooling" into part of the closed-loop system.
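The loop above can be sketched as a driver with each capability injected, so any concrete backend fits — a test runner, an MCP-exposed analyzer, an agent doing the refactoring. Every interface here is an illustrative assumption, not a real MCP client API:

```typescript
// The closed loop, with capabilities injected for testability.
interface QualityLoop {
  runTests(): boolean;              // step 2: gate on a green build
  analyze(): string[];              // steps 3-4: reported issues, empty = clean
  refactor(issues: string[]): void; // step 5: attempt fixes
}

// Iterate until analysis is clean or the budget runs out — never loop forever.
function runQualityLoop(loop: QualityLoop, maxIterations: number): boolean {
  for (let i = 0; i < maxIterations; i++) {
    if (!loop.runTests()) return false; // broken build: stop, don't "improve"
    const issues = loop.analyze();
    if (issues.length === 0) return true; // clean: done
    loop.refactor(issues);
  }
  return false; // budget exhausted with issues remaining
}
```

The iteration cap matters: an agent that can't satisfy the analyzer should surface that as a failure for a human, not burn tokens indefinitely.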

The Meta-Pattern: Grow Incrementally

Don't design the perfect pipeline up front. A better approach is incremental and diagnostic:

  • Where are you burning tokens today?
  • Where does quality fail most often?
  • Dealing with regressions? → Invest in tests
  • Dealing with insecure code? → Invest in scanning
  • Dealing with architecture drift? → Invest in dependency checks
  • Dealing with messy diffs? → Invest in formatting/linting

Guardrails compound. Each one becomes reusable across features, repos, teams. You're building a library of deterministic skills and feedback loops.

An Underused Superpower: Hooks

Many agentic tools now support "hooks": events in the agent lifecycle where you attach custom scripts:

  • Session starts/ends
  • Before/after tool calls
  • Before writing files
  • Before committing

Hooks are powerful because they're deterministic. This isn't "prompt the agent to remember to run tests." It's "tests run because the workflow requires it."

Enforce with hooks:

  • "Don't write files unless formatting passes"
  • "Don't mark task done unless all tests pass"
  • "Always run dependency checks after feature development"
  • "Always run security scan before creating PR"

Combine hooks with git worktrees, sub-agents, and multi-step workflows to scale autonomy without relying on an LLM's good intentions.
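A hook body can be a small gate script. A Node/TypeScript sketch — the step commands are illustrative placeholders for your formatter and test runner, not a real tool's hook API:

```typescript
import { execSync } from "node:child_process";

// Run a shell command; true iff it exits zero.
function run(cmd: string): boolean {
  try {
    execSync(cmd, { stdio: "inherit" });
    return true;
  } catch {
    return false;
  }
}

// Run every step and report all failures, so the agent gets
// complete feedback in one pass instead of one error at a time.
function preCommitGate(steps: Array<{ name: string; cmd: string }>): string[] {
  return steps.filter((step) => !run(step.cmd)).map((step) => step.name);
}

// Wire into the workflow (e.g., a "before commit" hook):
//   const failed = preCommitGate([
//     { name: "format", cmd: "npx prettier --check ." },
//     { name: "tests",  cmd: "npm test" },
//   ]);
//   if (failed.length > 0) process.exit(1); // non-zero exit blocks the step
```

Because the hook exits non-zero on failure, the workflow stops deterministically — no prompting required.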

The Takeaway

Agentic coding today is unforgiving. Without guardrails, higher autonomy amplifies the cost of mistakes.

The practical rule:

More autonomy, more constraints, tighter loops, more deterministic validation.

Focus on guardrails that:

  • ✅ Are deterministic (tools over prompts)
  • ✅ Run on the diff (cheap feedback)
  • ✅ Reduce noise (signal-rich output)
  • ✅ Shift left (inside the agent loop)
  • ✅ Enforce structure (types + architecture checks)
  • ✅ Protect behavior (high-quality automated tests)
  • ✅ Catch security/quality issues early (scanners)
  • ✅ Are non-optional (hooks)

That's how you go faster without lowering standards, and without turning "productivity" into future cleanup work disguised as progress.

Further Reading