Under the Hood: Security Architecture of GitHub Agentic Workflows

Under the Hood: Security Architecture of GitHub Agentic Workflows

Running an AI agent on a developer laptop is one problem. Running one inside a CI/CD pipeline — where it consumes untrusted inputs from issues and pull requests, reasons over repository state, and makes consequential decisions without a human watching in real time — is a fundamentally different threat model. GitHub's engineering team has published a detailed look at how they're solving it inside GitHub Agentic Workflows, and the architecture decisions are directly applicable to any team designing their own agentic infrastructure.

The core problem GitHub calls the "trust-domain" problem: agents must process content from the open internet and from arbitrary contributors, then act on that content in privileged environments. Their solution is layered. Isolated execution sandboxes separate the agent runtime from secrets and MCP servers. Constrained output channels ensure agents can only write to scoped commit targets, not arbitrary files or environment variables. Granular permission policies are defined at the workflow level, not the agent level — so the blast radius of any single agent is bounded by design. And immutable audit logs capture every agent decision, enabling post-hoc review of anything the agent did and why.

The post also covers prompt injection mitigations and walks through a threat model for evaluating whether a given workflow is safe to automate. As coding agents move deeper into production infrastructure, this kind of sandboxing and trust-boundary thinking stops being optional. GitHub's public breakdown is currently the most detailed available from any major platform, and it's the right starting point for any security review of an agentic pipeline.

Read the full article at GitHub Blog →