Grok Build’s Enterprise Docs Make the Coding Agent Race About Policy, Not Just Prompts

Anatoliy Kolodkin

29 May 2026 • 5 min read

Enterprise coding agents are not going to win because they autocomplete faster. They are going to win because security teams can let them touch real repositories without turning every laptop into an unreviewed production system. That is why xAI’s refreshed Grok Build enterprise deployment docs matter: they move the story from “Grok has a CLI” to “Grok has a policy surface.”

That sounds boring. Good. Boring is what happens when a powerful tool graduates from demo theater into managed infrastructure. A coding agent that can read source, edit files, run shell commands, call MCP tools, operate headlessly, and inherit repo instructions is not an assistant in the old chat-window sense. It is an actor on the developer workstation. Actors need identity, policy, sandboxing, auditability, network boundaries, and an administrator with the authority to say no.

xAI’s docs now describe that deployment model in unusually concrete terms. Core Grok Build functionality requires allowlisting cli-chat-proxy.grok.com for inference and settings plus auth.x.ai for OAuth2/OIDC authentication. Enterprise OIDC adds the customer’s identity-provider domain — for example login.microsoftonline.com. Optional hosts cover direct API-key usage, remote session sync, UI assets, binary downloads, and fallback CDN delivery. All connections use HTTPS on port 443 with TLS 1.2 or 1.3 enforced by rustls, and there is no “disable TLS” escape hatch. Proxy support uses the standard HTTPS_PROXY, HTTP_PROXY, and NO_PROXY variables, with xAI warning that SSE inference chunks can idle for 600 seconds, so enterprise proxies need at least ten-minute idle timeouts.
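A network team could turn those requirements into a quick pre-flight check. The sketch below is illustrative Python, not an xAI tool: the function name and its inputs are hypothetical. Only the two core hostnames, port 443, and the 600-second idle figure come from the docs as described above.

```python
# Hypothetical pre-flight check; the hostnames and the 600 s figure are
# the ones quoted above, everything else is an assumption for illustration.
REQUIRED_HOSTS = ("cli-chat-proxy.grok.com", "auth.x.ai")
MIN_PROXY_IDLE_SECONDS = 600  # SSE inference chunks can idle this long


def check_egress(allowlist, proxy_idle_seconds, idp_host=None):
    """Return a list of human-readable problems; an empty list means OK."""
    allowed = {host.lower() for host in allowlist}
    required = list(REQUIRED_HOSTS)
    if idp_host:  # enterprise OIDC adds the customer's IdP domain
        required.append(idp_host)

    problems = [
        f"missing allowlist entry: {host}:443"
        for host in required
        if host.lower() not in allowed
    ]
    if proxy_idle_seconds < MIN_PROXY_IDLE_SECONDS:
        problems.append(
            f"proxy idle timeout {proxy_idle_seconds}s is below "
            f"{MIN_PROXY_IDLE_SECONDS}s"
        )
    return problems


if __name__ == "__main__":
    for problem in check_egress(["auth.x.ai"], 300, "login.microsoftonline.com"):
        print(problem)
```

The check only compares declared settings. It does not prove the proxy honors them, so a long-running streamed request through the real proxy is still worth testing.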

The real product is the config stack

The most important detail is Grok Build’s five-layer configuration model. From lowest to highest priority, it loads /etc/grok/managed_config.toml, ~/.grok/managed_config.toml, ~/.grok/config.toml, ~/.grok/requirements.toml, and finally /etc/grok/requirements.toml. Settings in requirements.toml are pinned: they cannot be overridden by user config, environment variables, remote settings, or lower-priority layers. xAI explicitly recommends the system-level requirements file for MDM, golden images, compliance-critical policies, sandbox enforcement, tool restrictions, telemetry posture, and feature flags.
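To make the precedence concrete, here is a minimal Python model of the layering and pinning behavior described above. It is a sketch under stated assumptions, not Grok Build's loader: the setting names (`sandbox`, `theme`) are hypothetical, each file is represented as an already-parsed dictionary, and environment overrides are modeled as applying to any key that is not pinned.

```python
# Layer order is taken from the docs as summarized above, lowest priority
# first. The boolean marks requirements files, whose keys become pinned.
LAYERS = [
    ("/etc/grok/managed_config.toml", False),
    ("~/.grok/managed_config.toml", False),
    ("~/.grok/config.toml", False),
    ("~/.grok/requirements.toml", True),
    ("/etc/grok/requirements.toml", True),
]


def resolve(layer_settings, env_overrides=None):
    """Merge layers low-to-high; return (effective settings, pinned keys).

    layer_settings maps a file path to a dict of settings (hypothetical
    keys). env_overrides models environment variables, which are assumed
    here to lose against anything pinned by a requirements file.
    """
    effective = {}
    pinned = set()
    for path, is_requirements in LAYERS:
        for key, value in layer_settings.get(path, {}).items():
            effective[key] = value  # later (higher) layers win
            if is_requirements:
                pinned.add(key)
    for key, value in (env_overrides or {}).items():
        if key not in pinned:  # pinned keys cannot be overridden
            effective[key] = value
    return effective, pinned


if __name__ == "__main__":
    layers = {
        "~/.grok/config.toml": {"sandbox": "off", "theme": "dark"},
        "/etc/grok/requirements.toml": {"sandbox": "strict"},
    }
    print(resolve(layers, env_overrides={"sandbox": "off", "theme": "light"}))
```

In this model the user's `sandbox = "off"` and the matching environment override both lose to the system requirements file, while the unpinned `theme` stays adjustable.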

That is the shape enterprise buyers need. A local coding agent cannot be governed only by repo conventions and developer preference. If a user can flip a setting because the agent wants a smoother onboarding flow, the policy is decoration. A fail-closed system-level requirements file gives platform teams somewhere to encode the decisions that should survive enthusiasm: which tools are allowed, which sandbox profile is mandatory, whether marketplaces are restricted, and whether unattended runs must deny everything not explicitly permitted.

xAI also makes a compatibility play that deserves both credit and skepticism. Grok can read a subset of Claude Code managed settings, including permission rules, MCP server allowlists, telemetry and feedback flags, and marketplace restrictions. For mixed-agent organizations, that reduces migration friction. If a company has already pushed Claude policies through MDM, Grok can inherit some of the operational work rather than forcing a greenfield policy rollout.

But compatibility is not compliance. A rule reviewed for one runtime may not mean the same thing inside another runtime with different tool names, sandbox semantics, approval modes, network behavior, and model behavior. Treat imported Claude settings as scaffolding, not proof. The right rollout is to inspect Grok’s effective policy, diff it against the intended policy, and test dangerous workflows before anyone declares parity. Agent configuration is becoming a supply chain. “It reads the same file” is not the same as “it has the same blast radius.”

Sandboxing is useful, but OS details still matter

Grok Build’s sandbox profiles are practical: off, workspace, devbox, read-only, and strict. Linux uses Landlock; macOS uses Seatbelt. The workspace profile can read broadly but write only to the current working directory, /tmp, and ~/.grok/. Strict mode limits reads to the current working directory and system paths, while allowing writes only to the working directory, temporary storage, and Grok’s own state. Sensitive directories — ~/.ssh, ~/.gnupg, ~/.grok/auth, ~/.aws, ~/.config/gcloud, and ~/.azure — are always write-protected regardless of profile.
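The write rules for the workspace profile can be modeled in a few lines of Python. This is only a sketch of the policy as described above, not of the enforcement: the real checks are done by Landlock or Seatbelt in the kernel, and the function names here are hypothetical. It assumes absolute, already-normalized POSIX paths and does not resolve symlinks or `..` segments.

```python
from pathlib import PurePosixPath

# Directories described above as write-protected regardless of profile,
# given relative to the user's home directory.
PROTECTED = (".ssh", ".gnupg", ".grok/auth", ".aws", ".config/gcloud", ".azure")


def _inside(path, root):
    """True if path is root itself or lies somewhere beneath it."""
    return path == root or root in path.parents


def workspace_may_write(target, cwd, home):
    """Model the workspace profile: writes go to cwd, /tmp, and ~/.grok/."""
    target = PurePosixPath(target)
    cwd = PurePosixPath(cwd)
    home = PurePosixPath(home)

    # Protected directories win even when they sit inside a writable root,
    # as ~/.grok/auth does inside ~/.grok/.
    if any(_inside(target, home / rel) for rel in PROTECTED):
        return False

    writable_roots = (cwd, PurePosixPath("/tmp"), home / ".grok")
    return any(_inside(target, root) for root in writable_roots)


if __name__ == "__main__":
    for path in ("/home/dev/proj/src/main.rs", "/home/dev/.grok/auth/token"):
        print(path, workspace_may_write(path, "/home/dev/proj", "/home/dev"))
```

The ordering is the point of the example: the protected-directory check runs first, so launching the agent from the home directory does not make `~/.ssh` writable.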

The catch is exactly the sort of catch practitioners need to notice: child-process network blocking in read-only and strict uses Linux seccomp BPF and is not currently enforced on macOS. That does not make the sandbox useless. It does mean “strict” is not a single cross-platform guarantee. If you are reviewing untrusted repositories, running headless agents in CI, or testing workflows with secrets nearby, a managed Linux devbox may be a higher-assurance control point than a fleet of developer Macs with uneven local state.

The permission model sits beside the sandbox rather than replacing it. Grok checks PreToolUse hooks, policy rules, built-in fast paths, and then prompt policy. Deny rules beat allow rules. Fine-grained filters can target Bash(git *), Edit(**/*.rs), Read, Grep, MCPTool(my-server__*), and WebFetch. Enterprise modes include dontAsk, which silently denies anything without an explicit allow rule, and acceptEdits, which auto-approves file edits but prompts for shell commands.

That separation matters. Permissions decide what the model is allowed to request. The sandbox limits what the process can actually do even if a request is approved. For headless jobs, CI review bots, and high-security repos, dontAsk plus narrow allow rules plus --sandbox strict is the sane starting posture. always-approve remains the footgun: useful in constrained automation, reckless on a broad workstation unless explicit deny rules and a real sandbox are already in place. xAI’s docs note that dangerous commands such as rm, chmod, chown, kill, and git push always prompt in ask mode, but are auto-approved in always-approve unless explicitly denied. That sentence should be printed on the rollout checklist.
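The rule precedence is easier to review when written out. The Python sketch below models only the policy-rule and mode steps described above (deny beats allow, dontAsk denies by default, acceptEdits approves edits); it leaves out PreToolUse hooks and built-in fast paths. The function names are hypothetical, and `fnmatch` is a stand-in for whatever glob semantics Grok Build actually uses, so pattern edge cases may differ.

```python
import fnmatch
import re

# A rule is either a bare tool name ("Read") or Tool(pattern),
# for example "Bash(git *)" or "Edit(**/*.rs)".
RULE = re.compile(r"^(\w+)(?:\((.*)\))?$")


def matches(rule, tool, arg):
    """True if the rule names this tool and its pattern (if any) fits arg."""
    parsed = RULE.match(rule)
    if not parsed or parsed.group(1) != tool:
        return False
    pattern = parsed.group(2)
    return pattern is None or fnmatch.fnmatchcase(arg, pattern)


def decide(tool, arg, allow, deny, mode="ask"):
    """Return "allow", "deny", or "prompt" for one tool request."""
    if any(matches(rule, tool, arg) for rule in deny):
        return "deny"  # deny rules beat allow rules
    if any(matches(rule, tool, arg) for rule in allow):
        return "allow"
    if mode == "dontAsk":
        return "deny"  # no explicit allow rule means silent denial
    if mode == "acceptEdits" and tool == "Edit":
        return "allow"  # file edits auto-approved, shell still prompts
    return "prompt"


if __name__ == "__main__":
    allow, deny = ["Bash(git *)"], ["Bash(git push*)"]
    for command in ("git status", "git push origin main", "rm -rf build"):
        print(command, "->", decide("Bash", command, allow, deny, mode="dontAsk"))
```

Under dontAsk, `git status` is allowed by the explicit rule, `git push origin main` is denied because the deny rule wins, and `rm -rf build` is denied because nothing allows it.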

Identity and data retention are now deployment decisions

Authentication is similarly enterprise-shaped. Grok Build supports browser OIDC, device-code login for SSH and containers, external auth-provider commands for corporate token brokers, and API keys for CI/CD. Enterprise OIDC supports providers such as Entra ID, Okta, and Auth0 using PKCE and refresh-token grants. Credential resolution is explicit: model-specific API key, model environment key, active session token, then XAI_API_KEY. External auth commands can return a bare token or JSON with access_token, optional refresh_token, and optional expires_in; background refresh has a ten-second timeout and kills hanging commands.
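For teams writing a corporate token broker, the two pieces worth getting right are the output format and the resolution order. The Python sketch below illustrates both as described above. The function and parameter names are hypothetical, and treating any output that starts with `{` as JSON is an assumption made for this example.

```python
import json


def parse_auth_output(stdout):
    """Accept a bare token or JSON with access_token and optional fields."""
    text = stdout.strip()
    if not text:
        raise ValueError("auth command produced no output")
    if text.startswith("{"):  # assumption: JSON output is a single object
        data = json.loads(text)
        return {
            "access_token": data["access_token"],  # required
            "refresh_token": data.get("refresh_token"),  # optional
            "expires_in": data.get("expires_in"),  # optional
        }
    return {"access_token": text, "refresh_token": None, "expires_in": None}


def resolve_credential(model_api_key=None, model_env_key=None,
                       session_token=None, xai_api_key=None):
    """Return (source, value) for the first credential present, else None."""
    candidates = (
        ("model-specific API key", model_api_key),
        ("model environment key", model_env_key),
        ("active session token", session_token),
        ("XAI_API_KEY", xai_api_key),
    )
    for source, value in candidates:
        if value:
            return source, value
    return None


if __name__ == "__main__":
    print(parse_auth_output('{"access_token": "abc", "expires_in": 3600}'))
    print(resolve_credential(session_token="sess", xai_api_key="key"))
```

A broker script tested this way should also be run under a hard time limit, since a command that hangs past the ten-second refresh window is killed. In Python, `subprocess.run(..., timeout=10)` is one way to reproduce that condition in a test.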

For data lifecycle, xAI describes six phases: local input assembly, TLS transport, inference proxy/model call, local tool execution, streamed response, and session end. For zero-data-retention organizations, xAI says prompts, code, and responses are not persisted at the inference layer; local session history still remains under ~/.grok/. That distinction is important. ZDR reduces provider-side retention risk. It does not eliminate local endpoint governance, session-history handling, repo-level secrets hygiene, or logging questions.

The practitioner move is straightforward. Before piloting Grok Build, define managed policy centrally. Separate interactive developer usage from CI and headless usage. Pin sandbox profiles by environment. Use dontAsk for unattended work. Deny broad shell execution by default. Allowlist MCP servers by name, not vibes. Add explicit denials for destructive commands if any auto-approval mode is allowed. Test proxy behavior with long SSE responses. Document which data is local, which data is sent to xAI, which data is retained under ZDR, and where local session history lives.

The coding-agent race is usually narrated as a model contest: Claude vs. Codex vs. Grok vs. whatever shows up next week with a benchmark chart and a confident launch post. That misses the enterprise buyer’s actual question. The question is not “can this agent write a decent patch?” Increasingly, several can. The question is “can we govern it like infrastructure?” xAI’s enterprise docs are a credible step toward that answer. Model quality gets the demo. Managed config, sandboxing, identity, and permission rules get the deployment.

Sources: xAI Docs — Enterprise Deployments, Claude Code settings docs, OpenAI Codex CLI docs, xAI Grok Build headless scripting docs
