Anthropic Releases Safer Claude Code 'Auto Mode' to Avoid Mass File Deletions and Other AI Snafus

Anthropic's autonomous safety layer for Claude Code just got its most visible public rollout yet. Auto mode, now in research preview on the Team plan, places an AI classifier in front of every tool call — evaluating each action against a safety policy before execution proceeds. Mass file deletions, sensitive data exfiltration, malicious code execution, and prompt injection are among the categories that trigger a block. Safe operations go straight through; risky ones are rejected and Claude reroutes to an alternative approach. If Claude keeps running into walls, it eventually surfaces a human approval prompt rather than grinding to a halt.
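The gating flow described above can be sketched in a few lines. This is an illustrative sketch only, not Anthropic's actual implementation or API: the names (`ToolCall`, `classify`, `gate`), the category strings, and the retry limit are all assumptions made for demonstration.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    SAFE = "safe"
    RISKY = "risky"

# Hypothetical block list mirroring the categories named in the article.
BLOCKED_CATEGORIES = {
    "mass_file_deletion",
    "sensitive_data_exfiltration",
    "malicious_code_execution",
    "prompt_injection",
}

@dataclass
class ToolCall:
    name: str
    category: str  # category the (real) AI classifier would assign

def classify(call: ToolCall) -> Verdict:
    # Stand-in for the AI classifier: flag calls whose assigned
    # category appears on the safety policy's block list.
    return Verdict.RISKY if call.category in BLOCKED_CATEGORIES else Verdict.SAFE

def gate(call: ToolCall, rejections_so_far: int, max_rejections: int = 3) -> str:
    """Decide what happens to a pending tool call."""
    if classify(call) is Verdict.SAFE:
        return "execute"    # safe operations go straight through
    if rejections_so_far < max_rejections:
        return "reroute"    # blocked: Claude tries an alternative approach
    return "ask_human"      # repeated blocks surface a human approval prompt
```

The key design point the article highlights is the escalation path: rejection alone would stall the agent, so repeated blocks fall through to a human prompt instead.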

Engadget's coverage frames it well: this is no longer just a developer story. The "avoid mass file deletions" headline speaks directly to teams that have been burned by runaway agentic AI loops, and that framing is likely to accelerate enterprise adoption conversations. The unresolved tension is transparency — Anthropic hasn't published the classifier's exact ruleset or its false-positive and false-negative rates, which developers will need to reason about before deploying Claude Code in high-stakes autonomous workflows. Enterprise and API rollout is imminent.

Read the full article at Engadget →