In June 2026, a supply-chain campaign pushed 314 malicious npm packages that specifically targeted the hooks and session-startup mechanisms of Claude Code and Codex. Install a poisoned dependency, and attacker code ran inside your agent's session, exfiltrating AWS credentials, SSH keys, and vault passwords. The agents were not "hacked." They were simply handed the whole computer, and someone walked through the open door.

This is the security story of the year, and the lesson is uncomfortable for the hype cycle: as agents get more capable, the thing that bites you is not capability. It is access. The fix is not a smarter model. It is less privilege.

The core idea: give an agent, and every tool it can call, the narrowest access the task needs and nothing more. Read-only over read-write. One directory over the whole disk. No network unless required. Scoped, short-lived credentials. An audit log of every action. Trust is a property of the boundaries, not of the model.

Why this is happening now

A 2024-era "agent" answered questions. A 2026 agent runs shell commands, installs dependencies, edits files, and chains hundreds of tool calls autonomously. Every one of those powers is also an attack surface. When the agent has broad system access, a single prompt injection, a poisoned package, or an ordinary model mistake has your entire machine as its blast radius.

The industry consensus that formed after the npm attack is blunt: stop giving agents the whole computer. Credential firewalls, outbound-only connections, and task isolation are now table stakes, not nice-to-haves. The interesting work has moved from "what can the agent do" to "what can it not do."

What least-privilege looks like for an MCP tool

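Most agent power flows through tools (increasingly, MCP tools), so that is where the boundaries belong. A least-privilege tool has most of these properties: read-only where possible, file access confined to an allowlisted root, calls scoped by role, outbound-only connections with secrets scrubbed, a pinned tool list, a log of every call, and human approval for writes and deploys.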

A worked example: a tool that cannot touch your machine

Concrete beats abstract. My Agent-Context tool gives an agent a dependency map of a codebase (what depends on a file before it is changed). It is also a deliberate least-privilege example, and it now ships as a one-click Claude Desktop extension.

So even if the model went haywire or someone injected a prompt mid-session, the worst this tool can do is read files inside one folder you chose. That is the whole point: the tool is powerful for its job and powerless for everything else.
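To make "read files inside one folder you chose" concrete, here is a minimal Python sketch of path confinement. It is not the Agent-Context implementation; the names `read_confined` and `PathEscapeError` are hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


class PathEscapeError(PermissionError):
    """Raised when a requested path resolves outside the allowed root."""


def read_confined(root: str, requested: str, max_chars: int = 1_000_000) -> str:
    """Read a text file only if it resolves inside `root` (illustrative helper)."""
    root_path = Path(root).resolve()
    # Resolve symlinks and ".." before checking, so neither can escape the root.
    # An absolute `requested` replaces the root in the join and fails the check.
    target = (root_path / requested).resolve()
    if not target.is_relative_to(root_path):
        raise PathEscapeError(f"{requested!r} resolves outside the allowed root")
    if not target.is_file():
        raise FileNotFoundError(requested)
    # Opened read-only; this function exposes no write or delete operation.
    with target.open("r", encoding="utf-8", errors="replace") as fh:
        return fh.read(max_chars)
```

The design choice that matters is resolving first and checking second: comparing raw strings would let `../` segments or a symlink inside the folder point somewhere else.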

The production version: a hardened gateway

For tools that genuinely need to act (run code, hit external services), the controls move up a layer into a mediation gateway. The one inside my SDLC engine enforces, on every single call: capability scoping (fail-closed for unknown roles), path-argument boundary enforcement against an allowlisted root, SHA-256 tool-drift pinning (refuse if a server's tool list changed), input and output sanitization with a quarantine path for high-risk responses, per-server circuit breakers and timeouts, a human-approval gate keyed by a stable call id, and an append-only forensic log of every decision. Paired with scoped, short-lived, signed credentials, an agent gets exactly the keys it needs, for exactly as long as it needs them, and every use is on the record.
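Two of those controls are small enough to sketch. The Python below is an illustration under assumptions, not the gateway's actual code: the role names, tool names, and the `GRANTS` table are hypothetical, and the digest is taken over a canonical JSON encoding of the tool list, which is one reasonable way to pin it.

```python
import hashlib
import json

# Hypothetical role-to-tool grants; any role or tool not listed is denied.
GRANTS = {
    "reviewer": {"read_file", "dependency_map"},
    "deployer": {"read_file", "run_tests"},
}


def is_allowed(role: str, tool: str) -> bool:
    """Fail closed: an unknown role gets an empty grant set, so every call is denied."""
    return tool in GRANTS.get(role, set())


def tool_list_digest(tools: list[dict]) -> str:
    """SHA-256 over a canonical JSON encoding of a server's tool list."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),  # order-independent
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_call(role: str, tool: str, tools_now: list[dict], pinned_digest: str) -> str:
    """Return "allow" or a denial reason for a single tool call."""
    # Drift check first: if the server changed its tools, nothing else is trusted.
    if tool_list_digest(tools_now) != pinned_digest:
        return "deny: tool list drifted from pinned digest"
    if not is_allowed(role, tool):
        return "deny: role not granted this tool"
    return "allow"
```

In use, the digest would be computed once when a server is reviewed and stored alongside its configuration; a changed description or a newly added tool then produces a different digest and every call is refused until someone re-reviews it.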

Read-only: default for tools that don't need to write
One root: not the whole filesystem; path-escape blocked
Audited: every call logged, dangerous ones human-gated
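The "audited" and "human-gated" pieces can be sketched together. This Python example is a simplified illustration, not the engine's code: the dangerous tool names are hypothetical, the log lives in memory, and hash-chaining the entries is my own addition to make tampering detectable rather than something the gateway description states.

```python
import hashlib
import json

DANGEROUS_TOOLS = {"deploy", "delete_file"}  # hypothetical names


def stable_call_id(tool: str, args: dict) -> str:
    """Same tool and arguments give the same id, so an approval covers one exact call."""
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


class AuditLog:
    """In-memory append-only log; each entry's hash covers the previous entry's hash."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, record: dict) -> None:
        prev = self._entries[-1]["hash"] if self._entries else "0" * 64
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
        self._entries.append({"record": record, "prev": prev, "hash": digest})

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev = "0" * 64
        for entry in self._entries:
            body = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def gate(tool: str, args: dict, approved_ids: set[str], log: AuditLog) -> bool:
    """Allow safe tools; allow dangerous ones only with a matching human approval."""
    call_id = stable_call_id(tool, args)
    allowed = tool not in DANGEROUS_TOOLS or call_id in approved_ids
    # Every decision is logged, including denials.
    log.append({"call_id": call_id, "tool": tool, "allowed": allowed})
    return allowed
```

Keying the approval to the call id rather than to the tool name means approving a deploy to one environment does not silently approve a deploy to another. A real log would go to storage the agent cannot write to.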

The honest part

Least-privilege is not free. Confinement adds a little friction, capability scoping is more upfront design, and an audit log is one more thing to store. It will not stop a determined attacker who owns your machine already, and it does not replace dependency hygiene (pin and verify what your agent installs). What it does is shrink the blast radius from "everything" to "this one folder, read-only, logged", which is the difference between a fun demo and something you would let near real data or run inside a regulated organization.

That last point is the one that matters commercially. The agents that get adopted in banking, healthcare, and finance will not be the most capable ones. They will be the ones a compliance officer can reason about: constrained, auditable, on-shore. Trusted beats powerful.

The takeaway

The npm attack is a preview, not an anomaly. As agents touch more, "what is it allowed to do" becomes the whole game. Build tools that are powerful for their job and inert for everything else, and put the dangerous capabilities behind a gate with a paper trail. For the broader case that this unglamorous infrastructure is what actually ships, see "the boring infrastructure that actually ships"; for the read-only tool above, see "why your AI keeps breaking code it can't see."

FAQ

Were Claude Code and Codex hacked?
Not the tools themselves. Attackers shipped 314 malicious npm packages targeting the agents' hooks and startup mechanisms, so a poisoned dependency could run code in the agent's session and steal AWS/SSH/vault credentials. It is a supply-chain attack on what the agent is allowed to run.

What is least privilege for AI agents?
Giving the agent and each tool the narrowest access the task needs: read-only over read-write, one directory over the whole disk, no network unless required, scoped short-lived credentials, and an audit log of every action.

Should I give an AI coding agent full filesystem or shell access?
No, not by default. Full access makes the whole machine the blast radius of any injection, poisoned package, or mistake. Confine it to a workspace, prefer read-only tools, block path traversal, and human-gate anything destructive or outbound.

How do I make MCP tools safe to expose?
Read-only where possible (readOnlyHint), confine file access to an allowlisted root, scope which roles call which tool, keep connections outbound-only with secrets scrubbed, pin the tool list against rug-pulls, log every call, and gate writes/deploys behind human approval.

Does a more constrained agent beat a powerful one?
For anything touching real data, yes. Trust comes from the boundaries, not the model. A modest model confined to one read-only folder with an audit log is safer to run, and the only version a regulated org will allow.