May 26, 2026 · Sai · 4 min read

Workload attestation and the confused deputy

The confused deputy problem dates to a 1988 paper. Agent systems brought it back. SPIFFE-style attestation is one of the cleaner ways out.

Norm Hardy's 1988 paper "The Confused Deputy" is one of those short pieces of computer-science writing that's still load-bearing nearly forty years later. The setup is simple: a program with elevated authority is asked to perform an action on behalf of a less-privileged caller, and can't tell whose authority should apply, its own or the caller's. The classic example was a compiler with write access to a system file: an attacker told the compiler to write debugging output to that path, and the compiler did it, using its own permissions instead of the attacker's.

The agent ecosystem reinvented this problem.

The agent confused deputy

The shape:

Your agent has broad authority — it was minted with a credential that scopes to "anything a release engineer might need to do."
A user — or an upstream agent, or a prompt-injected tool description — asks it to do something.
The agent does it. The downstream system sees the agent's credential, sees a valid action, allows it.

The downstream system has no way to tell whether the action originated from a legitimate request or from a prompt injection. The credential is the agent's; the intent came from somewhere else.
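A minimal sketch of that gap, assuming a scope-based check. The names `AGENT_SCOPES` and `downstream_allows` are hypothetical. The downstream check consults only the agent's scopes, so a legitimate request and an injected one get the same answer.

```python
# Hypothetical scopes granted to the agent's own credential.
AGENT_SCOPES = {"deploy:staging", "deploy:prod", "repo:delete"}


def downstream_allows(credential_scopes: set[str], action: str) -> bool:
    """Ambient-authority check: only the agent's scopes are consulted."""
    return action in credential_scopes


# The origin of the request never reaches the downstream system,
# so both calls below are evaluated identically.
legit = downstream_allows(AGENT_SCOPES, "repo:delete")     # asked by a release engineer
injected = downstream_allows(AGENT_SCOPES, "repo:delete")  # asked by a poisoned tool description
```

Nothing in the function signature could carry the difference between the two calls, which is the whole problem.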

This isn't a new vulnerability class. It's the confused deputy with a chat interface.

Why the standard fixes don't quite work

The classic remediation for confused deputy is capability-based security: rather than the deputy holding its own ambient authority, the caller passes the deputy a specific capability for the specific action. The deputy can't do anything the caller didn't explicitly grant.
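As a rough illustration of the capability idea, here is a toy version of the compiler example. `FileCapability`, `compile_with_debug`, and the in-memory `files` dictionary are all invented for this sketch. The deputy never sees the file store, only the capability it was handed.

```python
class FileCapability:
    """Hypothetical capability: names one file and carries the right to write it."""

    def __init__(self, store: dict[str, str], path: str) -> None:
        self._store = store
        self._path = path

    def write(self, data: str) -> None:
        self._store[self._path] = data


def compile_with_debug(source: str, debug_out: FileCapability) -> None:
    """The deputy: it holds no reference to the file store, only the capability."""
    debug_out.write(f"compiled {len(source)} bytes")


# Toy in-memory file system for the example.
files = {"/sys/billing": "ledger", "/home/alice/debug.log": ""}

# Alice can only hand over capabilities for files she herself may write.
alice_debug = FileCapability(files, "/home/alice/debug.log")
compile_with_debug("print('hi')", alice_debug)
```

Because designation and authority travel together, there is no path string the caller could swap for `/sys/billing`.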

This is a beautiful model. It's also nearly impossible to retrofit onto an ecosystem built on ambient OAuth credentials. Every existing API expects a bearer token, not a capability.

The pragmatic move is to make the credential itself carry the caller's identity, so the downstream system can see both "this is an authorized agent" and "this is being done on behalf of Alice for this task." RFC 8693's act claim (§4.1) is exactly this: the chain is in the token. Combined with audience binding through resource indicators (RFC 8707), the credential is no longer ambient. It's narrow enough that the downstream system can make a real authorization decision.
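To make the token shape concrete, here is a sketch of decoded claims and a downstream check. The issuer, subjects, audience, and the `authorize` helper are made up for illustration. Signature verification is omitted, and `aud` is treated as a single string, although a real JWT may carry a list.

```python
# Illustrative decoded claims; every value here is a placeholder.
claims = {
    "iss": "https://idp.example.com",
    "sub": "alice@example.com",           # on whose behalf the call is made
    "aud": "https://deploy.example.com",  # the one service this token is for
    "act": {"sub": "agent:release-bot"},  # RFC 8693 actor claim: who is acting
}


def authorize(claims: dict, my_audience: str, allowed_pairs: set[tuple[str, str]]) -> bool:
    """Allow only tokens addressed to this service for a known (user, agent) pair."""
    if claims.get("aud") != my_audience:
        return False
    actor = claims.get("act", {}).get("sub")
    return (claims.get("sub"), actor) in allowed_pairs


pairs = {("alice@example.com", "agent:release-bot")}
ok = authorize(claims, "https://deploy.example.com", pairs)
wrong_audience = authorize(claims, "https://billing.example.com", pairs)
```

The same token presented to a different service is rejected, and a token with no actor never matches an allowed pair.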

Where attestation comes in

There's still a missing piece. Even if the credential carries the caller's identity, what's actually holding the credential matters. If a coding-agent runtime is compromised — a supply-chain attack on a Python package, say — the attacker now has a process running with the agent's credentials. The credential will look fine. The attestation won't.

This is what SPIFFE and its reference implementation SPIRE are for. SPIRE attests workload identity at the OS level: the SPIRE agent on a node sees which process is asking for an identity, checks its selectors (UID, container image hash, Kubernetes service account, namespace, etc.) against registered entries, and either issues a credential or doesn't. The node itself is attested separately, with evidence such as an EC2 instance identity document or a GCP instance identity token.

The resulting SVID (SPIFFE Verifiable Identity Document) is bound to that workload. A different process on the same host matches different selectors and gets a different SVID, or none. If the registration pins a binary or image hash, a process whose binary has changed doesn't get one at all until the new hash is explicitly registered.
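A simplified model of selector matching, not SPIRE's actual code. The selector strings are modeled on SPIRE's `type:key:value` convention, but the entry, hashes, and `match_spiffe_id` are illustrative assumptions. An identity is issued only when every selector on a registered entry is observed on the calling process.

```python
from typing import Optional

# Hypothetical registration entries: SPIFFE ID -> selectors that must all be present.
ENTRIES = {
    "spiffe://example.org/release-agent": {
        "k8s:ns:ci",
        "k8s:sa:release-agent",
        "unix:sha256:abc123",
    },
}


def match_spiffe_id(observed: set[str]) -> Optional[str]:
    """Return the SPIFFE ID whose selectors are all observed on the workload, if any."""
    for spiffe_id, required in ENTRIES.items():
        if required <= observed:  # subset test: every required selector was observed
            return spiffe_id
    return None


genuine = match_spiffe_id(
    {"k8s:ns:ci", "k8s:sa:release-agent", "unix:sha256:abc123", "unix:uid:1000"}
)
# Same pod, but the binary hash no longer matches the registered one.
tampered = match_spiffe_id(
    {"k8s:ns:ci", "k8s:sa:release-agent", "unix:sha256:def456", "unix:uid:1000"}
)
```

In this model the tampered process gets no identity at all, which is what makes a swapped binary visible.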

Combined with composite identity, the chain becomes:

Who delegated this — from the user's IdP token.
What's acting — from the agent registry, signed at deploy time.
Where it's running — from SPIRE attestation, signed at process start.

When something goes wrong, the chain answers the forensic question. When something's about to go wrong, the policy engine can refuse the call because one of the three signals doesn't match.
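One way a policy engine might line up the three signals, sketched under assumptions. `CallContext`, the lookup tables, and `mismatches` are hypothetical, and a real engine would verify signatures on each input before comparing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallContext:
    user: str         # who delegated: from the user's IdP token
    agent_build: str  # what's acting: from the agent registry
    workload_id: str  # where it's running: SPIFFE ID from attestation


# Hypothetical policy data.
REGISTERED_BUILDS = {"release-bot@sha256:abc123"}
EXPECTED_WORKLOAD = {"release-bot@sha256:abc123": "spiffe://example.org/release-agent"}
DELEGATIONS = {("alice@example.com", "release-bot@sha256:abc123")}


def mismatches(ctx: CallContext) -> list[str]:
    """Return the signals that fail to line up; an empty list means allow."""
    failed = []
    if (ctx.user, ctx.agent_build) not in DELEGATIONS:
        failed.append("who")
    if ctx.agent_build not in REGISTERED_BUILDS:
        failed.append("what")
    if EXPECTED_WORKLOAD.get(ctx.agent_build) != ctx.workload_id:
        failed.append("where")
    return failed


good = CallContext("alice@example.com", "release-bot@sha256:abc123",
                   "spiffe://example.org/release-agent")
moved = CallContext("alice@example.com", "release-bot@sha256:abc123",
                    "spiffe://example.org/unknown-host")
```

Returning the list of failed signals, rather than a bare boolean, gives the audit trail something to record when a call is refused.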

What this buys you against prompt injection

It doesn't fix prompt injection. Nothing at the credential layer does — prompt injection happens above the credential layer, inside the model's reasoning.

What it does is bound the damage. If a prompt injection convinces the agent to call delete-everything, the credential the call is made with reflects:

which user authorized the agent (so the call is attributable, not anonymous),
which agent build is running (so a forensic team can examine that specific binary's prompts and tools),
which workload is running it (so a compromised replica can be isolated),
what the agent was bound to do (so a policy engine can say "no, the bound task was deploy-staging, not delete-everything").

None of those by themselves prevent the attack. Together, they shrink the blast radius and make the post-incident work tractable. The confused deputy gets a name tag, a job description, and a supervisor.
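The task-binding check in particular is small enough to sketch. The `TASK_ACTIONS` table and `decide` function are assumptions for illustration. The action names follow the deploy-staging example above.

```python
# Hypothetical mapping from a bound task to the actions it may perform.
TASK_ACTIONS = {
    "deploy-staging": {"build", "push-image", "deploy:staging"},
}


def decide(bound_task: str, requested_action: str) -> tuple[bool, str]:
    """Allow the action only if the task bound into the credential covers it."""
    allowed = TASK_ACTIONS.get(bound_task, set())
    if requested_action in allowed:
        return True, "ok"
    return False, f"bound task was {bound_task}, not {requested_action}"


normal = decide("deploy-staging", "deploy:staging")
injected = decide("deploy-staging", "delete-everything")
```

The injected call still reaches the policy engine, but it arrives with a task binding that does not cover it.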

The minimum honest version

If you can't deploy full SPIFFE/SPIRE (most teams can't, today), the minimum honest version is:

Don't share credentials across agents. Even if "agent" is a loose abstraction in your runtime, each instance should hold its own credential.
Bind credentials to a specific task. RFC 8693 token exchange is the load-bearing primitive — exchange the broad agent credential for a narrow task credential at task start.
Log the chain. Whatever your audit format is, make sure every record names the user, the agent, the workload (whatever your weakest version of "workload" is), and the task. If you can't reconstruct the chain from your logs, you can't answer the question.
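The last two items can be sketched together. The parameter names in the request body come from RFC 8693. The token values, the audience URL, the audit field names, and the choice to send the user's token as `subject_token` and the agent's as `actor_token` are assumptions about one plausible setup.

```python
import json
from urllib.parse import parse_qs, urlencode


def token_exchange_body(agent_token: str, user_token: str, audience: str, scope: str) -> str:
    """Form body for an RFC 8693 token-exchange request (token values are placeholders)."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": user_token,  # on whose behalf
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "actor_token": agent_token,   # who is acting
        "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,
        "scope": scope,
    })


def audit_record(user: str, agent: str, workload: str, task: str, action: str) -> str:
    """One log line naming every link in the chain (field names are hypothetical)."""
    return json.dumps(
        {"user": user, "agent": agent, "workload": workload, "task": task, "action": action},
        sort_keys=True,
    )


body = token_exchange_body("agent-token", "user-token",
                           "https://deploy.example.com", "deploy:staging")
fields = parse_qs(body)
record = json.loads(audit_record("alice@example.com", "release-bot",
                                 "spiffe://example.org/release-agent",
                                 "deploy-staging", "deploy:staging"))
```

The body would be POSTed to the authorization server's token endpoint as `application/x-www-form-urlencoded`. The narrow token that comes back is what the agent uses for the task.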

The confused deputy is older than most of the tools we use to fix it. The fixes are clearer now than they've ever been. It mostly takes the willingness to treat credentials as narrow, expiring, and attributable, all at once.
