Agent Permission Guardrails
1. The Bottleneck
Agent capability is outpacing agent governance. Models generate correct SQL, invoke APIs, and modify infrastructure — but the permission model for letting them do so safely in production does not exist at most organizations.
The failure mode is not hallucinated code. It is correctly-generated
destructive operations executed without authorization: DROP TABLE,
TRUNCATE, bulk deletes, credential rotation. The agent works exactly
as designed; the problem is that "as designed" includes operations no
human reviewer approved.
1.1. Probabilistic vs Deterministic Controls
Fowler's Vibesec Reckoning (2026-05-29) draws the distinction clearly:
- Probabilistic: prompts, system instructions, model alignment. "Telling an AI agent to be safe is not the same as enforcing that it is safe. Prompts can be overridden, misunderstood, or ignored."
- Deterministic: wire-level interception, policy engines, non-negotiable rules codified in workflows.
The market is moving toward requiring both. Fowler cites 42% of new enterprise software as AI-generated or AI-assisted, 78% of codebases containing high or critical vulnerabilities, and a 44% year-on-year rise in application vulnerability attacks.
2. Three Enforcement Layers
These are distinct and non-redundant. A complete architecture uses all three.
| Layer | What it controls | Example |
|---|---|---|
| Identity | Who the agent is | MCP OAuth 2.1, API keys |
| Session scope | Which tools/resources the session can access | Cerbos, OpenFGA, OPA |
| Per-operation | What the agent is doing right now | hoop.dev wire-level blocking |
2.1. MCP Authorization Gap
The MCP specification makes authorization optional. OAuth 2.1 with PKCE is supported but not required. Critically, stdio transports — the most common deployment model for Claude Code, Cursor, and other developer tools — are explicitly excluded from the OAuth flow.
MCP handles identity (who the client is) but not per-tool-call authorization. The access token grants access to the MCP server; which tools are available is determined by the server implementation. There are no OAuth scopes at the individual tool level.
This means most developer-facing MCP tooling operates with session-wide permission grants and no per-operation scope.
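To make the gap concrete, here is a minimal sketch of the per-tool-call check that the specification leaves to each server implementation. Everything in it is hypothetical: the `TOOLS` registry, the `granted` set, and `call_tool` are illustrative names, not part of MCP or any MCP SDK.

```python
from typing import Any, Callable, Dict, FrozenSet


class ToolCallDenied(Exception):
    """Raised when a session lacks the grant for the tool it tried to call."""


# Hypothetical tool registry; a real MCP server registers handlers differently.
TOOLS: Dict[str, Callable[..., Any]] = {
    "read_file": lambda path: f"contents of {path}",
    "run_sql": lambda query: f"executed {query}",
}


def call_tool(granted: FrozenSet[str], name: str, **kwargs: Any) -> Any:
    """Dispatch a tool call only if this session was granted that tool."""
    if name not in TOOLS:
        raise ToolCallDenied(f"unknown tool: {name}")
    if name not in granted:
        raise ToolCallDenied(f"session not granted: {name}")
    return TOOLS[name](**kwargs)
```

A session granted only `read_file` can call it, while a call to `run_sql` raises `ToolCallDenied`. Note that even this check is per tool, not per operation: it never looks at the query text passed to `run_sql`.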
3. Wire-Level Enforcement: hoop.dev
hoophq/hoop (v1.82.1, Go, 706 stars, YC-backed) is a layer 7 gateway that parses wire protocols natively:
- Databases: PostgreSQL, MySQL, MSSQL, MongoDB, DynamoDB, Oracle, Redis
- Runtime: Kubernetes (`kubectl`), SSH, HTTP/gRPC, RDP
- AI/agent: MCP server, Claude Code, Cursor
Policy primitives:
- Block by content: `DROP TABLE`, `DELETE` without `WHERE`, `rm -rf`, `kubectl delete namespace`, custom patterns
- Mask by ML classification: PII, PHI, PCI (classified by context, not regex)
- Approval workflow: route risky operations to Slack/Teams, time-bound
- Fail-closed: unknown operations denied by default
- Latency: under 5ms for inline blocking/masking
hoop acts as both Policy Decision Point and Policy Enforcement Point at the wire level. The caller's tooling connects through the gateway without modification — no SDK, no plugin.
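The block, approval, and fail-closed primitives can be sketched as a first-match rule table with a default deny. This is not hoop's configuration format or implementation; the `Rule` type, the `RULES` list, and the patterns are assumptions made up for illustration.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"


@dataclass(frozen=True)
class Rule:
    pattern: str  # regex matched against the operation text
    action: Action


# Illustrative rules only; the patterns and their ordering are assumptions.
RULES: List[Rule] = [
    Rule(r"^\s*DROP\s+TABLE\b", Action.BLOCK),
    Rule(r"^\s*kubectl\s+delete\s+namespace\b", Action.BLOCK),
    Rule(r"^\s*UPDATE\b", Action.REQUIRE_APPROVAL),
    Rule(r"^\s*SELECT\b", Action.ALLOW),
]


def decide(operation: str, rules: List[Rule] = RULES) -> Action:
    """First matching rule wins; anything unmatched is denied (fail-closed)."""
    for rule in rules:
        if re.search(rule.pattern, operation, re.IGNORECASE):
            return rule.action
    return Action.BLOCK
```

The design point is the last line: an operation no rule recognizes, such as `VACUUM FULL` here, is blocked rather than passed through. A `REQUIRE_APPROVAL` result would be where a Slack or Teams approval step hooks in.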
3.1. hoop vs PAM, DLP, and LLM Guardrails
| Category | Where it sits | What it catches |
|---|---|---|
| PAM | Connection layer | Who connects |
| hoop | Wire protocol | What runs after connection |
| DLP | Endpoint/egress | Leaks after data reaches endpoint |
| LLM guardrails | Prompt layer | What the model generates |
4. Policy Decision Points (Application-Layer)
These answer authorization questions when asked. None operate as transparent wire-level proxies. They require the application (or MCP server) to be the Policy Enforcement Point.
4.1. Cerbos
Cerbos — open-source PDP with policy-as-YAML (RBAC, ABAC, PBAC). SDKs for JS, Python, Go, Rust, Java, .NET.
Explicitly targeting MCP: the MCP server calls Cerbos at session start, tools are enabled/disabled per-user based on policy response. Strong for known-scope tool sets. Weaker for dynamically-generated operations (ad-hoc SQL) where content inspection is needed.
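The session-start pattern can be sketched as follows. The code does not use the Cerbos SDK: `pdp_allowed_tools` is a stand-in for the PDP call, and the `POLICY` table, role names, and tool names are hypothetical. In Cerbos the policy would live in YAML and be evaluated by the PDP rather than hard-coded in the server.

```python
from typing import Dict, List, Set

# Hypothetical role-to-tool policy, standing in for policy files held by the PDP.
POLICY: Dict[str, Set[str]] = {
    "analyst": {"list_tables", "run_query"},
    "viewer": {"list_tables"},
}

# Hypothetical tool registry of the MCP server.
ALL_TOOLS: List[str] = ["list_tables", "run_query", "drop_table"]


def pdp_allowed_tools(roles: List[str]) -> Set[str]:
    """Stand-in for the PDP call: union of tools allowed to any of the roles."""
    allowed: Set[str] = set()
    for role in roles:
        allowed |= POLICY.get(role, set())
    return allowed


def enabled_tools_for_session(roles: List[str]) -> List[str]:
    """PEP side: expose only the tools the PDP allowed, in registry order."""
    allowed = pdp_allowed_tools(roles)
    return [tool for tool in ALL_TOOLS if tool in allowed]
```

The limitation described above is visible here: once `run_query` is enabled for an analyst session, nothing in this decision sees the SQL text the agent later passes to it.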
4.2. OpenFGA
OpenFGA — CNCF incubating, Google Zanzibar-inspired relationship-based access control. Answers "can user X do action Y on resource Z?" via a modeling language. Addresses hierarchical authorization (org > workspace > resource), not operation content.
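A toy version of a relationship check over a hierarchy might look like the sketch below. It is an in-memory illustration, not the OpenFGA API or modeling language; the tuple layout, the `parent` and `admin` relation names, and all identifiers are assumptions, and the hierarchy is assumed to be acyclic.

```python
from typing import Set, Tuple

# (object, relation, subject) tuples; every identifier is made up for illustration.
TUPLES: Set[Tuple[str, str, str]] = {
    ("org:acme", "admin", "user:ana"),
    ("workspace:data", "parent", "org:acme"),
    ("table:users", "parent", "workspace:data"),
    ("table:users", "viewer", "user:bo"),
}


def check(user: str, relation: str, obj: str) -> bool:
    """True if `user` holds `relation` (or admin) on `obj` or on any ancestor."""
    if (obj, relation, user) in TUPLES or (obj, "admin", user) in TUPLES:
        return True
    # Walk up the hierarchy; assumes parent links contain no cycles.
    parents = [s for (o, r, s) in TUPLES if o == obj and r == "parent"]
    return any(check(user, relation, p) for p in parents)
```

Here `user:ana` is a viewer of `table:users` only through being admin of the organization two levels up. The question answered is who may touch which resource; nothing in the check depends on what the operation does to it.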
4.3. OPA (Open Policy Agent)
Rego policy language, widely used for Kubernetes admission control and API gateway authorization. No native database proxy or SQL interception. Requires application integration.
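The application integration usually means a call-out to OPA's REST Data API, which accepts a JSON body of the form `{"input": ...}` and returns the decision under `result`. The sketch below assumes an OPA instance on `localhost:8181` and a hypothetical policy path `agents/allow`; the `urlopen` parameter exists only so the function can be exercised without a network.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# Assumed local OPA address and a hypothetical policy path `agents/allow`.
OPA_URL = "http://localhost:8181/v1/data/agents/allow"


def is_allowed(
    input_doc: Dict[str, Any],
    urlopen: Callable = urllib.request.urlopen,
) -> bool:
    """Ask OPA for a decision; a missing (undefined) result is treated as deny."""
    req = urllib.request.Request(
        OPA_URL,
        data=json.dumps({"input": input_doc}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        body = json.load(resp)
    # Only an explicit boolean true counts as allow.
    return body.get("result") is True
```

Network errors propagate as exceptions in this sketch; a caller that wants fail-closed behavior should catch them and deny. The application remains the enforcement point: OPA only answers the question it is asked.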
5. Comparison
| System | Layer | Approach | Protocols | Per-Operation | Open Source |
|---|---|---|---|---|---|
| hoop.dev | Wire | L7 gateway, PDP+PEP | SQL, MCP, k8s, SSH | Yes (content) | Yes (Go) |
| Cerbos | App | PDP, app enforces | Any (via SDK) | Via SDK call | Yes |
| OpenFGA | App | Zanzibar relations | Any (via API) | Via API call | Yes (CNCF) |
| OPA | App | Rego policy | k8s, HTTP | Via call-out | Yes (CNCF) |
| Workday/Sana | SoR | Embedded governance | Workday APIs | Yes | No |
| Zscaler (TBA) | Network | ZTNA extension | TBD | TBD | No |
| FreeBSD Jails | OS | Process isolation | Filesystem | N/A (boundary) | Yes (OS) |
6. The Database Proxy Gap
Outside hoop, there is no widely adopted open-source project that does SQL-level destructive query blocking as a standalone Postgres wire-protocol proxy.
- PgBouncer: connection pooler only, no SQL filtering
- Supavisor: Elixir connection pooler, no query filtering
- pgaudit: logging at extension layer (audit, not enforcement)
- PlanetScale branching: schema-change safety via workflow, not runtime
- Cloudflare Hyperdrive: connection pooling and caching, no policy
A narrower entry point: a sidecar in front of PostgreSQL that blocks destructive DDL and unbounded DML.
Concrete operations to intercept:
- `DROP TABLE`, `DROP DATABASE`, `TRUNCATE`
- `DELETE` without `WHERE` clause
- `UPDATE` affecting >N rows (configurable threshold)
- `ALTER TABLE DROP COLUMN`
- Any DDL in production outside a migration window
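The statically detectable part of that list can be sketched as a statement classifier. This is a rough illustration under stated assumptions, not a proxy: it uses regular expressions rather than a SQL parser, handles one statement at a time, and ignores comments, CTEs, and multi-statement queries. The function and pattern names are hypothetical. The row-count threshold for `UPDATE` and the migration-window rule are left out because they need a row estimate or deployment state that the statement text alone does not provide.

```python
import re
from typing import Tuple

# Hypothetical patterns mirroring the list above; not taken from any real proxy.
_DESTRUCTIVE_DDL = re.compile(
    r"^\s*(DROP\s+(TABLE|DATABASE)|TRUNCATE)\b", re.IGNORECASE
)
_DROP_COLUMN = re.compile(
    r"^\s*ALTER\s+TABLE\b.*\bDROP\s+COLUMN\b", re.IGNORECASE | re.DOTALL
)
_DELETE = re.compile(r"^\s*DELETE\s+FROM\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def check_statement(sql: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a single SQL statement."""
    stmt = sql.strip().rstrip(";")
    if _DESTRUCTIVE_DDL.match(stmt):
        return False, "destructive DDL"
    if _DROP_COLUMN.match(stmt):
        return False, "column drop"
    if _DELETE.match(stmt) and not _WHERE.search(stmt):
        return False, "DELETE without WHERE"
    return True, "ok"
```

In a sidecar, a check like this would run on the query text extracted from the Postgres wire protocol, and a denied statement would be answered with an error instead of being forwarded. A production version would want a real SQL parser, since pattern matching is easy to evade.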
A live demo showing an agent attempting TRUNCATE users and being
blocked at the wire is more compelling than a slide deck about policy
engines.
7. Sandboxes vs Guardrails
Distinct mechanisms, complementary:
- Guardrails operate inline on operations an agent attempts. Block `DROP TABLE` before it reaches the database.
- Sandboxes bound the blast radius of operations that succeed. The agent runs in an isolated VM/jail; damage is contained.
exe.dev is a VM provider (hypervisor-level isolation, HTTPS access), in the same tier as Modal, Fly.io, and E2B: compute isolation, not operation-level policy. Its relevance to agents is that untrusted agent code runs in a discardable VM. It is a sandbox product, not a guardrail product.
8. AI Tinkerers Boston: L6 Trust / L7 Coordination
The March 2026 AI Tinkerers Boston meetup included a presentation by Will Sergeant: Agentic Trust in the Age of Dangerously Skipping Permissions.
Central thesis: current agentic frameworks operate with session-wide permission grants at startup. No per-operation scope, no task-level boundary, no least-privilege scoping within a session. "Dangerously skipping permissions" is the default posture, not a misconfiguration.
Failure modes identified:
- Tool-call injection: malicious document content triggers tool execution
- Credential exposure: agents read `.env` and `.ssh` as routine I/O
- Audit trail gaps: agentic decisions absent from app logs (HIPAA/SOC 2)
- Multi-agent lateral movement: compromised sub-agent poisons orchestrator
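Of these, credential exposure is the easiest to illustrate with a deterministic check. The sketch below is not from the presentation; it is a minimal path guard a file-read tool could apply before opening anything, with a hypothetical `SENSITIVE_NAMES` set limited to the two names mentioned above.

```python
from pathlib import PurePosixPath

# Illustrative denylist; a real deployment would need a broader, reviewed set.
SENSITIVE_NAMES = {".env", ".ssh"}


def is_sensitive(path: str) -> bool:
    """True if any component of the path is a listed credential file or directory."""
    return any(part in SENSITIVE_NAMES for part in PurePosixPath(path).parts)
```

This matches whole path components only, so `.env.local` passes, and it does not resolve symlinks or `..` segments. It shows the shape of a per-operation check, not a complete defense.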
The presentation observed that every other demo at the meetup (dark factories, drone control, code pre-review) assumed a permission model that was either absent or trivially bypassable. "Nobody asked about fail-safes in Q&A" during the drone demo.
9. Tech Week Boston 2026
The AI Infra track at Tech Week Boston 2026 covers adjacent topics.
10. Related Work
- Agent Isolation with FreeBSD Jails — OS-level sandboxing for compute
- Agent Sandbox Architectures — comparison across compute, filesystem, network, secret custody
- Agent Identity and Attestation — who is the agent acting as
- Walsh-Research Bot Compliance Spec — outbound agent behavior rules