Agent Permission Guardrails


1. The Bottleneck

Agent capability is outpacing agent governance. Models generate correct SQL, invoke APIs, and modify infrastructure — but the permission model for letting them do so safely in production does not exist at most organizations.

The failure mode is not hallucinated code. It is correctly generated destructive operations executed without authorization: DROP TABLE, TRUNCATE, bulk deletes, credential rotation. The agent works exactly as designed; the problem is that "as designed" includes operations no human reviewer approved.

1.1. Probabilistic vs Deterministic Controls

Fowler's Vibesec Reckoning (2026-05-29) draws the distinction clearly:

  • Probabilistic: prompts, system instructions, model alignment. "Telling an AI agent to be safe is not the same as enforcing that it is safe. Prompts can be overridden, misunderstood, or ignored."
  • Deterministic: wire-level interception, policy engines, non-negotiable rules codified in workflows.

The market is moving toward requiring both. Fowler cites 42% of new enterprise software as AI-generated or AI-assisted, 78% of codebases containing high or critical vulnerabilities, and a 44% year-on-year rise in application vulnerability attacks.

2. Three Enforcement Layers

These are distinct and non-redundant. A complete architecture uses all three.

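| Layer         | What it controls                             | Example                      |
|---------------|----------------------------------------------|------------------------------|
| Identity      | Who the agent is                             | MCP OAuth 2.1, API keys      |
| Session scope | Which tools/resources the session can access | Cerbos, OpenFGA, OPA         |
| Per-operation | What the agent is doing right now            | hoop.dev wire-level blocking |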

2.1. MCP Authorization Gap

The MCP specification makes authorization optional. OAuth 2.1 with PKCE is supported but not required. Critically, stdio transports — the most common deployment model for Claude Code, Cursor, and other developer tools — are explicitly excluded from the OAuth flow.

MCP handles identity (who the client is) but not per-tool-call authorization. The access token grants access to the MCP server; which tools are available is determined by the server implementation. There are no OAuth scopes at the individual tool level.

This means most developer-facing MCP tooling operates with session-wide permission grants and no per-operation scope.
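The difference between a session-wide grant and a per-operation check can be sketched as follows. This is a minimal illustration, not part of the MCP specification or any SDK: `Session`, `session_allows`, `call_allows`, and `read_only_sql` are hypothetical names, and the SELECT-prefix test is deliberately naive.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Mapping

# Hypothetical type: a per-tool predicate over the call's arguments.
ArgCheck = Callable[[Mapping[str, Any]], bool]


@dataclass(frozen=True)
class Session:
    subject: str
    granted_tools: FrozenSet[str]  # decided once, at session start


def session_allows(session: Session, tool: str) -> bool:
    """Session-wide grant: only asks whether the tool was enabled."""
    return tool in session.granted_tools


def call_allows(
    session: Session,
    tool: str,
    arguments: Mapping[str, Any],
    checks: Dict[str, ArgCheck],
) -> bool:
    """Per-operation check: the tool must be granted AND its arguments must pass.

    A tool with no registered argument check is denied (fail closed).
    """
    if tool not in session.granted_tools:
        return False
    check = checks.get(tool)
    return check is not None and check(arguments)


def read_only_sql(arguments: Mapping[str, Any]) -> bool:
    # Naive illustration only: a prefix test is not a real SQL safety check.
    query = str(arguments.get("query", ""))
    return query.lstrip().upper().startswith("SELECT")


session = Session(subject="agent-1", granted_tools=frozenset({"run_sql"}))
checks = {"run_sql": read_only_sql}
```

With only the session-wide grant, `session_allows(session, "run_sql")` is true regardless of the query. The per-call check is what distinguishes a SELECT from a DROP TABLE sent through the same tool.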

3. Wire-Level Enforcement: hoop.dev

hoophq/hoop (v1.82.1, Go, 706 stars, YC-backed) is a layer 7 gateway that parses wire protocols natively:

  • Databases: PostgreSQL, MySQL, MSSQL, MongoDB, DynamoDB, Oracle, Redis
  • Runtime: Kubernetes (kubectl), SSH, HTTP/gRPC, RDP
  • AI/agent: MCP server, Claude Code, Cursor

Policy primitives:

  • Block by content: DROP TABLE, DELETE without WHERE, rm -rf, kubectl delete namespace, custom patterns
  • Mask by ML classification: PII, PHI, PCI — classified by context, not regex
  • Approval workflow: route risky operations to Slack/Teams, time-bound
  • Fail-closed: unknown operations denied by default
  • Latency: under 5ms for inline blocking/masking

hoop acts as both Policy Decision Point and Policy Enforcement Point at the wire level. The caller's tooling connects through the gateway without modification — no SDK, no plugin.
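The fail-closed and approval-routing primitives can be sketched as a first-match rule list with a deny default. This is an assumption-laden illustration, not hoop's configuration format or API: `Action`, `Rule`, and `decide` are hypothetical names, and the regular expressions stand in for real protocol parsing.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import List, Pattern


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REVIEW = "review"  # route to a human approval workflow


@dataclass(frozen=True)
class Rule:
    pattern: Pattern[str]
    action: Action


def decide(operation: str, rules: List[Rule]) -> Action:
    """Return the action of the first matching rule.

    Operations that match no rule are blocked (fail closed).
    """
    for rule in rules:
        if rule.pattern.search(operation):
            return rule.action
    return Action.BLOCK


# Hypothetical rule set; order matters because the first match wins.
RULES = [
    Rule(re.compile(r"^\s*DROP\s+TABLE\b", re.IGNORECASE), Action.BLOCK),
    Rule(re.compile(r"^\s*ALTER\s+TABLE\b", re.IGNORECASE), Action.REVIEW),
    Rule(re.compile(r"^\s*SELECT\b", re.IGNORECASE), Action.ALLOW),
]
```

The design choice worth noting is the last line of `decide`: an allow-list with a deny default means a new or unrecognized operation type is blocked until someone writes a rule for it.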

3.1. hoop vs PAM, DLP, and LLM Guardrails

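| Category       | Where it sits    | What it catches                   |
|----------------|------------------|-----------------------------------|
| PAM            | Connection layer | Who connects                      |
| hoop           | Wire protocol    | What runs after connection        |
| DLP            | Endpoint/egress  | Leaks after data reaches endpoint |
| LLM guardrails | Prompt layer     | What the model generates          |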

4. Policy Decision Points (Application-Layer)

These answer authorization questions when asked. None operate as transparent wire-level proxies. They require the application (or MCP server) to be the Policy Enforcement Point.

4.1. Cerbos

Cerbos — open-source PDP with policy-as-YAML (RBAC, ABAC, PBAC). SDKs for JS, Python, Go, Rust, Java, .NET.

Cerbos explicitly targets MCP: the MCP server calls Cerbos at session start, and tools are enabled or disabled per user based on the policy response. This works well for tool sets whose scope is known in advance. It is weaker for dynamically generated operations (such as ad-hoc SQL), where content inspection is needed.
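The session-start flow can be sketched as a filter over the server's tool list. The `is_allowed` callable below stands in for a request to the PDP; it is not the Cerbos SDK, and `Principal`, `ROLE_TOOLS`, and `stub_pdp` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Principal:
    id: str
    roles: Tuple[str, ...]


def enabled_tools(
    principal: Principal,
    tools: Iterable[str],
    is_allowed: Callable[[Principal, str], bool],
) -> List[str]:
    """Ask the PDP once per tool at session start; keep only permitted tools."""
    return [tool for tool in tools if is_allowed(principal, tool)]


# Hypothetical stand-in for a policy decision point.
ROLE_TOOLS = {
    "analyst": {"query_readonly"},
    "admin": {"query_readonly", "run_migration"},
}


def stub_pdp(principal: Principal, tool: str) -> bool:
    return any(tool in ROLE_TOOLS.get(role, set()) for role in principal.roles)


ALL_TOOLS = ["query_readonly", "run_migration"]
```

The decision here is keyed on the tool name and the principal, which is why this pattern cannot distinguish a harmless call from a destructive one made through the same tool.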

4.2. OpenFGA

OpenFGA — CNCF incubating, Google Zanzibar-inspired relationship-based access control. Answers "can user X do action Y on resource Z?" via a modeling language. Addresses hierarchical authorization (org > workspace > resource), not operation content.
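A hierarchical relationship check can be sketched with plain data structures. This does not use the OpenFGA API or modeling language, where inheritance is declared in the authorization model; `check`, the tuple set, and the `parents` map are hypothetical simplifications.

```python
from typing import Dict, Optional, Set, Tuple

RelationTuple = Tuple[str, str, str]  # (user, relation, object)


def check(
    user: str,
    relation: str,
    obj: str,
    tuples: Set[RelationTuple],
    parents: Dict[str, str],
) -> bool:
    """Return True if the user holds the relation on obj or on any ancestor."""
    current: Optional[str] = obj
    seen: Set[str] = set()
    while current is not None and current not in seen:
        if (user, relation, current) in tuples:
            return True
        seen.add(current)  # guards against cycles in the parent map
        current = parents.get(current)
    return False


# Hypothetical data: org > workspace > resource.
TUPLES = {("user:ana", "editor", "org:acme")}
PARENTS = {
    "resource:db1": "workspace:prod",
    "workspace:prod": "org:acme",
}
```

Here `check("user:ana", "editor", "resource:db1", TUPLES, PARENTS)` succeeds because the grant on the organization is inherited. Nothing in the check looks at what the editor is about to do to `resource:db1`.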

4.3. OPA (Open Policy Agent)

Rego policy language, widely used for Kubernetes admission control and API gateway authorization. No native database proxy or SQL interception. Requires application integration.

5. Comparison

| System        | Layer   | Approach            | Protocols          | Per-Operation  | Open Source |
|---------------|---------|---------------------|--------------------|----------------|-------------|
| hoop.dev      | Wire    | L7 gateway, PDP+PEP | SQL, MCP, k8s, SSH | Yes (content)  | Yes (Go)    |
| Cerbos        | App     | PDP, app enforces   | Any (via SDK)      | Via SDK call   | Yes         |
| OpenFGA       | App     | Zanzibar relations  | Any (via API)      | Via API call   | Yes (CNCF)  |
| OPA           | App     | Rego policy         | k8s, HTTP          | Via call-out   | Yes (CNCF)  |
| Workday/Sana  | SoR     | Embedded governance | Workday APIs       | Yes            | No          |
| Zscaler (TBA) | Network | ZTNA extension      | TBD                | TBD            | No          |
| FreeBSD Jails | OS      | Process isolation   | Filesystem         | N/A (boundary) | Yes (OS)    |

6. The Database Proxy Gap

Outside hoop, no widely adopted open-source project blocks destructive SQL as a standalone PostgreSQL wire-protocol proxy.

  • PgBouncer: connection pooler only, no SQL filtering
  • Supavisor: Elixir connection pooler, no query filtering
  • pgaudit: logging at extension layer (audit, not enforcement)
  • PlanetScale branching: schema-change safety via workflow, not runtime
  • Cloudflare Hyperdrive: connection pooling and caching, no policy

A narrower entry point: a sidecar in front of PostgreSQL that blocks destructive DDL and unbounded DML.

Concrete operations to intercept:

  • DROP TABLE, DROP DATABASE, TRUNCATE
  • DELETE without WHERE clause
  • UPDATE affecting >N rows (configurable threshold)
  • ALTER TABLE DROP COLUMN
  • Any DDL in production outside a migration window
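The statically detectable items in this list can be sketched as a statement classifier. This is a minimal sketch under stated assumptions: `classify` and the rule names are hypothetical, each input is a single statement, and regular expressions stand in for a real SQL parser. It does not cover the row-count threshold for UPDATE or the migration-window rule, both of which need runtime context.

```python
import re
from typing import Optional

# Hypothetical rules mirroring the list above. Leading comments,
# multi-statement strings, and WHERE inside a string literal or subquery
# would defeat these patterns; a real proxy should parse the SQL.
_BLOCK_PATTERNS = [
    ("drop", re.compile(r"^\s*DROP\s+(TABLE|DATABASE)\b", re.IGNORECASE)),
    ("truncate", re.compile(r"^\s*TRUNCATE\b", re.IGNORECASE)),
    (
        "drop_column",
        re.compile(
            r"^\s*ALTER\s+TABLE\b.*\bDROP\s+COLUMN\b",
            re.IGNORECASE | re.DOTALL,
        ),
    ),
]
_DELETE = re.compile(r"^\s*DELETE\s+FROM\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def classify(statement: str) -> Optional[str]:
    """Return the name of the rule that blocks the statement, or None."""
    for name, pattern in _BLOCK_PATTERNS:
        if pattern.search(statement):
            return name
    if _DELETE.search(statement) and not _WHERE.search(statement):
        return "delete_without_where"
    return None
```

In a sidecar, a check like this would run on each query message before it is forwarded to PostgreSQL, with a non-None result turned into an error response to the client instead of a forwarded query.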

A live demo showing an agent attempting TRUNCATE users and being blocked at the wire is more compelling than a slide deck about policy engines.

7. Sandboxes vs Guardrails

Distinct mechanisms, complementary:

  • Guardrails operate inline on operations an agent attempts. Block DROP TABLE before it reaches the database.
  • Sandboxes bound the blast radius of operations that succeed. The agent runs in an isolated VM/jail; damage is contained.

exe.dev is a VM provider (hypervisor-level isolation, HTTPS access). It sits in the same tier as Modal, Fly.io, and E2B: compute isolation, not operation-level policy. Its relevance to agents is that untrusted agent code runs in a discardable VM. It is a sandbox product, not a guardrail product.

8. AI Tinkerers Boston: L6 Trust / L7 Coordination

The March 2026 AI Tinkerers Boston meetup included a presentation by Will Sergeant: Agentic Trust in the Age of Dangerously Skipping Permissions.

Central thesis: current agentic frameworks operate with session-wide permission grants at startup. No per-operation scope, no task-level boundary, no least-privilege scoping within a session. "Dangerously skipping permissions" is the default posture, not a misconfiguration.

Failure modes identified:

  • Tool-call injection: malicious document content triggers tool execution
  • Credential exposure: agents read .env and .ssh as routine I/O
  • Audit trail gaps: agentic decisions absent from app logs (HIPAA/SOC 2)
  • Multi-agent lateral movement: compromised sub-agent poisons orchestrator

The presentation observed that every other demo at the meetup (dark factories, drone control, code pre-review) assumed a permission model that was either absent or trivially bypassable. During the drone demo, "nobody asked about fail-safes in Q&A."

9. Tech Week Boston 2026

The AI Infra track at Tech Week Boston 2026 covers adjacent topics.

10. Related Work