Agent Sandbox Architectures: Where the Boundary Sits, and Who Holds the Key
A decomposition exercise: Cloudflare Sandbox SDK, Docker Sandboxes (sbx), Deno Sandbox, the browser, and what the local homelab still lacks

1. What "sandbox" names

"Sandbox" is one word for at least four distinct isolations.

1. Compute boundary:    the kernel/VM line untrusted code cannot cross.
2. Filesystem custody:  what the code can read and write.
3. Network egress:      what hosts the code can reach.
4. Secret custody:      credentials the code uses but must not exfiltrate.

A fifth concern — lifecycle (ephemeral, snapshot, volume) — is orthogonal and rides on top of the four.

The four have different threat models and different enforcement points. Most products that present a single "sandbox" guarantee are quietly conflating two or more of them, and the conflation is usually between (1) and the other three: vendors sell the compute boundary — "microVM isolation, hard security boundary" — and leave (3) and (4) to a config file the operator may never write. The interesting question is not "which sandbox is most isolated" but "where does each draw the four boundaries, and what gets complected as a result?"

The threat model has shifted, and the shift is the whole story. We are no longer sandboxing untrusted plugins. We are sandboxing untrusted code that we ourselves generated, that runs without review, that carries real credentials, and that an attacker can steer by prompt injection. Ryan Dahl's framing of Deno Sandbox names it exactly: "LLM-generated code, calling external APIs with real credentials, without human review. Sandboxing the compute isn't enough."

2. Reference axiom: both isolations, or neither

Anthropic's sandbox-runtime README states the invariant this document tests against:

"Both filesystem and network isolation are required for effective sandboxing. Without file isolation, a compromised process could exfiltrate SSH keys or other sensitive files. Without network isolation, a process could escape the sandbox and gain unrestricted network access."

Read it as a falsification condition. A system that isolates the filesystem but lets the process open arbitrary sockets has not built a sandbox; it has built a launchpad. A system that blocks egress but mounts ~/.ssh has built a different launchpad. The axiom is the conjunction, and the conjunction is where most of the field is still weak — not because compute isolation is hard (microVMs solved that) but because egress and secret custody are policy problems, and policy defaults to permissive.
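The falsification condition can be written down as a predicate. The sketch below is illustrative only: the type and function names are assumptions of this example, not part of any product's API. It separates "enforced at all" from "enforced without relying on operator configuration".

```typescript
// How a boundary is enforced: by construction, by operator config, or not at all.
type Enforcement = "structural" | "declared" | "absent";

// One field per isolation named in section 1 (field names are illustrative).
interface SandboxPosture {
  compute: Enforcement;
  filesystem: Enforcement;
  networkEgress: Enforcement;
  secretCustody: Enforcement;
}

// The axiom is a conjunction: filesystem AND network isolation must both be enforced.
function satisfiesAxiom(p: SandboxPosture): boolean {
  return p.filesystem !== "absent" && p.networkEgress !== "absent";
}

// Stricter reading: the conjunction holds even if the operator writes no policy.
function satisfiedByConstruction(p: SandboxPosture): boolean {
  return p.filesystem === "structural" && p.networkEgress === "structural";
}

// Hypothetical posture: hard microVM, workspace-only mount, egress left open.
const hardBoxOpenEgress: SandboxPosture = {
  compute: "structural",
  filesystem: "structural",
  networkEgress: "absent",
  secretCustody: "absent",
};
```

Under these assumptions, satisfiesAxiom(hardBoxOpenEgress) is false however strong the compute boundary is, which is the launchpad case described above.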

The contract has three properties worth naming:

  • The boundary is structural, the policy is declared. The microVM or container gives you (1) for free. Whether you get (3) and (4) depends on configuration the operator supplies. Capability is structural; safety is declarative.
  • The chokepoint is the egress proxy. Every system that controls network egress does it at one point — an outbound proxy the sandbox cannot bypass. coder/httpjail and the Worker request proxy are the same shape: one place where policy is enforced, because one place is the only place you can enforce it.
  • Secrets are the asymmetric risk. A blocked egress is an annoyance; an exfiltrated long-lived credential is a breach that outlives the sandbox. Secret custody is the axis where the cost of getting it wrong is unbounded, and it is the axis the field has decomposed least.
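The chokepoint property reduces to one default-deny decision per outbound request. A minimal sketch, assuming an allowlist of exact hostnames and "*.suffix" wildcards; the pattern syntax and the function names are assumptions of this example and do not describe how coder/httpjail or any vendor proxy is implemented.

```typescript
// Returns true when `host` matches an allowlist pattern.
// Patterns are exact hostnames or "*.suffix" wildcards (an assumed syntax).
function hostMatches(host: string, pattern: string): boolean {
  const h = host.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.startsWith("*.")) {
    const suffix = p.slice(1); // ".example.com"
    // Require at least one character before the suffix, so the bare suffix does not match.
    return h.endsWith(suffix) && h.length > suffix.length;
  }
  return h === p;
}

type Verdict = { allow: true } | { allow: false; reason: string };

// The single decision point: deny by default, allow only what the operator declared.
function decideEgress(url: string, allowlist: string[]): Verdict {
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return { allow: false, reason: "unparseable URL" };
  }
  return allowlist.some((p) => hostMatches(host, p))
    ? { allow: true }
    : { allow: false, reason: `host ${host} not in allowlist` };
}
```

An empty allowlist denies everything, so the permissive default the text warns about has to be written down explicitly rather than inherited.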

That third property is the one this document organizes around.

3. The field, decomposed

3.1. Docker Sandboxes (sbx) — microVM for local agents

The compute boundary is a dedicated microVM per agent; the host stays untouched. Filesystem custody is "only your project workspace mounted in" — a sharp, defensible default. Network and filesystem controls are "controls you define," which is the honest admission that (3) is policy, not structure. The headline is that --dangerously-skip-permissions is the default: YOLO mode is safe precisely because the box around it is hard.

What is decomposed: the host from the agent. Cleanly. The agent can install packages, rewrite configs, even spin up its own Docker containers, and none of it touches the host.

What is complected: autonomy with configured policy. The microVM is structural; the network allowlist is not. Out of the box you get a hard wall around a workspace with broadly open egress — fine for "run the test suite," load-bearing-but-empty for "don't let prompt-injected code POST my repo to evil.com." Secret custody exists (sbx secret) but injects into the sandbox environment; the secret is present, and present means exfiltratable.

3.2. Cloudflare Sandbox SDK — credential custody as a separate tier

The compute boundary is a Container on Workers. Filesystem is a full Linux environment inside the container. Lifecycle composes cleanly with persistence: ephemeral by default, with R2/S3 buckets mounted as local filesystems for state that outlives the box.

The decomposition that matters is secret custody. Cloudflare's request-proxying pattern keeps credentials in the Worker and never in the sandbox: "A Worker proxy validates short-lived JWT tokens from the sandbox and injects real credentials at request time." The sandbox holds a capability (a short-lived JWT), not the credential. Compromise of the sandbox yields a token that expires, not a key that persists.
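The capability-versus-credential split can be sketched in a few lines. This is not Cloudflare's implementation: an opaque, non-cryptographic token stands in for the signed JWT, and CredentialProxy, issue, and outboundHeaders are hypothetical names. The point it illustrates is only that the real key lives on the trusted side and the sandbox holds something that expires.

```typescript
interface Grant {
  sandboxId: string;
  expiresAt: number; // epoch milliseconds
}

class CredentialProxy {
  private realKey: string; // lives only on the trusted side of the boundary
  private ttlMs: number;
  private grants = new Map<string, Grant>();
  private counter = 0;

  constructor(realKey: string, ttlMs: number) {
    this.realKey = realKey;
    this.ttlMs = ttlMs;
  }

  // Issue a short-lived capability for one sandbox.
  // NOT secure: a real proxy would sign and verify a JWT instead of keeping a table.
  issue(sandboxId: string, now: number): string {
    this.counter += 1;
    const token = `tok-${sandboxId}-${this.counter}`;
    this.grants.set(token, { sandboxId, expiresAt: now + this.ttlMs });
    return token;
  }

  // Validate the token, then build the headers actually sent upstream.
  // Returns null when the token is unknown or expired.
  outboundHeaders(token: string, now: number): Record<string, string> | null {
    const grant = this.grants.get(token);
    if (!grant || now >= grant.expiresAt) return null;
    return { Authorization: `Bearer ${this.realKey}` };
  }
}
```

A sandbox that leaks its token leaks something the proxy stops honoring once expiresAt passes; the key itself never appears in anything the sandbox can read.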

What is complected: the sandbox lifecycle with the Workers request model. The box is a Durable-Object-backed thing reached through getSandbox(env.Sandbox, 'user-123'); it is excellent if your control plane is already a Worker, and an impedance mismatch if it is a homelab launchd job on a Mac Mini.

3.3. Deno Sandbox — the secret that is never present

Sub-second-boot microVMs; volumes and snapshots for persistence; a 30-minute maximum lifetime. The technical envelope is modest (2 vCPU, up to 4 GB) and explicitly aimed at "AI agents executing code."

The move that occupies an empty cell: secrets never enter the environment. Code sees DENO_SECRET_PLACEHOLDER_b14043a2...; the real key materializes only when the sandbox makes an outbound request to a pre-approved host.

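import { Sandbox } from "@deno/sandbox";

// The sandbox is disposed automatically when this binding goes out of scope.
await using sandbox = await Sandbox.create({
  secrets: {
    OPENAI_API_KEY: {
      // The real value is substituted only on requests to these hosts.
      hosts: ["api.openai.com"],
      value: process.env.OPENAI_API_KEY,
    },
  },
  // Egress allowlist: every other destination is blocked.
  allowNet: ["api.openai.com", "*.anthropic.com"],
});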

Prompt-injected code that exfiltrates the placeholder to evil.com exfiltrates a string with no value. The key binds to a destination, not to a process. This is the cleanest cut on the secret-custody axis in the field: it separates the secret's value from the secret's presence. The enforcement point is, again, an outbound proxy (Deno's docs cite coder/httpjail as the model), which is the same chokepoint everyone converges on.
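The destination binding can be sketched as a substitution step at the proxy. This is a guess at the shape, not Deno's implementation: SecretBinding, OutboundRequest, and materialize are hypothetical names, and the placeholder string in the usage note is made up.

```typescript
interface SecretBinding {
  placeholder: string; // what code inside the sandbox sees
  value: string;       // the real secret, held only by the proxy
  hosts: string[];     // destinations where the value may materialize
}

interface OutboundRequest {
  url: string;
  headers: Record<string, string>;
}

// Swap placeholders for real values only when the destination is approved for that secret.
function materialize(req: OutboundRequest, bindings: SecretBinding[]): OutboundRequest {
  const host = new URL(req.url).hostname;
  const headers: Record<string, string> = {};
  for (const [name, raw] of Object.entries(req.headers)) {
    let out = raw;
    for (const b of bindings) {
      if (b.hosts.includes(host)) {
        out = out.split(b.placeholder).join(b.value);
      }
    }
    headers[name] = out;
  }
  return { url: req.url, headers };
}
```

With a binding of { placeholder: "PLACEHOLDER_abc", value: "sk-real", hosts: ["api.openai.com"] }, a request to api.openai.com leaves the proxy carrying sk-real, while the same header sent to evil.com still carries PLACEHOLDER_abc. In the real product the egress allowlist would presumably drop the second request as well; the sketch isolates the substitution rule.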

What is complected: the production path. sandbox.deploy() is frictionless precisely because it ties the sandbox to Deno Deploy — sandbox and serverless host are the same vendor, so dev-to-prod is one call. That is composition for a Deno shop and lock-in for everyone else.

3.4. The browser — the 30-year-old sandbox, reused

Paul Kinlan's "the browser is the sandbox" tests the hypothesis that the origin sandbox — built to run hostile untrusted code the instant you tap a link — is good enough for agentic file work. The decomposition is honest about where it holds and where it does not.

  • Compute boundary: the origin sandbox. Mature, battle-tested, free.
  • Filesystem: the File System Access API gives a chroot-like handle to one user-selected directory — read/write within, no access to siblings or parents. Layer it with the origin-private filesystem and you can edit a copy while leaving the original intact.
  • Network egress: this is where the model breaks. "Unless you have an entirely client-side LLM, you can't" fully control egress — the data must leave to reach the model. CSP is "our friend" here, but it is partial. The classic aporia: an <img> whose URL encodes sensitive file contents is an expected web behavior and an exfiltration channel at once. You cannot separate "render an image" from "send data to the image's host" without giving up the first.

What is complected: display and egress. The browser's whole purpose is to fetch and render from anywhere; that purpose is an egress channel. The filesystem story is genuinely good; the network story is the reference axiom's second clause, unmet, and CSP only narrows it.
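How CSP narrows the channel without closing it can be shown with a small policy builder. The directive names are standard CSP; the particular policy, the function name, and the example origin are assumptions of this sketch rather than anything taken from Kinlan's write-up.

```typescript
// Build a Content-Security-Policy value that denies by default and opens one model origin.
function buildCsp(modelOrigin: string): string {
  const directives: Record<string, string[]> = {
    "default-src": ["'none'"],
    "script-src": ["'self'"],
    "style-src": ["'self'"],
    // fetch, XHR, and WebSocket may reach only the model endpoint.
    "connect-src": [modelOrigin],
    // No remote image hosts, so an <img> URL cannot carry data to a third party.
    "img-src": ["'self'", "blob:"],
  };
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}
```

Calling buildCsp("https://llm.example") yields a policy whose only off-origin destination is the model endpoint. That endpoint remains an egress path by design, and directives like these do not cover every channel a page has, which is why the text calls CSP partial.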

3.5. macOS sandbox-exec / Seatbelt — the profile as the contract

The substrate Anthropic's experiment and Co-do both gesture at: sandbox-exec applies a Scheme-syntax profile (Seatbelt) to a process, allow/deny on file paths and network operations. The boundary is the host kernel, not a VM. It is the lightest-weight option and the one with the worst ergonomics — the profile language is undocumented, deprecated-but-present, and unforgiving. It decomposes (2) and (3) into one declarative profile; it complects nothing because it is nothing but policy. Its cost is that you author that policy in a language Apple stopped documenting a decade ago.
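To make "nothing but policy" concrete, the sketch below renders a minimal deny-by-default profile from two path lists. It uses only forms commonly seen in published Seatbelt profiles ((version 1), (deny default), file-read*, file-write*, subpath); because the language is undocumented, treat the output as illustrative. A profile that actually runs a real program needs many more allowances, and renderProfile is a hypothetical helper.

```typescript
// Render a minimal Seatbelt-style profile: deny everything, then allow
// reads and writes under the given directory trees.
function renderProfile(readPaths: string[], writePaths: string[]): string {
  // Escape backslashes and double quotes for a quoted string literal.
  const quote = (p: string): string =>
    `"${p.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;

  const lines: string[] = ["(version 1)", "(deny default)"];
  for (const p of readPaths) {
    lines.push(`(allow file-read* (subpath ${quote(p)}))`);
  }
  for (const p of writePaths) {
    lines.push(`(allow file-write* (subpath ${quote(p)}))`);
  }
  // Network stays denied because nothing here allows it.
  return lines.join("\n");
}
```

The rendered text would be saved to a file and passed to sandbox-exec with -f. Filesystem and network custody both live in that one file, and nothing in it is structural.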

4. The axis the field is still building: secret custody

The companion decomposition of agent memory found a reactive/proactive axis with an empty substrate-initiated cell. Sandboxes have an analogous axis, and it is where the secret lives. Three positions, increasing in safety:

| Position                | Where the credential is      | Compromise yields        | Occupied by                   |
|-------------------------|------------------------------|--------------------------|-------------------------------|
| Secret in environment   | env var inside the box       | the live, long-lived key | sbx, sandbox-exec, most local |
| Secret in trusted proxy | the Worker; box holds a JWT  | a short-lived token      | Cloudflare request proxying   |
| Secret never present    | placeholder; binds at egress | a useless string         | Deno Sandbox                  |

The distinction is what an attacker who fully owns the sandbox walks away with. Position one hands them the key. Position two hands them a token that expires. Position three hands them a placeholder bound to a host they do not control.
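The three positions can be modeled as a tagged union, with one function answering the question the paragraph poses: what does an attacker who owns the box walk away with? The type and field names are illustrative assumptions.

```typescript
type Custody =
  | { kind: "in-environment"; key: string }
  | { kind: "trusted-proxy"; token: string; expiresAt: number }
  | { kind: "never-present"; placeholder: string; hosts: string[] };

interface Loot {
  artifact: string;          // what can be copied out of the sandbox
  valuableOutside: boolean;  // whether it still grants authority once exfiltrated
}

// What full compromise of the sandbox yields at time `now` (epoch milliseconds).
function loot(c: Custody, now: number): Loot {
  switch (c.kind) {
    case "in-environment":
      // The live key works from anywhere, indefinitely.
      return { artifact: c.key, valuableOutside: true };
    case "trusted-proxy":
      // The token is honored only until it expires.
      return { artifact: c.token, valuableOutside: now < c.expiresAt };
    case "never-present":
      // The placeholder means nothing away from the proxy that binds it.
      return { artifact: c.placeholder, valuableOutside: false };
  }
}
```

Position two moves from true to false with the clock; position three is false from the start.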

5. The empty cell

Cross the secret-custody axis with where the sandbox runs and a cell is empty:

| Where it runs       | Secret in env | Secret never present |
|---------------------|---------------|----------------------|
| Cloud (vendor edge) | common        | Deno Sandbox         |
| Local / self-hosted | sbx, Seatbelt | (no production tool) |

The bottom-right cell — a locally-run agent sandbox that materializes secrets only at approved-host egress — is unoccupied. sbx runs locally and has a secret store, but the secret is injected into the box; it is present. Deno does the placeholder trick but only in Deno's cloud. For a LAN-only homelab running local coding agents, the option that protects a long-lived API key from prompt-injected exfiltration without shipping the workload to a vendor edge does not yet exist as a general product. It is buildable — coder/httpjail plus a local chokepoint is the whole recipe — and, as the next section argues, LiteLLM already ships this exact shape for one class of host while nothing ships it for arbitrary egress.

6. The egress proxy: one shape under four boundaries

Position two of the secret-custody axis — credential in a trusted proxy, injected at the boundary — is not a Cloudflare quirk. It is the convergent shape, and naming it is the point.

The pattern: an agent in a sandbox tries to git push. It has no SSH key and no token. The egress proxy intercepts the connection, looks up the identity the sandbox was started with, and injects the credential at the network boundary. From the agent's perspective it "just worked"; from the security perspective the credential never crossed into the code-execution environment. The full sequence — credentialless request → cred fetch from vault → token injection → proxied call → stripped response — is diagrammed in Cloudflare Agents Week, §The egress proxy.
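The five steps of that sequence map onto one function. The sketch is synchronous and HTTP-shaped for brevity (a real git push over SSH would need a different injection mechanism), and the Vault and Upstream signatures are assumptions of this example, not any vendor's interface.

```typescript
interface AgentRequest {
  sandboxId: string;
  url: string;
  headers: Record<string, string>;
}

interface UpstreamResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical collaborators: a credential lookup and the real network call.
type Vault = (sandboxId: string, host: string) => string | undefined;
type Upstream = (url: string, headers: Record<string, string>) => UpstreamResponse;

function proxyCall(req: AgentRequest, vault: Vault, upstream: Upstream): UpstreamResponse {
  // 1. The request arrives with no credential of its own.
  const host = new URL(req.url).hostname;

  // 2. Fetch the credential bound to this sandbox identity and destination.
  const credential = vault(req.sandboxId, host);
  if (credential === undefined) {
    return { status: 403, headers: {}, body: "no credential for this identity and host" };
  }

  // 3. Inject it at the boundary.
  const outbound = { ...req.headers, Authorization: `Bearer ${credential}` };

  // 4. Make the proxied call.
  const res = upstream(req.url, outbound);

  // 5. Strip anything that would echo the credential back into the sandbox.
  const safeHeaders: Record<string, string> = {};
  for (const [name, value] of Object.entries(res.headers)) {
    if (!value.includes(credential)) safeHeaders[name] = value;
  }
  const safeBody = res.body.split(credential).join("[redacted]");
  return { status: res.status, headers: safeHeaders, body: safeBody };
}
```

The credential exists only inside proxyCall: the agent's request never contains it, and the response is scrubbed before it crosses back.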

This inverts the standard model. The standard model hands the agent a credential and asks it to be careful; prompt injection turns "careful" into "exfiltrate." The egress-proxy model never lets the authority exist as an artifact the agent can read. On the credential-provenance axis it is strictly better. On operator control, LAN latency, and vendor lock-in the comparison runs the other way — which is exactly why the boundary's location is the whole question for a homelab.

Three boundaries, one shape:

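| Boundary owner            | Holds the credential | Injects at           |
|---------------------------|----------------------|----------------------|
| Cloudflare Sandbox egress | the Worker           | edge egress proxy    |
| Anthropic managed agents  | vault + proxy        | managed egress       |
| Deno Sandbox              | host process (value) | approved-host egress |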

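Deno is the limiting case. All three keep the credential's value on the proxy side of the boundary; what differs is what the sandbox holds in its place. With the other two the sandbox carries a live reference (a short-lived token, or the identity it was started with) that the proxy will redeem. With Deno it carries only a placeholder, so even the reference is inert anywhere but an approved host.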

6.1. LiteLLM is the local instance — for one class of host

mini, the homelab Mac Mini, already runs the pattern, narrowly. The LiteLLM proxy holds the model-provider keys; local agents call it with no key of their own, and the proxy injects the real credential toward api.anthropic.com / api.openai.com. That is egress-proxy credential injection for model endpoints: the same shape, scoped to one class of host. The guardrail layer on top (PII custom_code guards firing per request) is the inspect-and-modify hook Deno's docs promise and most proxies lack.

What LiteLLM does not do is the git push case: arbitrary-host egress with identity-bound credentials. That is the unoccupied cell from the previous section, restated precisely — mini has the proxy shape for model traffic and nothing for general egress. The economy side of the same problem — who may spend which credential, and how it is metered — is worked out in Agent Token Exchange.

7. Integration surface: wrap a harness, or be the SDK

The four isolations say nothing about who writes the code that runs inside the box. That is a separate axis, and it is the one that decides which sandbox a given harness can even use.

| Surface        | You supply     | The sandbox runs       | Systems                                 |
|----------------|----------------|------------------------|-----------------------------------------|
| Wrap-a-harness | an agent CLI   | the agent, unmodified  | Docker sbx                              |
| SDK primitive  | your own agent | code you wrote         | Cloudflare Sandbox, Deno Sandbox        |
| Build-in-page  | a web app      | browser-resident code  | browser / Co-do                         |
| OS profile     | any process    | a profiled command     | sandbox-exec, Anthropic sandbox-runtime |

Docker sbx is the only one that treats the harness as the unit. sbx run takes an existing agent — Claude Code, Gemini CLI, Copilot CLI, Codex, OpenCode, Kiro — and runs it unmodified inside the microVM. You write no integration code; the sandbox is harness-agnostic because it wraps the process, not the logic. This is why its default is --dangerously-skip-permissions: the agent it wraps is one that otherwise stops to ask, and the box is what makes "skip the prompts" safe.

Cloudflare and Deno are the inverse. There is no agent to wrap; you call getSandbox() or Sandbox.create() from code you wrote, and the sandbox is the execution tier of an agent you are building. The integration is total because you author both sides.

The distinction is not cosmetic. A wrap-a-harness sandbox adopts a new coding agent the day it ships, because it never modeled the agent's internals. An SDK sandbox cannot run Claude Code for you at all — it is a primitive, not a host. For mini, where the agent is Claude Code, only the wrap-a-harness surface (sbx) and the OS-profile surface (sandbox-exec, the sandbox-runtime shape) are candidates at all.

8. The product layer: pricing, gating, lock-in

Where the boundary runs is also where the bill is. The commercial shape is a property of the sandbox as much as its isolation model, and it correlates with the secret-custody axis in a way worth naming.

| System             | Cost shape | Gating | Lock-in vector |
|--------------------|------------|--------|----------------|
| Docker sbx         | free core; team network/FS policy via sales | none for core | low: wraps any harness, runs local |
| Cloudflare Sandbox | Workers Paid plan; bucket mount needs production deploy | Workers Paid | high: Durable Objects, R2, Workers |
| Deno Sandbox       | usage-based on Deno Deploy (≈$0.05/h CPU, $0.016/GB-h memory, $0.20/GiB-mo volume; Pro includes 40h / 1000 GB-h / 5 GiB) | Deno Deploy account | high: sandbox.deploy() binds to Deno Deploy |
| browser / Co-do    | free (web-platform APIs) | none | none, and no managed boundary either |
| sandbox-exec       | free (built into macOS) | none | none, but macOS-only and deprecated |

Two observations:

  • Lock-in tracks the boundary's owner. The systems with the strongest secret custody — Cloudflare's Worker proxy, Deno's placeholder — are exactly the ones whose credential boundary is the vendor's edge. You buy provenance with dependence. sbx and sandbox-exec carry no lock-in precisely because they offer no managed credential boundary; that is the empty cell from earlier, seen from the billing side.
  • Free-to-start is not free-to-run-unattended. sbx installs free, but the network and filesystem policies that make YOLO mode genuinely safe for a team are the paid, talk-to-sales tier. The core product is the hard box; the governance is the upsell. For a single-operator homelab the free core is the whole product; for a team it is the loss leader.
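For a sense of scale on the usage-based row, here is a worked estimate using the Deno rates quoted in the table. It is a rough sketch under stated assumptions: the rates are the approximate figures from this document, the 40h Pro allowance is read as CPU-hours, both vCPUs are treated as busy for the whole session, and memory is billed at the 4 GB ceiling. Real billing may meter differently.

```typescript
// Approximate rates from the pricing table (USD).
const RATES = { cpuPerHour: 0.05, memPerGbHour: 0.016, volumePerGibMonth: 0.2 };

interface Usage {
  cpuHours: number;
  memGbHours: number;
  volumeGibMonths: number;
}

function estimateCost(u: Usage): number {
  return (
    u.cpuHours * RATES.cpuPerHour +
    u.memGbHours * RATES.memPerGbHour +
    u.volumeGibMonths * RATES.volumePerGibMonth
  );
}

// One maximum-length session at the largest envelope:
// 2 vCPU x 0.5 h = 1 CPU-hour; 4 GB x 0.5 h = 2 GB-hours; no volume.
const perSession: Usage = { cpuHours: 1, memGbHours: 2, volumeGibMonths: 0 };
const sessionCost = estimateCost(perSession);

// How many such sessions fit inside the Pro allowance (40 CPU-hours, 1000 GB-hours).
const PRO_INCLUDED = { cpuHours: 40, memGbHours: 1000 };
const sessionsInPro = Math.floor(
  Math.min(
    PRO_INCLUDED.cpuHours / perSession.cpuHours,
    PRO_INCLUDED.memGbHours / perSession.memGbHours,
  ),
);
```

Under those assumptions a full 30-minute session costs roughly eight cents, and the allowance covers 40 of them, with CPU-hours as the binding limit rather than memory.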

9. What composes, what complects

| System             | Decomposed                              | Complected                             |
|--------------------|-----------------------------------------|----------------------------------------|
| Docker Sandboxes   | host from agent (microVM, workspace)    | autonomy with operator-set policy      |
| Cloudflare Sandbox | credential custody from execution       | lifecycle with Workers request model   |
| Deno Sandbox       | secret value from secret presence       | dev-to-prod with Deno Deploy           |
| Browser (Co-do)    | filesystem custody (FS Access chroot)   | display with network egress            |
| sandbox-exec       | filesystem and network into one profile | profile authored in dead language      |

The systems differ less in their compute boundary — microVMs are a commodity now — than in which of the four isolations they treat as structural and which they leave to policy. The reference axiom is satisfied by construction only where both (2) and (3) are structural; everywhere else, "sandboxed" means "isolated compute plus a config file you were trusted to write."

10. Field notes — installing sbx on mini

The experience-report part, since the claim should carry its own provenance.

  • brew install docker/tap/sbx failed on the first run: an HTTP/2 PROTOCOL_ERROR mid-download of the cask asset, compounded by a Homebrew auto-update that could not refresh formula.jws.json. The failure was transient, not a bad URL; the release asset resolves fine.
  • Re-running as a cask install with auto-update disabled (brew install --cask docker/tap/sbx, with HOMEBREW_NO_AUTO_UPDATE=1 set in the environment) succeeded. The binary links to /opt/homebrew/bin/sbx, with bash/fish/zsh completions.
  • sbx version → client v0.30.0; server "Unavailable (daemon not running — use 'sbx daemon start')." The CLI is a client to a daemon that brokers the microVMs; nothing runs until the daemon does, and Docker Desktop is explicitly not required.
  • The subcommand surface (create, run, exec, policy, secret, ports, template, kit) confirms the decomposition above: policy is where (3) lives, secret is where (4) lives, and both are opt-in. The default is the hard box with permissive policy — exactly the shape this document warns about.

For the homelab the fit is narrow: sbx is built for unattended agent runs, and mini's directive is that Gatus owns restarts and all telemetry goes to nexus. An sbx-wrapped agent would need its OTLP egress (192.168.86.100:4317) added to the network policy explicitly, or the hard box silently swallows the traces. That is the reference axiom biting in the friendly direction: egress control that is doing its job will block your observability too, until you declare it.

11. Related

  • Agent Memory Architectures — the companion decomposition; same compose-vs-complect lens, the reactive/proactive axis there mirrors the secret-custody axis here
  • Cloudflare Agents Week 2026 — §The egress proxy; the credential-injection sequence diagrammed in full
  • Agent Token Exchange — the economy side of credential management: who may spend which authority, and how it is metered
  • Agentic Systems Q4 2024 — MCE pattern; sandboxes are the execution tier of that architecture

12. Anchors