Agent Sandbox Architectures: Where the Boundary Sits, and Who Holds the Key
A decomposition exercise: Cloudflare Sandbox SDK, Docker Sandboxes (sbx), Deno Sandbox, the browser, and what the local homelab still lacks

1. What "sandbox" names
2. Reference axiom: both isolations, or neither
3. The field, decomposed
4. The axis the field is still building: secret custody
5. The empty cell
6. The egress proxy: one shape under four boundaries
- 6.1. LiteLLM is the local instance – for one class of host
7. Integration surface: wrap a harness, or be the SDK
8. The product layer: pricing, gating, lock-in
9. What composes, what complects
10. Field notes – installing sbx on mini
11. Related
12. Anchors

1. What "sandbox" names

"Sandbox" is one word for at least four distinct isolations.

1. Compute boundary:    the kernel/VM line untrusted code cannot cross.
2. Filesystem custody:  what the code can read and write.
3. Network egress:      what hosts the code can reach.
4. Secret custody:      credentials the code uses but must not exfiltrate.

A fifth concern – lifecycle (ephemeral, snapshot, volume) – is orthogonal and rides on top of the four.

The four have different threat models and different enforcement points. Most products that present a single "sandbox" guarantee are quietly conflating two or more of them, and the conflation is usually between (1) and the other three: vendors sell the compute boundary – "microVM isolation, hard security boundary" – and leave (3) and (4) to a config file the operator may never write. The interesting question is not "which sandbox is most isolated" but "where does each draw the four boundaries, and what gets complected as a result?"

The threat model has shifted, and the shift is the whole story. We are no longer sandboxing untrusted plugins. We are sandboxing untrusted code that we ourselves generated, that runs without review, that carries real credentials, and that an attacker can steer by prompt injection. Ryan Dahl's framing of Deno Sandbox names it exactly: "LLM-generated code, calling external APIs with real credentials, without human review. Sandboxing the compute isn't enough."

2. Reference axiom: both isolations, or neither

Anthropic's sandbox-runtime README states the invariant this document tests against:

"Both filesystem and network isolation are required for effective sandboxing. Without file isolation, a compromised process could exfiltrate SSH keys or other sensitive files. Without network isolation, a process could escape the sandbox and gain unrestricted network access."

Read it as a falsification condition. A system that isolates the filesystem but lets the process open arbitrary sockets has not built a sandbox; it has built a launchpad. A system that blocks egress but mounts ~/.ssh has built a different launchpad. The axiom is the conjunction, and the conjunction is where most of the field is still weak – not because compute isolation is hard (microVMs solved that) but because egress and secret custody are policy problems, and policy defaults to permissive.
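Read as code, the falsification condition is a conjunction of two checks. The following is a minimal sketch under stated assumptions: the SandboxPolicy shape, the SENSITIVE_PATHS list, and the use of "*" for unrestricted egress are all hypothetical and do not come from any of the products discussed.

```typescript
// Hypothetical policy shape; field names are illustrative, not from any product.
interface SandboxPolicy {
  /** Host paths mounted into the sandbox. */
  mounts: string[];
  /** Hosts the sandbox may reach; "*" stands for unrestricted egress. */
  allowedHosts: string[];
}

// Assumed list of paths whose presence inside the box defeats filesystem isolation.
// Exact string matching keeps the sketch short; a real check would resolve paths.
const SENSITIVE_PATHS = ["~", "~/.ssh", "~/.aws"];

function filesystemIsolated(policy: SandboxPolicy): boolean {
  return !policy.mounts.some((m) => SENSITIVE_PATHS.includes(m));
}

function networkIsolated(policy: SandboxPolicy): boolean {
  return !policy.allowedHosts.includes("*");
}

// The axiom is a conjunction: failing either check alone falsifies "sandboxed".
function satisfiesAxiom(policy: SandboxPolicy): boolean {
  return filesystemIsolated(policy) && networkIsolated(policy);
}

const openSockets: SandboxPolicy = { mounts: ["./workspace"], allowedHosts: ["*"] };
const mountedKeys: SandboxPolicy = { mounts: ["./workspace", "~/.ssh"], allowedHosts: [] };
const boxed: SandboxPolicy = { mounts: ["./workspace"], allowedHosts: ["api.openai.com"] };

console.log(satisfiesAxiom(openSockets), satisfiesAxiom(mountedKeys), satisfiesAxiom(boxed));
// prints: false false true
```

The first two policies are the two launchpads described above; only the third passes both clauses.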

The contract has three properties worth naming:

1. The boundary is structural, the policy is declared. The microVM or container gives you (1) for free. Whether you get (3) and (4) depends on configuration the operator supplies. Capability is structural; safety is declarative.
2. The chokepoint is the egress proxy. Every system that controls network egress does it at one point – an outbound proxy the sandbox cannot bypass. coder/httpjail and the Worker request proxy are the same shape: one place where policy is enforced, because one place is the only place you can enforce it.
3. Secrets are the asymmetric risk. A blocked egress is an annoyance; an exfiltrated long-lived credential is a breach that outlives the sandbox. Secret custody is the axis where the cost of getting it wrong is unbounded, and it is the axis the field has decomposed least.

That third property is the one this document organizes around.
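The chokepoint property comes down to one decision function that every outbound connection passes through. A minimal sketch of a default-deny host allowlist follows. The wildcard semantics (a leading "*." matches any subdomain but not the bare apex) are an assumption made for illustration, not the documented behavior of httpjail or any vendor proxy.

```typescript
// Assumed semantics: "*.example.com" matches subdomains only; other patterns match exactly.
function hostMatches(pattern: string, host: string): boolean {
  const p = pattern.toLowerCase();
  const h = host.toLowerCase();
  if (p.startsWith("*.")) {
    const suffix = p.slice(1); // ".example.com"
    return h.length > suffix.length && h.endsWith(suffix);
  }
  return p === h;
}

// Default-deny: a host is reachable only if some allowlist entry matches it.
function egressAllowed(allowlist: string[], host: string): boolean {
  return allowlist.some((pattern) => hostMatches(pattern, host));
}

const allowlist = ["api.openai.com", "*.anthropic.com"];

console.log(egressAllowed(allowlist, "api.anthropic.com"));      // true
console.log(egressAllowed(allowlist, "evil.com"));               // false
console.log(egressAllowed(allowlist, "anthropic.com.evil.com")); // false
console.log(egressAllowed([], "api.openai.com"));                // false
```

Anchoring the wildcard to the end of the hostname is what stops a lookalike such as anthropic.com.evil.com from slipping through.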

3. The field, decomposed

3.1. Docker Sandboxes (sbx) – microVM for local agents

The compute boundary is a dedicated microVM per agent; the host stays untouched. Filesystem custody is "only your project workspace mounted in" – a sharp, defensible default. Network and filesystem controls are "controls you define," which is the honest admission that (3) is policy, not structure. The headline is that --dangerously-skip-permissions is the default: YOLO mode is safe precisely because the box around it is hard.

What is decomposed: the host from the agent. Cleanly. The agent can install packages, rewrite configs, even spin up its own Docker containers, and none of it touches the host.

What is complected: autonomy with configured policy. The microVM is structural; the network allowlist is not. Out of the box you get a hard wall around a workspace with broadly open egress – fine for "run the test suite," but no defense at all for "don't let prompt-injected code POST my repo to evil.com." Secret custody exists (sbx secret), but it works by injecting the secret into the sandbox environment; the secret is present, and present means exfiltratable.

3.2. Cloudflare Sandbox SDK – credential custody as a separate tier

The compute boundary is a Container on Workers. Filesystem is a full Linux environment inside the container. Lifecycle composes cleanly with persistence: ephemeral by default, with R2/S3 buckets mounted as local filesystems for state that outlives the box.

The decomposition that matters is secret custody. Cloudflare's request-proxying pattern keeps credentials in the Worker and never in the sandbox: "A Worker proxy validates short-lived JWT tokens from the sandbox and injects real credentials at request time." The sandbox holds a capability (a short-lived JWT), not the credential. Compromise of the sandbox yields a token that expires, not a key that persists.
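The capability-versus-credential split can be sketched in a few lines. This is a simplified stand-in, not Cloudflare's implementation: an opaque token with an expiry takes the place of a signed JWT, and the CredentialProxy class and its method names are hypothetical.

```typescript
// Simplified stand-in for a signed JWT: an opaque token id with an expiry.
interface IssuedToken {
  sandboxId: string;
  expiresAtMs: number;
}

class CredentialProxy {
  private readonly realApiKey: string; // lives only in the proxy, never in the sandbox
  private readonly issued = new Map<string, IssuedToken>();
  private counter = 0;

  constructor(realApiKey: string) {
    this.realApiKey = realApiKey;
  }

  // Hand the sandbox a short-lived capability instead of the credential.
  issue(sandboxId: string, ttlMs: number, nowMs: number): string {
    this.counter += 1;
    const token = `tok-${sandboxId}-${this.counter}`;
    this.issued.set(token, { sandboxId, expiresAtMs: nowMs + ttlMs });
    return token;
  }

  // Returns upstream headers carrying the real credential,
  // or null when the token is unknown or has expired.
  authorize(token: string, nowMs: number): Record<string, string> | null {
    const entry = this.issued.get(token);
    if (entry === undefined || nowMs >= entry.expiresAtMs) return null;
    return { Authorization: `Bearer ${this.realApiKey}` };
  }
}

const proxy = new CredentialProxy("sk-real-key");
const issuedAt = 1_000_000;
const sandboxToken = proxy.issue("user-123", 60_000, issuedAt);

console.log(proxy.authorize(sandboxToken, issuedAt + 1_000) !== null);  // true
console.log(proxy.authorize(sandboxToken, issuedAt + 61_000) !== null); // false
```

A stolen sandboxToken stops working once its lifetime elapses, while the value passed to the constructor never appears on the sandbox side of the call.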

What is complected: the sandbox lifecycle with the Workers request model. The box is a Durable-Object-backed thing reached through getSandbox(env.Sandbox, 'user-123'); it is excellent if your control plane is already a Worker, and an impedance mismatch if it is a homelab launchd job on a Mac Mini.

3.3. Deno Sandbox – the secret that is never present

Sub-second-boot microVMs; volumes and snapshots for persistence; a 30-minute maximum lifetime. The technical envelope is modest (2 vCPU, up to 4 GB) and explicitly aimed at "AI agents executing code."

The move that occupies an empty cell: secrets never enter the environment. Code sees DENO_SECRET_PLACEHOLDER_b14043a2...; the real key materializes only when the sandbox makes an outbound request to a pre-approved host.

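import { Sandbox } from "@deno/sandbox";

// "await using" disposes the sandbox when the enclosing scope exits.
await using sandbox = await Sandbox.create({
  secrets: {
    // Code inside the sandbox sees only a placeholder for this variable.
    OPENAI_API_KEY: {
      hosts: ["api.openai.com"], // the real value is sent only to these hosts
      value: process.env.OPENAI_API_KEY,
    },
  },
  // Egress allowlist for the sandbox.
  allowNet: ["api.openai.com", "*.anthropic.com"],
});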

Prompt-injected code that exfiltrates the placeholder to evil.com exfiltrates a string with no value. The key binds to a destination, not to a process. This is the cleanest cut on the secret-custody axis in the field: it separates the secret's value from the secret's presence. The enforcement point is, again, an outbound proxy (Deno's docs cite coder/httpjail as the model), which is the same chokepoint everyone converges on.
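The binding of a key to a destination can be illustrated with a small header rewrite at the proxy. This is a sketch of the idea only, not Deno's implementation: the BoundSecret shape, the materialize function, and the placeholder string are hypothetical.

```typescript
interface BoundSecret {
  placeholder: string; // what code inside the sandbox sees
  value: string;       // what only the proxy knows
  hosts: string[];     // destinations the value may be sent to
}

// Rewrite outbound header values at the proxy. A placeholder is swapped for
// its real value only when the destination host is approved for that secret.
function materialize(
  headers: Record<string, string>,
  destHost: string,
  secrets: BoundSecret[],
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, raw] of Object.entries(headers)) {
    let rewritten = raw;
    for (const secret of secrets) {
      if (secret.hosts.includes(destHost)) {
        rewritten = rewritten.split(secret.placeholder).join(secret.value);
      }
    }
    out[key] = rewritten;
  }
  return out;
}

const boundSecrets: BoundSecret[] = [
  { placeholder: "PLACEHOLDER_demo", value: "sk-real", hosts: ["api.openai.com"] },
];
const sandboxHeaders = { Authorization: "Bearer PLACEHOLDER_demo" };

console.log(materialize(sandboxHeaders, "api.openai.com", boundSecrets)["Authorization"]);
// prints: Bearer sk-real
console.log(materialize(sandboxHeaders, "evil.com", boundSecrets)["Authorization"]);
// prints: Bearer PLACEHOLDER_demo
```

The same request sent to an unapproved host leaves the proxy carrying only the inert placeholder.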

What is complected: the production path. sandbox.deploy() is frictionless precisely because it ties the sandbox to Deno Deploy – sandbox and serverless host are the same vendor, so dev-to-prod is one call. That is composition for a Deno shop and lock-in for everyone else.

3.4. The browser – the 30-year-old sandbox, reused

Paul Kinlan's "the browser is the sandbox" tests the hypothesis that the origin sandbox – built to run hostile untrusted code the instant you tap a link – is good enough for agentic file work. The decomposition is honest about where it holds and where it does not.

Compute boundary: the origin sandbox. Mature, battle-tested, free.
Filesystem: the File System Access API gives a chroot-like handle to one user-selected directory – read/write within, no access to siblings or parents. Layer it with the origin-private filesystem and you can edit a copy while leaving the original intact.
Network egress: this is where the model breaks. "Unless you have an entirely client-side LLM, you can't" fully control egress – the data must leave to reach the model. CSP is "our friend" here, but it is partial. The classic aporia: an <img> whose URL encodes sensitive file contents is an expected web behavior and an exfiltration channel at once. You cannot separate "render an image" from "send data to the image's host" without giving up the first.

What is complected: display and egress. The browser's whole purpose is to fetch and render from anywhere; that purpose is an egress channel. The filesystem story is genuinely good; the network story is the reference axiom's second clause, unmet, and CSP only narrows it.
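How far CSP can narrow the channel is easiest to see in the policy string itself. The sketch below builds a deny-by-default Content-Security-Policy value; the directive names are standard, while the buildCsp helper and the model origin are hypothetical.

```typescript
// Build a Content-Security-Policy value that denies everything by default,
// then re-opens only same-origin assets and one model endpoint.
function buildCsp(modelOrigin: string): string {
  const directives: Record<string, string[]> = {
    "default-src": ["'none'"],
    "script-src": ["'self'"],
    "style-src": ["'self'"],
    "img-src": ["'self'"],        // no remote images: closes the <img> URL channel
    "connect-src": [modelOrigin], // the one egress the agent cannot work without
  };
  return Object.entries(directives)
    .map(([directive, sources]) => `${directive} ${sources.join(" ")}`)
    .join("; ");
}

console.log(buildCsp("https://api.example.com"));
// prints: default-src 'none'; script-src 'self'; style-src 'self'; img-src 'self'; connect-src https://api.example.com
```

Even this strict policy leaves connect-src open to the model host, so file contents can still leave through that one origin. That is the sense in which CSP narrows the egress problem without closing it.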

3.5. macOS sandbox-exec / Seatbelt – the profile as the contract

The substrate Anthropic's experiment and Co-do both gesture at: sandbox-exec applies a Scheme-syntax profile (Seatbelt) to a process, with allow/deny rules on file paths and network operations. The boundary is the host kernel, not a VM. It is the lightest-weight option and the one with the worst ergonomics – the profile language is undocumented, deprecated-but-present, and unforgiving. It decomposes (2) and (3) into one declarative profile; it complects nothing because it is nothing but policy. Its cost is that you author that policy in a language Apple stopped documenting a decade ago.

4. The axis the field is still building: secret custody

The companion decomposition of agent memory systems had a reactive/proactive axis with an empty substrate-initiated cell. Sandboxes have an analogous axis, and it is where the secret lives. Three positions, increasing in safety:

Position	Where the credential is	Compromise yields	Occupied by
Secret in environment	env var inside the box	the live, long-lived key	sbx, sandbox-exec, most local
Secret in trusted proxy	the Worker; box holds a JWT	a short-lived token	Cloudflare request proxying
Secret never present	placeholder; binds at egress	a useless string	Deno Sandbox

The distinction is what an attacker who fully owns the sandbox walks away with. Position one hands them the key. Position two hands them a token that expires. Position three hands them a placeholder bound to a host they do not control.
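The three outcomes can be written down as a small type and one question: is the loot still usable once it has left the box? This is an illustrative model only; the Loot type and usableOffBox function are hypothetical, and the token case ignores that a real token is normally accepted only by the proxy that issued it.

```typescript
// What an attacker who fully owns the sandbox walks away with.
type Loot =
  | { kind: "live-key" }                               // position one
  | { kind: "short-lived-token"; expiresAtMs: number } // position two
  | { kind: "placeholder" };                           // position three

// Is the loot still usable at time nowMs, outside the sandbox?
function usableOffBox(loot: Loot, nowMs: number): boolean {
  switch (loot.kind) {
    case "live-key":
      return true; // valid until someone notices and rotates it
    case "short-lived-token":
      return nowMs < loot.expiresAtMs;
    case "placeholder":
      return false; // the value only exists at approved-host egress
  }
}

const compromisedAt = 0;
const oneHourLater = compromisedAt + 60 * 60 * 1000;
const tokenLoot: Loot = { kind: "short-lived-token", expiresAtMs: compromisedAt + 5 * 60 * 1000 };

console.log(usableOffBox({ kind: "live-key" }, oneHourLater));    // true
console.log(usableOffBox(tokenLoot, oneHourLater));               // false
console.log(usableOffBox({ kind: "placeholder" }, compromisedAt)); // false
```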

5. The empty cell

Cross the secret-custody axis with where the sandbox runs and a cell is empty:

Where it runs	Secret in env	Secret never present
Cloud (vendor edge)	common	Deno Sandbox
Local / self-hosted	sbx, Seatbelt	(no production tool)

The bottom-right cell – a locally-run agent sandbox that materializes secrets only at approved-host egress – is unoccupied. sbx runs locally and has a secret store, but the secret is injected into the box; it is present. Deno does the placeholder trick but only in Deno's cloud. For a LAN-only homelab running local coding agents, the option that protects a long-lived API key from prompt-injected exfiltration without shipping the workload to a vendor edge does not yet exist as a general product. It is buildable – coder/httpjail plus a local chokepoint is the whole recipe – and, as the next section argues, LiteLLM already ships this exact shape for one class of host while nothing ships it for arbitrary egress.

6. The egress proxy: one shape under four boundaries

Position two of the secret-custody axis – credential in a trusted proxy, injected at the boundary – is not a Cloudflare quirk. It is the convergent shape, and naming it is the point.

The pattern: an agent in a sandbox tries to git push. It has no SSH key and no token. The egress proxy intercepts the connection, looks up the identity the sandbox was started with, and injects the credential at the network boundary. From the agent's perspective it "just worked"; from the security perspective the credential never crossed into the code-execution environment. The full sequence – credentialless request → cred fetch from vault → token injection → proxied call → stripped response – is diagrammed in Cloudflare Agents Week, §The egress proxy.
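The five steps of that sequence can be sketched as one function. Everything here is a hypothetical stand-in: the request is modeled as HTTP with a bearer token rather than SSH, the vault and upstream are injected callbacks, and "stripped response" is read as removing credential-bearing response headers, which is an interpretation rather than something the source spells out.

```typescript
interface OutboundRequest {
  sandboxId: string;
  host: string;
  headers: Record<string, string>;
}

interface UpstreamResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical collaborators, injected so the sketch stays self-contained.
type Vault = (identity: string, host: string) => string | undefined;
type Upstream = (req: OutboundRequest) => UpstreamResponse;

// Response headers withheld from the sandbox (an illustrative choice).
const STRIPPED_HEADERS = new Set(["authorization", "set-cookie"]);

function proxyEgress(
  req: OutboundRequest,
  identities: Map<string, string>,
  vault: Vault,
  upstream: Upstream,
): UpstreamResponse {
  // 1. Credentialless request: resolve the identity the sandbox was started with.
  const identity = identities.get(req.sandboxId);
  if (identity === undefined) return { status: 403, headers: {}, body: "unknown sandbox" };

  // 2. Credential fetch from the vault, scoped to identity and destination.
  const credential = vault(identity, req.host);
  if (credential === undefined) return { status: 403, headers: {}, body: "no credential for host" };

  // 3. Token injection at the boundary, then 4. the proxied call.
  const res = upstream({
    ...req,
    headers: { ...req.headers, Authorization: `Bearer ${credential}` },
  });

  // 5. Stripped response: drop headers that could carry credential material back.
  const safe: Record<string, string> = {};
  for (const [key, value] of Object.entries(res.headers)) {
    if (!STRIPPED_HEADERS.has(key.toLowerCase())) safe[key] = value;
  }
  return { status: res.status, headers: safe, body: res.body };
}

// Usage with in-memory stand-ins for the identity table, vault, and upstream.
const seenUpstream: OutboundRequest[] = [];
const pushResult = proxyEgress(
  { sandboxId: "sb-1", host: "github.com", headers: {} },
  new Map([["sb-1", "alice"]]),
  (identity, host) => (identity === "alice" && host === "github.com" ? "token-demo" : undefined),
  (req) => {
    seenUpstream.push(req);
    return { status: 200, headers: { "Set-Cookie": "session=abc", "Content-Type": "text/plain" }, body: "ok" };
  },
);

console.log(pushResult.status, Object.keys(pushResult.headers));
```

The upstream callback sees the injected Authorization header, while the sandbox receives only the status, the body, and the headers that survived stripping.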

This inverts the standard model. The standard model hands the agent a credential and asks it to be careful; prompt injection turns "careful" into "exfiltrate." The egress-proxy model never lets the authority exist as an artifact the agent can read. On the credential-provenance axis it is strictly better. On operator control, LAN latency, and vendor lock-in the comparison runs the other way – which is exactly why the boundary's location is the whole question for a homelab.

Three vendor boundaries, one shape (the fourth, local boundary is 6.1):

Boundary owner	Holds the credential	Injects at
Cloudflare Sandbox egress	the Worker	edge egress proxy
Anthropic managed agents	vault + proxy	managed egress
Deno Sandbox	host process (value)	approved-host egress

Deno is the limiting case: the proxy holds the value and the sandbox holds only a placeholder, so even the reference is inert. The other two hold a real credential in the proxy and inject it; Deno keeps the value out of the sandbox's address space entirely.

6.1. LiteLLM is the local instance – for one class of host

mini already runs the pattern, narrowly. The LiteLLM proxy holds the model-provider keys; local agents call it with no key of their own, and the proxy injects the real credential toward api.anthropic.com / api.openai.com. That is egress-proxy credential injection for model endpoints – the same shape, scoped to one class of host. The guardrail layer on top (PII custom_code guards firing per request) is the inspect-and-modify hook Deno's docs promise and most proxies lack.

What LiteLLM does not do is the git push case: arbitrary-host egress with identity-bound credentials. That is the unoccupied cell from the previous section, restated precisely – mini has the proxy shape for model traffic and nothing for general egress. The economy side of the same problem – who may spend which credential, and how it is metered – is worked out in Agent Token Exchange.
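From the agent's side, the model-traffic half of this looks like an ordinary OpenAI-compatible request that carries no credential. The sketch below only builds the request; the base URL (port 4000 is assumed here), the model alias, and the buildChatRequest helper are illustrative assumptions, and a proxy configured to require its own keys would need an Authorization header that this example leaves out.

```typescript
interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a chat-completions request aimed at a local proxy.
// The provider key is absent by design: the proxy adds it on the way out.
function buildChatRequest(proxyBase: string, model: string, prompt: string): ChatRequest {
  return {
    url: `${proxyBase}/v1/chat/completions`,
    method: "POST",
    headers: { "Content-Type": "application/json" }, // deliberately no Authorization header
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}

// Hypothetical base URL and model alias.
const chatRequest = buildChatRequest("http://localhost:4000", "local-alias", "hello");
console.log(chatRequest.url, Object.keys(chatRequest.headers));
```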

7. Integration surface: wrap a harness, or be the SDK

The four isolations say nothing about who writes the code that runs inside the box. That is a separate axis, and it is the one that decides which sandbox a given harness can even use.

Surface	You supply	The sandbox runs	Systems
Wrap-a-harness	an agent CLI	the agent, unmodified	Docker sbx
SDK primitive	your own agent	code you wrote	Cloudflare Sandbox, Deno Sandbox
Build-in-page	a web app	browser-resident code	browser / Co-do
OS profile	any process	a profiled command	sandbox-exec, Anthropic sandbox-runtime

Docker sbx is the only one that treats the harness as the unit. sbx run takes an existing agent – Claude Code, Gemini CLI, Copilot CLI, Codex, OpenCode, Kiro – and runs it unmodified inside the microVM. You write no integration code; the sandbox is harness-agnostic because it wraps the process, not the logic. This is why its default is --dangerously-skip-permissions: the agent it wraps is one that otherwise stops to ask, and the box is what makes "skip the prompts" safe.

Cloudflare and Deno are the inverse. There is no agent to wrap; you call getSandbox() or Sandbox.create() from code you wrote, and the sandbox is the execution tier of an agent you are building. The integration is total because you author both sides.

The distinction is not cosmetic. A wrap-a-harness sandbox adopts a new coding agent the day it ships, because it never modeled the agent's internals. An SDK sandbox cannot run Claude Code for you at all – it is a primitive, not a host. For mini, where the agent is Claude Code, only the wrap-a-harness surface (sbx) and the OS-profile surface (sandbox-exec, the sandbox-runtime shape) are candidates at all.

8. The product layer: pricing, gating, lock-in

Where the boundary runs is also where the bill is. The commercial shape is a property of the sandbox as much as its isolation model, and it correlates with the secret-custody axis in a way worth naming.

System	Cost shape	Gating	Lock-in vector
Docker sbx	free core; team network/FS policy via sales	none for core	low – wraps any harness, runs local
Cloudflare Sandbox	Workers Paid plan; bucket mount needs production deploy	Workers Paid	high – Durable Objects, R2, Workers
Deno Sandbox	usage-based on Deno Deploy (≈$0.05/h CPU, $0.016/GB-h memory, $0.20/GiB-mo volume; Pro includes 40 CPU-h / 1000 GB-h / 5 GiB)	Deno Deploy account	high – sandbox.deploy() binds to Deno Deploy
browser / Co-do	free (web-platform APIs)	none	none – and no managed boundary either
sandbox-exec	free (built into macOS)	none	none – but macOS-only, deprecated

Two observations:

Lock-in tracks the boundary's owner. The systems with the strongest secret custody – Cloudflare's Worker proxy, Deno's placeholder – are exactly the ones whose credential boundary is the vendor's edge. You buy provenance with dependence. sbx and sandbox-exec carry no lock-in precisely because they offer no managed credential boundary; that is the empty cell from earlier, seen from the billing side.
Free-to-start is not free-to-run-unattended. sbx installs free, but the network and filesystem policies that make YOLO mode genuinely safe for a team are the paid, talk-to-sales tier. The core product is the hard box; the governance is the upsell. For a single-operator homelab the free core is the whole product; for a team it is the loss leader.
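To make the usage-based row of the table concrete, here is a worked overage calculation using the Deno Sandbox rates quoted above. The billing model is an assumption: the sketch treats each meter as linear and nets the Pro allowances against usage before pricing, which the source does not confirm.

```typescript
interface Usage {
  cpuHours: number;
  memoryGbHours: number;
  volumeGiBMonths: number;
}

// Approximate rates as quoted in the table (USD).
const RATES = { cpuPerHour: 0.05, memoryPerGbHour: 0.016, volumePerGiBMonth: 0.2 };

// Pro allowances as quoted in the table.
const PRO_INCLUDED: Usage = { cpuHours: 40, memoryGbHours: 1000, volumeGiBMonths: 5 };

// Assumed model: subtract the included quantity per meter, never below zero,
// then price what remains linearly.
function overageUsd(used: Usage, included: Usage = PRO_INCLUDED): number {
  const over = (u: number, i: number): number => Math.max(0, u - i);
  return (
    over(used.cpuHours, included.cpuHours) * RATES.cpuPerHour +
    over(used.memoryGbHours, included.memoryGbHours) * RATES.memoryPerGbHour +
    over(used.volumeGiBMonths, included.volumeGiBMonths) * RATES.volumePerGiBMonth
  );
}

// 100 CPU-h, 1500 GB-h, 10 GiB-mo:
// 60 * 0.05 + 500 * 0.016 + 5 * 0.20 = 3 + 8 + 1
const monthly: Usage = { cpuHours: 100, memoryGbHours: 1500, volumeGiBMonths: 10 };
console.log(overageUsd(monthly).toFixed(2)); // prints: 12.00
```

Usage that stays inside all three allowances prices at zero under this model.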

9. What composes, what complects

System	Decomposed	Complected
Docker Sandboxes	host from agent (microVM, workspace)	autonomy with operator-set policy
Cloudflare Sandbox	credential custody from execution	lifecycle with Workers request model
Deno Sandbox	secret value from secret presence	dev-to-prod with Deno Deploy
Browser (Co-do)	filesystem custody (FS Access chroot)	display with network egress
sandbox-exec	filesystem and network into one profile	profile authored in dead language

The systems differ less in their compute boundary – microVMs are a commodity now – than in which of the four isolations they treat as structural and which they leave to policy. The reference axiom is satisfied by construction only where both (2) and (3) are structural; everywhere else, "sandboxed" means "isolated compute plus a config file you were trusted to write."

10. Field notes – installing sbx on mini

The experience-report part, since the claim should carry its own provenance.

brew install docker/tap/sbx failed first run: an HTTP/2 PROTOCOL_ERROR mid-download of the cask asset, compounded by a Homebrew auto-update that could not refresh formula.jws.json. The failure was transient, not a bad URL – the release asset resolves fine.
Re-running with auto-update disabled (for example, HOMEBREW_NO_AUTO_UPDATE=1 brew install --cask docker/tap/sbx) succeeded. The binary links to /opt/homebrew/bin/sbx, with bash/fish/zsh completions.
sbx version → client v0.30.0; server "Unavailable (daemon not running – use 'sbx daemon start')." The CLI is a client to a daemon that brokers the microVMs; nothing runs until the daemon does, and Docker Desktop is explicitly not required.
The subcommand surface (create, run, exec, policy, secret, ports, template, kit) confirms the decomposition above: policy is where (3) lives, secret is where (4) lives, and both are opt-in. The default is the hard box with permissive policy – exactly the shape this document warns about.

For the homelab the fit is narrow: sbx is built for unattended agent runs, and mini's directive is that Gatus owns restarts and all telemetry goes to nexus. An sbx-wrapped agent would need its OTLP egress (192.168.86.100:4317) added to the network policy explicitly, or the hard box silently swallows the traces. That is the reference axiom biting in the friendly direction: egress control that is doing its job will block your observability too, until you declare it.
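One way to catch the swallowed-traces case before it happens is a preflight diff between the endpoints a workload needs and the endpoints the policy declares. The sketch below is generic: the Endpoint shape and missingEgress function are hypothetical and do not reflect the sbx policy format.

```typescript
interface Endpoint {
  host: string;
  port: number;
}

// Return every required endpoint that the declared policy does not cover.
function missingEgress(required: Endpoint[], allowed: Endpoint[]): Endpoint[] {
  return required.filter(
    (r) => !allowed.some((a) => a.host === r.host && a.port === r.port),
  );
}

const otlpCollector: Endpoint = { host: "192.168.86.100", port: 4317 };
const modelApi: Endpoint = { host: "api.anthropic.com", port: 443 };

// A policy written with only the model endpoint in mind.
const declaredPolicy: Endpoint[] = [modelApi];

console.log(missingEgress([modelApi, otlpCollector], declaredPolicy));
// lists the OTLP collector as undeclared

console.log(missingEgress([modelApi, otlpCollector], [...declaredPolicy, otlpCollector]).length);
// prints: 0
```

Running such a check at sandbox start turns a silent telemetry gap into a visible configuration error.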

11. Related

Agent Memory Architectures – the companion decomposition; same compose-vs-complect lens, the reactive/proactive axis there mirrors the secret-custody axis here
Cloudflare Agents Week 2026 – §The egress proxy; the credential-injection sequence diagrammed in full
Agent Token Exchange – the economy side of credential management: who may spend which authority, and how it is metered
Agentic Systems Q4 2024 – MCE pattern; sandboxes are the execution tier of that architecture

12. Anchors

Anthropic. sandbox-runtime. https://github.com/anthropics/sandbox-runtime – the two-isolation axiom (filesystem ∧ network)
Cloudflare. Sandbox SDK. https://developers.cloudflare.com/sandbox/ – container on Workers; request proxying keeps credentials in the Worker
Docker. Docker Sandboxes. https://www.docker.com/products/docker-sandboxes/ – microVM per agent; brew install docker/tap/sbx
Dahl, R. Introducing Deno Sandbox. Deno blog, 3 Feb 2026. https://deno.com/blog/introducing-deno-sandbox – secret placeholders, allowNet, sub-second boot
Kinlan, P. the browser is the sandbox. AI Focus, 25 Jan 2026. https://aifoc.us/the-browser-is-the-sandbox/ – File System Access chroot, CSP as partial egress control
Willison, S. sandboxing. https://simonwillison.net/search/?q=sandbox – running coverage of the Anthropic sandbox experiment and sandbox-exec
coder. httpjail. https://github.com/coder/httpjail – the egress-proxy chokepoint both Deno and Cloudflare converge on
Apple. sandbox-exec(1) / Seatbelt – host-kernel profile sandboxing on macOS
