gVisor vs Firecracker: Which Isolation for Agents?
Once you accept that a plain container isn't a hard boundary for untrusted code, two names show up over and over: gVisor and Firecracker. Both are legitimate, production-proven answers to "how do I run someone else's code without betting the host on it," and both are genuinely stronger than namespaces-and-cgroups. But they get there by very different routes — one rewrites the kernel in user space, the other boots a real one inside a VM — and that difference is exactly what you're choosing between. This is the honest head-to-head, with no strawman on either side.
If you want the broader map first, this comparison sits inside a larger ladder — see /blog/code-isolation-hierarchy for the full walk from bare process to confidential VM. Here we zoom in on the two rungs people most often agonize over for AI agents.
gVisor: a kernel rewritten in user space
gVisor (the runtime is called runsc) is a user-space kernel written in Go. Its core component, the Sentry, re-implements a large fraction of the Linux syscall ABI itself. When the sandboxed workload makes a syscall, gVisor intercepts it and services it inside the Sentry rather than passing it straight to the host kernel. The Sentry still has to touch the host eventually, but it does so through a small, tightly seccomp-filtered set of host syscalls — so instead of exposing the full Linux ABI (300-plus syscalls) to the host kernel, the guest mostly talks to gVisor, and only gVisor talks to the host, through a deliberately narrow door.
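To make the interception model concrete, here is a toy sketch in Python. It is not gVisor's code or API: the class name `ToySentry`, the handler names, and the host allowlist are all invented for illustration. It only shows the shape described above: guest calls are answered from an in-process table, unimplemented ones fail with ENOSYS, and anything that reaches the host has to pass a short allowlist.

```python
import errno

# Hypothetical allowlist: the only "host syscalls" this toy kernel may issue.
HOST_ALLOWLIST = {"read", "write"}


class ToySentry:
    """Toy user-space kernel: serves guest calls itself instead of forwarding them."""

    def __init__(self):
        self.files = {}       # filesystem state owned by the toy kernel
        self.host_calls = []  # record of calls that actually reached the host
        self.handlers = {
            "getpid": self._getpid,
            "write_file": self._write_file,
            "read_file": self._read_file,
        }

    def _host(self, name):
        # Every host interaction goes through this one narrow door.
        if name not in HOST_ALLOWLIST:
            raise PermissionError(f"host syscall {name!r} blocked by filter")
        self.host_calls.append(name)

    def _getpid(self):
        return 1  # answered entirely in user space; the host never sees it

    def _write_file(self, path, data):
        self.files[path] = data
        self._host("write")
        return len(data)

    def _read_file(self, path):
        self._host("read")
        return self.files[path]

    def syscall(self, name, *args):
        handler = self.handlers.get(name)
        if handler is None:
            # Unimplemented surface: the guest gets ENOSYS, the host gets nothing.
            return -errno.ENOSYS
        return handler(*args)


sentry = ToySentry()
print(sentry.syscall("getpid"))
print(sentry.syscall("write_file", "/tmp/a", b"hi"))
print(sentry.syscall("some_exotic_call") == -errno.ENOSYS)
print(sentry.host_calls)
```

The point of the sketch is the asymmetry: three guest-facing calls were made, but only one host-facing call was recorded, and the unknown call never left the toy kernel.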
The win is real: you shrink the host-kernel attack surface dramatically without booting a second operating system. There's no guest kernel to manage, no virtio device model, and the startup cost is much closer to a container than to a VM. That's why gVisor is a sensible default for platforms that want stronger-than-container isolation while keeping a container-shaped operational model.
The cost lands in two places, and both are workload-dependent. First, compatibility: gVisor re-implements the kernel, and it doesn't implement all of it. Some syscalls and edge-case behaviors are unimplemented or differ subtly, so certain workloads — unusual syscalls, exotic filesystem semantics, some low-level networking or kernel-feature-dependent programs — can misbehave or fail outright. Most ordinary code runs fine; the long tail is where surprises live. Second, performance: because every syscall is intercepted and serviced in user space, syscall- and I/O-heavy workloads pay measurable overhead. CPU-bound code that rarely calls into the kernel barely notices; a program hammering the filesystem or doing tons of small reads and writes feels it. Verify the specifics against gVisor's own docs and your own workload — the trade-off genuinely depends on what your code does.
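One way to check how much the performance side matters for your code is to time a syscall-heavy loop and a CPU-bound loop inside each sandbox and compare the results. The sketch below uses only the Python standard library. The loop sizes are arbitrary assumptions, and absolute numbers will vary by host, so treat it as a starting point rather than a benchmark suite.

```python
import os
import tempfile
import time


def time_it(fn, *args):
    """Return wall-clock seconds for one call of fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def cpu_bound(n):
    # Pure arithmetic: almost no kernel involvement.
    total = 0
    for i in range(n):
        total += i * i
    return total


def syscall_heavy(n):
    # Many one-byte writes and reads: each os.write/os.read is a real syscall.
    fd, path = tempfile.mkstemp()
    try:
        for _ in range(n):
            os.write(fd, b"x")
        os.lseek(fd, 0, os.SEEK_SET)
        for _ in range(n):
            os.read(fd, 1)
    finally:
        os.close(fd)
        os.unlink(path)


def run_benchmark(n_cpu=200_000, n_io=20_000):
    cpu_s = time_it(cpu_bound, n_cpu)
    io_s = time_it(syscall_heavy, n_io)
    return {
        "cpu_seconds": cpu_s,
        "io_seconds": io_s,
        # Roughly 2 * n_io syscalls (writes plus reads), ignoring setup calls.
        "io_us_per_syscall": io_s / (2 * n_io) * 1e6,
    }


if __name__ == "__main__":
    print(run_benchmark())
```

Run the same script under a plain container, under gVisor, and inside a microVM. If `cpu_seconds` stays flat while `io_us_per_syscall` climbs, your workload is on the side of the trade-off that pays for interception.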
Firecracker: a real microVM with its own kernel
Firecracker takes the other road. It's a Virtual Machine Monitor that boots a genuine guest kernel, isolated by CPU hardware virtualization (Intel VT-x / AMD-V, exposed to Linux via KVM). The workload runs against a real Linux kernel of its own, not a re-implementation, and the only way out to the host is through a minimal device model: virtio-net, virtio-block, virtio-vsock, a serial console, and a minimal keyboard controller used only to reset the VM. Firecracker is written in Rust and is normally launched through its jailer, which chroots the process, places it in cgroups and namespaces, and drops privileges; Firecracker applies its own seccomp filters on top as defense-in-depth.
Because the guest runs a real kernel, compatibility is essentially that of a normal Linux box: if it runs on Linux, it runs in the microVM, including programs with native dependencies, unusual syscalls, or kernel features gVisor might not implement. The boundary the host has to defend is no longer the full syscall ABI; it's the VMM plus the KVM ioctl interface plus that small virtio surface. That surface is smaller and heavily audited, which is why AWS Lambda runs untrusted multi-tenant code on Firecracker. For the deeper mechanics of how a microVM boots and where its boundary ends, see /blog/what-is-a-microvm.
The historical knock on the VM rung was startup cost, but that's a solved engineering problem, not a law of physics. PandaStack restores a baked Firecracker snapshot on every create at a p50 of 179ms (about 203ms p99; roughly 49ms of that is the snapshot restore itself). Only the very first cold boot of a template takes around 3s, and a same-host copy-on-write fork lands in roughly 400–750ms. A real guest kernel for the price of a couple hundred milliseconds is the trade Firecracker makes.
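For a sense of how small that surface is, a Firecracker microVM can be described by a short JSON document. The sketch below builds one in Python. The key names follow Firecracker's configuration-file format as I understand it; the file paths, vCPU count, and memory size are placeholder assumptions, so check the field names against the Firecracker docs for the version you run.

```python
import json


def build_firecracker_config(kernel_path, rootfs_path, vcpus=1, mem_mib=256):
    """Build a minimal microVM description: one kernel, one root block device."""
    return {
        "boot-source": {
            "kernel_image_path": kernel_path,
            # Commonly used boot arguments from Firecracker's getting-started guide.
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": rootfs_path,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


# Placeholder paths: substitute a real uncompressed kernel and rootfs image.
config = build_firecracker_config("/path/to/vmlinux", "/path/to/rootfs.ext4")
print(json.dumps(config, indent=2))
```

Written to a file, a document like this is what you would hand to the `firecracker` binary through its `--config-file` option. Everything the guest can reach on the host is named in it, which is a useful property when you are reviewing a boundary.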
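If you want to check latency figures like these on your own infrastructure, a percentile harness is only a few lines. The sketch below uses a nearest-rank percentile and accepts any zero-argument callable. `fake_create` is a stand-in invented for this example, not a PandaStack call, and the run count is an arbitrary assumption.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_ms(create_fn, runs=50):
    """Call create_fn `runs` times and return each duration in milliseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        create_fn()
        durations.append((time.perf_counter() - start) * 1000)
    return durations


def fake_create():
    # Placeholder workload; swap in your real sandbox-create call.
    time.sleep(0.001)


samples = measure_ms(fake_create, runs=20)
print(f"p50={percentile(samples, 50):.1f}ms p99={percentile(samples, 99):.1f}ms")
```

Measure creates and forks separately, and keep the first cold boot of a template out of the sample, since it is a different code path from a snapshot restore.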
Side by side
- Boundary type — gVisor: software syscall interception by a user-space kernel. Firecracker: hardware virtualization (KVM) around a real guest kernel.
- What the host exposes — gVisor: a narrow, seccomp-filtered set of host syscalls made by the Sentry. Firecracker: the VMM + KVM ioctl interface + a minimal virtio device model.
- Guest kernel — gVisor: none; the Sentry re-implements Linux syscalls. Firecracker: a full, real guest kernel per VM.
- Compatibility — gVisor: high for ordinary code, but some syscalls are unimplemented and syscall-heavy or exotic workloads can break (verify against gVisor's docs). Firecracker: very high; it's a real Linux kernel, so if it runs on Linux it runs in the microVM.
- Performance profile — gVisor: near-container for CPU-bound code; measurable overhead on syscall- and I/O-heavy workloads. Firecracker: VM-class with a small per-guest memory cost; no per-syscall interception tax.
- Startup — gVisor: closer to a container, no guest-kernel boot. Firecracker: a guest-kernel boot, but snapshot-restore makes per-create cost sub-second in practice.
- Who runs it — gVisor: Modal uses it per its own docs; Google Cloud Run and GKE Sandbox use it. Firecracker: AWS Lambda, Fargate, E2B, Vercel, and PandaStack.
Pick gVisor when / pick Firecracker when
Pick gVisor when you want stronger-than-container isolation with a container-shaped operational model, your workloads are well-behaved (no exotic syscalls, not pathologically I/O-bound), and you're comfortable validating compatibility against its implemented surface. It's a legitimate, real step up from a plain container — not a half-measure — and for a lot of platform code it's exactly the right amount of boundary at low operational weight.
Pick Firecracker when the code is arbitrary and untrusted, compatibility has to be a non-issue (you can't have a tool silently fail because a syscall isn't implemented), and you want the boundary to be hardware virtualization rather than a software interceptor. AI agents are the textbook case: a model can decide to run literally anything — a Python script with native deps, a weird system call, a process that pounds the filesystem — and you want a real kernel under it and a small audited host surface around it. The compatibility ceiling of "it's just Linux" matters most precisely when you can't predict the workload.
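The two paragraphs above reduce to a short rubric. This sketch encodes it as a function. The field names and the order of the checks are my own simplification of the criteria in this section, not an official decision procedure, so adjust them to your threat model.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    untrusted_arbitrary_code: bool   # e.g. model-generated tool calls
    needs_full_linux_compat: bool    # native deps, unusual syscalls
    syscall_or_io_heavy: bool        # many small reads/writes, filesystem churn
    wants_container_ops_model: bool  # prefers container-shaped tooling


def pick_isolation(w: Workload) -> str:
    # Unpredictable code or hard compatibility needs push toward a real kernel.
    if w.untrusted_arbitrary_code or w.needs_full_linux_compat:
        return "firecracker"
    # Interception overhead lands on syscall- and I/O-heavy paths.
    if w.syscall_or_io_heavy:
        return "firecracker"
    # Well-behaved workload that wants container ergonomics.
    if w.wants_container_ops_model:
        return "gvisor"
    return "either"


agent = Workload(True, True, True, False)
platform_job = Workload(False, False, False, True)
print(pick_isolation(agent), pick_isolation(platform_job))
```

An AI agent sets the first field to true almost by definition, which is why the rubric resolves to the microVM side for that case before any other question is asked.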
For agent and multi-tenant code-execution use cases specifically, /blog/secure-code-execution-for-ai-agents and /blog/best-code-execution-sandboxes go deeper on the decision and the platforms that have made each choice.
What the microVM path looks like in practice
If you land on the Firecracker side, the ergonomics shouldn't make you feel the VM. PandaStack is an Apache-2.0 open-source platform built on Firecracker — each sandbox is its own microVM with its own guest kernel running under the jailer, created via snapshot-restore and forkable copy-on-write. One call gives you a hardware-isolated environment:
from pandastack import Sandbox

# One Firecracker microVM with its own guest kernel under KVM, created via
# snapshot-restore (~179ms p50). Real Linux inside, so compatibility is
# "if it runs on Linux, it runs here."
with Sandbox.create(
    template="code-interpreter",  # python + node scientific stack
    ttl_seconds=300,  # reaped automatically if abandoned
    metadata={"task": "agent-tool-call"},
) as sb:
    result = sb.exec("python3 -c 'import statistics; print(statistics.mean([2,4,6,8]))'")
    print(result.stdout, result.exit_code)
# Context manager destroys the microVM here; nothing survives to the next task.

The SDK reads PANDASTACK_API_KEY from the environment; the same flow exists in the TypeScript SDK and CLI, and because the core is Apache-2.0 you can self-host the whole thing on your own Linux KVM hosts and keep the boundary on infrastructure you control.
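If you do self-host, each machine needs KVM available to the process that launches Firecracker. Here is a minimal preflight sketch, assuming the conventional `/dev/kvm` device path. The function name and the status strings are made up for this example.

```python
import os


def kvm_status(device="/dev/kvm"):
    """Report whether the KVM device exists and is readable and writable by this process."""
    if not os.path.exists(device):
        # Typically: KVM module not loaded, or virtualization not exposed to this host.
        return "missing"
    if not os.access(device, os.R_OK | os.W_OK):
        # Device exists but the current user lacks read/write permission.
        return "no-access"
    return "ok"


print(kvm_status())
```

Anything other than `ok` means a microVM will not start on that host, so it is worth running a check like this before provisioning rather than debugging a failed boot later.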
The honest bottom line
gVisor and Firecracker are both real answers, and the choice is a genuine trade, not a winner-takes-all. gVisor shrinks the host-kernel surface with a clever user-space kernel and a near-container footprint, paying for it in a compatibility long tail and syscall-path overhead. Firecracker gives you a real guest kernel behind a hardware boundary with very high compatibility, paying for it in a small memory cost and a guest boot that snapshot-restore makes nearly free per create. For arbitrary, unpredictable, model-generated code — the AI-agent default — the hardware boundary and "it's just Linux" compatibility tip it toward Firecracker. For well-characterized workloads where a container UX matters and the syscall surface is known, gVisor earns its place. Match the boundary to your threat and your workload, and verify the workload-specific claims about either one against its own documentation.
Frequently asked questions
What's the core difference between gVisor and Firecracker?
gVisor is a user-space kernel (the Sentry, written in Go) that intercepts an application's syscalls and re-implements them itself, shrinking the host-kernel attack surface without booting a second OS — a software isolation boundary. Firecracker is a Virtual Machine Monitor that boots a real guest kernel isolated by CPU hardware virtualization (KVM) — a hardware boundary. gVisor sits between containers and microVMs on the isolation ladder; Firecracker is the microVM rung.
Is gVisor a virtual machine?
No. gVisor keeps a process model and re-implements Linux syscalls in user space; it does not boot a guest kernel. It can run on a KVM platform for faster, safer address-space switching, but even then it borrows CPU virtualization extensions rather than running a hardware VM. Firecracker, by contrast, boots a genuine guest kernel under KVM. Calling gVisor a VM is a common and consequential mistake in security reviews.
Which has better compatibility for arbitrary code?
Firecracker. Because the guest runs a real Linux kernel, compatibility is essentially that of a normal Linux machine — if it runs on Linux, it runs in the microVM, including native dependencies and unusual syscalls. gVisor re-implements the kernel and doesn't implement all of it, so some syscalls are unimplemented or behave differently and syscall-heavy or exotic workloads can break. Most ordinary code runs fine on gVisor; verify the specifics against gVisor's own docs and your workload.
Who uses gVisor and who uses Firecracker?
Per its own documentation, Modal uses gVisor; Google's Cloud Run and GKE Sandbox also use it. Firecracker powers AWS Lambda and Fargate, and is the isolation primitive behind E2B, Vercel, and PandaStack. Both are production-proven; the choice reflects whether a platform prioritizes a container-shaped model with a smaller host-kernel surface (gVisor) or a hardware boundary with a real guest kernel and very high compatibility (Firecracker).
Which should I use for an AI agent that runs untrusted code?
For arbitrary, model-generated code the microVM rung (Firecracker) is the safer default: an agent can decide to run anything, so you want a real guest kernel under it, a small audited host surface around it, and compatibility that doesn't silently fail when a syscall isn't implemented. gVisor is a legitimate stronger-than-container option for well-behaved, well-characterized workloads. Either way, neither boundary is unbreakable — pair it with network egress controls, resource limits, and short-lived environments.