Firecracker Use Cases: Who Runs MicroVMs and Why
Firecracker is a Virtual Machine Monitor that runs microVMs — minimal hardware-isolated guests that boot in milliseconds. That's the what. The more useful question is who runs it and why, because the answer tells you whether it's the right tool for your problem. The honest pattern: Firecracker shows up wherever someone has to run code they don't fully trust, at scale, with hard isolation between tenants — and it's overkill almost everywhere else. This post walks the real adopters, the workload shapes that justify a microVM, and (the part nobody likes to write) when reaching for one is a mistake.
I'm Ajay — I built PandaStack, an open-source platform on Firecracker — so I have a horse in this race. I'll keep the numbers honest and the competitors qualitative. If you want the layer-by-layer mechanics first, the companion pieces on what a microVM is ("/blog/what-is-a-microvm") and Firecracker vs Docker ("/blog/firecracker-vs-docker") cover that ground; this one is about the use cases.
The origin story: AWS Lambda and Fargate
Firecracker wasn't a research curiosity that found a job later — it was built at AWS for a job it already had. Lambda runs untrusted, customer-supplied functions from millions of accounts on shared fleets. The original design used one EC2 instance per customer for isolation, which is safe and ruinously wasteful: a function that runs for 80ms doesn't justify a whole VM sitting idle. AWS needed VM-grade isolation with container-grade density and startup. Containers alone share the host kernel, which is not a boundary you want between two strangers' code. Full VMs were too heavy. So they wrote a new VMM in Rust, stripped to the few virtio devices a serverless guest actually needs, and open-sourced it in 2018.
Today both AWS Lambda and AWS Fargate run on Firecracker. Every Lambda execution environment and every Fargate task lands in its own microVM with its own guest kernel; Lambda can reuse a warm environment for later invocations of the same function, but never shares one across functions or accounts. That's the canonical use case in one sentence: many tenants, untrusted code, short per-request lifecycles, and a hard isolation line between every neighbor. If your problem rhymes with that, Firecracker is probably in your future. (AWS has published this architecture in its own papers and docs, and the source is worth reading if you want the gory detail.)
The modern wave: AI code sandboxes
The biggest new driver of Firecracker adoption is the one nobody saw coming in 2018: LLMs that write and run code. An agent that generates a shell command or a Python snippet is, definitionally, producing untrusted code at runtime — you can't review it before it executes, and a prompt injection can turn "plot this CSV" into "read every env var and exfiltrate it." That is the exact threat model Firecracker was built for, which is why a whole category of AI-sandbox vendors landed on it.
Several of the prominent players in this space describe Firecracker (or Firecracker-derived microVMs) as their isolation substrate in their own documentation: E2B, Vercel Sandbox, and Fly Machines / Fly Sprites among them. The shapes differ — some optimize for code-interpreter sessions, some for general compute, some for agent loops — but the common thread is microVM-per-session isolation so one user's code can't reach another's. I'm describing these qualitatively on purpose; verify the current architecture against each vendor's own docs, because these systems evolve fast and I'd rather you trust their source than my paraphrase. If you're weighing options, the sandbox roundup ("/blog/best-code-execution-sandboxes") and the Fly Sprites comparison ("/blog/pandastack-vs-fly-sprites") go deeper.
Why Firecracker specifically for AI workloads, beyond isolation? Two properties matter. First, snapshot-restore: you can boot a machine once, freeze it, and restore that frozen state per request in tens of milliseconds, which makes per-invocation VMs economically sane. Second, copy-on-write fork, which platforms typically build on top of those snapshots rather than getting as a single built-in command: you can branch a running VM's memory and disk cheaply, which maps beautifully onto agent patterns like "try five fixes from the same starting state and keep the one that passes." Those two tricks are what turn "a VM per agent action" from a joke into a product.
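To make the copy-on-write idea concrete, here is a minimal sketch that uses nothing but the standard library. It is an analogy, not Firecracker's or PandaStack's code: a small file stands in for a frozen memory snapshot, and each "fork" is a private mapping of that file, so reads share the base pages while writes land in per-fork copies. The file name and the helper names are made up for illustration.

```python
import mmap
import os
import tempfile


def make_snapshot(path: str, pages: int = 4) -> None:
    """Write a stand-in 'memory snapshot': a few pages filled with b'S'."""
    with open(path, "wb") as f:
        f.write(b"S" * mmap.PAGESIZE * pages)


def fork_from_snapshot(path: str) -> mmap.mmap:
    """Map the snapshot copy-on-write.

    ACCESS_COPY gives a private mapping: writes change only this mapping's
    memory and are never written back to the underlying file.
    """
    with open(path, "r+b") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)


def demo() -> dict:
    """Branch two forks from one snapshot and let each diverge."""
    with tempfile.TemporaryDirectory() as tmp:
        snap = os.path.join(tmp, "mem.snap")  # hypothetical file name
        make_snapshot(snap)

        fork_a = fork_from_snapshot(snap)
        fork_b = fork_from_snapshot(snap)
        fork_a[0:5] = b"fix-A"  # each fork writes to its own private pages
        fork_b[0:5] = b"fix-B"

        with open(snap, "rb") as f:
            base = f.read(5)  # the snapshot on disk is untouched

        result = {"a": bytes(fork_a[0:5]), "b": bytes(fork_b[0:5]), "base": base}
        fork_a.close()
        fork_b.close()
        return result
```

Calling `demo()` returns each fork's own bytes plus the unchanged base. The point of the design is that a fork costs only the pages it actually dirties, which is why branching five attempts from one starting state stays cheap.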
Ephemeral CI runners
CI is a quietly perfect fit. A CI job runs arbitrary code from arbitrary branches — including pull requests from forks you've never seen — with full network access and your secrets within reach. Reusing a runner across jobs is how one poisoned build step leaks credentials into the next. The clean model is a fresh, isolated machine per job that is destroyed afterward, leaving nothing behind. Containers get you part of the way; a microVM gets you the hard kernel boundary that makes "untrusted PR runs on my infra" defensible. Several self-hosted-runner and build-isolation systems use Firecracker for exactly this reason. The deeper write-up lives in the ephemeral CI piece if you want it.
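The fresh-machine-per-job model is easy to sketch. The version below is a hedged illustration, not any particular runner's implementation: `run_job` takes a factory that opens a sandbox as a context manager, runs the job's steps inside it, stops at the first failure, and relies on the context manager to tear the machine down. `FakeSandbox` and `fake_sandbox` are stand-ins I invented so the sketch runs without a hypervisor; the only assumption about a real sandbox is an `exec(command, timeout_seconds=...)` method whose result has `exit_code` and `stdout`.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, ContextManager, Iterable, Iterator, List


@dataclass
class StepResult:
    command: str
    exit_code: int
    stdout: str


def run_job(open_sandbox: Callable[[], ContextManager],
            steps: Iterable[str]) -> List[StepResult]:
    """Run one CI job in one fresh sandbox; stop at the first failing step."""
    results: List[StepResult] = []
    with open_sandbox() as sbx:  # a new machine for this job only
        for command in steps:
            r = sbx.exec(command, timeout_seconds=600)
            results.append(StepResult(command, r.exit_code, r.stdout))
            if r.exit_code != 0:
                break
    # Leaving the with-block destroys the sandbox, pass or fail,
    # so nothing from this job is left for the next one.
    return results


class FakeSandbox:
    """Hypothetical stand-in for a real sandbox; counts teardowns."""
    destroyed = 0

    def exec(self, command: str, timeout_seconds: int = 30) -> StepResult:
        exit_code = 1 if command.startswith("false") else 0
        return StepResult(command, exit_code, f"ran: {command}")


@contextmanager
def fake_sandbox() -> Iterator[FakeSandbox]:
    sbx = FakeSandbox()
    try:
        yield sbx
    finally:
        FakeSandbox.destroyed += 1  # teardown always happens
```

With a real SDK the factory would be something like `lambda: Sandbox.create(template="ci-runner", ttl_seconds=900)`, where the template name is hypothetical. Passing the factory in keeps the job logic testable and makes the "one job, one machine" rule hard to bypass by accident.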
Multi-tenant SaaS isolation
The broadest bucket is multi-tenant SaaS that runs per-customer logic: code execution products, notebook/data-analysis backends, per-tenant database instances, automation platforms that run customer-defined workflows. Once you're executing one customer's code or hosting one customer's data next to another's, "we use containers and hope" stops being a compliance answer. A microVM per tenant (or per session) turns isolation into a property of the architecture rather than a property of your luck with kernel CVEs. The trade-off you're buying: a few MB of per-guest overhead and a VMM to operate, in exchange for a blast radius of exactly one VM.
Self-hosted platforms (including PandaStack)
The newest shape is teams who want the AWS-Lambda isolation model on their own hardware, without renting it by the request. That's where self-hosted Firecracker platforms come in — PandaStack is the one I work on, and it's Apache-2.0 open source, so this isn't a paywall pitch. The model: every sandbox, managed database, and hosted app is its own Firecracker microVM, and you run it on any host with /dev/kvm — a bare-metal box, a nested-virt cloud VM, even a Mac via Lima for local dev.
The performance comes from the same two tricks the big platforms use. Every create restores a baked snapshot on demand rather than cold-booting, so a sandbox comes up in 179ms at p50 (around 203ms at p99, with the snapshot-restore step itself roughly 49ms); only the very first spawn of a template pays the full cold boot of about 3 seconds. Forking a running VM is copy-on-write: around 400–750ms same-host, 1.2–3.5s cross-host. And because networking is pre-allocated, a single agent host carries 16,384 /30 subnets — that's the per-host ceiling on concurrent microVMs, and it's the subnet space, not the VMM, that sets it. On top of that substrate sit managed Postgres, git-driven app hosting, and serverless functions, each landing in its own microVM.
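The 16,384 figure is plain subnet arithmetic, and it is worth checking for yourself. A /30 holds four addresses, so 16,384 of them add up to 65,536 addresses, which is exactly a /16. The sketch below assumes such a /16 pool; the `10.200.0.0/16` range and the function names are mine, chosen for illustration, not PandaStack's actual configuration.

```python
import ipaddress

POOL = "10.200.0.0/16"  # hypothetical per-host pool


def count_slash30(pool: str) -> int:
    """Number of /30 subnets that fit in the pool."""
    net = ipaddress.ip_network(pool)
    return 2 ** (30 - net.prefixlen)


def nth_slash30(pool: str, n: int) -> ipaddress.IPv4Network:
    """The n-th /30 in the pool (0-based); each /30 spans 4 addresses."""
    net = ipaddress.ip_network(pool)
    if not 0 <= n < count_slash30(pool):
        raise IndexError("subnet index outside the pool")
    base = int(net.network_address) + 4 * n
    return ipaddress.ip_network((base, 30))


if __name__ == "__main__":
    print(count_slash30(POOL))            # 16384
    first = nth_slash30(POOL, 0)
    print(first, list(first.hosts()))     # two usable addresses per /30
    print(nth_slash30(POOL, 16383))       # 10.200.255.252/30
```

Each /30 leaves two usable addresses once the network and broadcast addresses are set aside, which is enough for one endpoint on the host side and one in the guest. That pairing is my assumption about how the subnets are used, but the count itself follows from the prefix lengths alone.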
Spinning up an isolated VM looks like this with the Python SDK — note there's no VM boilerplate, the snapshot-restore happens under the hood:
    from pandastack import Sandbox

    # Each create restores a baked snapshot on demand -> ~179ms p50.
    with Sandbox.create(template="code-interpreter", ttl_seconds=300) as sbx:
        result = sbx.exec("python3 -c 'print(2 ** 10)'", timeout_seconds=30)
        print(result.exit_code, result.stdout)  # 0 1024
    # the microVM is destroyed on block exit

If self-hosting is the angle you care about, the self-hosted sandbox guide and the fast-boot internals ("/blog/how-firecracker-boots-fast") explain how the snapshot path actually works.
Use case -> why Firecracker, at a glance
- Serverless functions (Lambda/Fargate) — many tenants, untrusted code, per-request lifecycle; needs VM isolation at container density. The original use case.
- AI agent code execution — model-generated code is untrusted by definition; snapshot-restore makes a fresh VM per action affordable, and CoW fork enables branch-and-explore.
- Code interpreters / data-analysis backends — run user-supplied Python with charts and files; a microVM contains rm -rf / and runaway loops to a disposable guest.
- Ephemeral CI runners — arbitrary branch/PR code with secrets in reach; fresh VM per job, destroyed after, so nothing leaks forward.
- Multi-tenant SaaS that runs customer logic — turns isolation into an architectural property instead of a bet against kernel CVEs.
- Per-customer databases — a managed DB per tenant in its own VM keeps one customer's data and noisy queries off another's.
- Self-hosted sandbox platforms — Lambda-grade isolation on your own /dev/kvm hosts, on your cost basis.
When Firecracker is the wrong choice
Being honest about where a tool doesn't fit is how you earn trust about where it does. Firecracker is a precision instrument for one problem; swing it at the wrong nail and you've added a VMM to your stack for nothing. Here's where I'd talk you out of it.
Trusted first-party services
If you control the code — your own microservices, your build steps, your internal tools — the kernel-sharing risk is yours to accept, and containers give you a faster, simpler, better-tooled path. Wrapping your own well-behaved service in a microVM buys you isolation from a threat that doesn't exist (yourself) while costing you operational complexity that very much does. Use Docker. Run with Firecracker only when the code crosses a trust boundary.
GUI and desktop workloads
Firecracker deliberately ships a minimal device model: a handful of virtio devices and not much else. There's no GPU, no display, no USB, no sound. That minimalism is the security feature: a smaller device surface is a smaller attack surface. But it also means anything needing a real desktop, accelerated graphics, or rich peripherals is the wrong fit. Headless browser automation is fine in a microVM, but for desktop apps, GUI testing, or browsers that need a real display or GPU rendering you want a full VM (QEMU/KVM) or a different abstraction entirely. Firecracker is for headless compute, full stop.
When you need full hardware passthrough
Need PCIe passthrough, direct GPU access for training, specialized accelerators, or arbitrary kernel modules and device drivers? That's outside Firecracker's design envelope by intent. The minimal VMM that makes it fast and safe for serverless is the same thing that makes it unsuitable when you need the guest to talk to real hardware. Reach for a general-purpose hypervisor. Trying to bolt hardware passthrough onto Firecracker is fighting the tool's entire reason for existing.
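The three "wrong choice" cases and the use cases before them collapse into a short decision rule. Here is a rough sketch of it as code; the field names and the returned labels are mine, and real decisions carry more nuance than four booleans, so treat it as a checklist rather than a policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    runs_untrusted_code: bool
    multi_tenant: bool
    needs_gpu_or_display: bool = False
    needs_device_passthrough: bool = False


def pick_isolation(w: Workload) -> str:
    """Map a workload's shape to the isolation layer argued for in this post."""
    # Hardware needs come first: a minimal device model cannot supply them,
    # however untrusted the code is.
    if w.needs_gpu_or_display or w.needs_device_passthrough:
        return "general-purpose hypervisor"
    # Code that crosses a trust boundary gets a hard kernel boundary.
    if w.runs_untrusted_code or w.multi_tenant:
        return "microVM"
    # Trusted first-party code: take the simpler, better-tooled path.
    return "container"


if __name__ == "__main__":
    print(pick_isolation(Workload(runs_untrusted_code=True, multi_tenant=True)))
    print(pick_isolation(Workload(runs_untrusted_code=False, multi_tenant=False)))
```

The ordering is the design choice worth noticing: checking hardware requirements before trust means an untrusted GPU workload is routed away from Firecracker instead of being forced into it.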
The takeaway
Firecracker's adopters all share one shape: untrusted or multi-tenant code, run at a scale where a full VM per unit is too heavy and a shared kernel is too risky. AWS proved it with Lambda; the AI-sandbox vendors revived it for agents; CI systems and self-hosted platforms like PandaStack carry it forward. If your workload fits that shape, the microVM is the boring correct answer. If it doesn't — if you trust the code, or you need a GPU and a display — the boring correct answer is something else, and reaching for Firecracker anyway is just resume-driven infrastructure. Match the tool to the trust boundary, and the choice makes itself.
Frequently asked questions
What is Firecracker used for?
Firecracker runs minimal, hardware-isolated microVMs and is used wherever someone must run untrusted or multi-tenant code at scale: AWS Lambda and Fargate (its original purpose), AI code-execution sandboxes, ephemeral CI runners, multi-tenant SaaS that executes customer logic, and self-hosted sandbox platforms. The common thread is a hard isolation boundary between tenants without paying full-VM startup cost.
Who uses Firecracker in production?
AWS runs Lambda and Fargate on Firecracker — it was built there for exactly that. In the AI era, several sandbox vendors describe Firecracker or Firecracker-derived microVMs as their isolation layer in their own docs (E2B, Vercel Sandbox, and Fly Machines/Sprites among them — verify against their current documentation). CI/build-isolation systems and self-hosted platforms like the open-source PandaStack also build on it.
Why did AWS build Firecracker for Lambda?
Lambda runs untrusted functions from millions of customers on shared fleets and needed VM-grade isolation with container-grade density and millisecond startup. Containers share the host kernel (not a safe boundary between strangers' code) and full VMs were too heavy to spin up per invocation. AWS wrote a minimal Rust VMM that boots a hardware-isolated microVM in milliseconds, then open-sourced it in 2018.
When should you NOT use Firecracker?
Skip it for trusted first-party code you control — containers are simpler and faster, and the isolation guards a threat that isn't there. Skip it for GUI, desktop, or graphics-heavy workloads, since Firecracker ships a deliberately minimal device model with no GPU or display. And skip it when you need hardware passthrough (PCIe, direct GPU, custom drivers), which is outside its design by intent. Use a general-purpose hypervisor for those.
Why is Firecracker good for AI agents specifically?
Model-generated code is untrusted by definition — you can't review it before it runs and prompt injection can weaponize it — which is exactly Firecracker's threat model. Two more properties seal the fit: snapshot-restore lets you boot once and restore per request in tens of milliseconds (PandaStack creates a sandbox at 179ms p50), and copy-on-write fork lets an agent branch from a single starting state cheaply (same-host fork ~400–750ms) for patterns like best-of-N exploration.