
Isolating Tenants in a Multi-Tenant SaaS with microVMs

Ajay Kumar · 9 min read

Every multi-tenant SaaS eventually meets the same two ghosts: the noisy neighbor and the leaked row. The noisy neighbor is the tenant whose runaway report query, 40GB upload, or accidental fork bomb degrades the experience for everyone else on the box. The leaked row is the bug — an off-by-one in a `WHERE tenant_id = ?` clause, a forgotten filter in a new endpoint, a cache keyed without the tenant prefix — that shows one customer another customer's data. The first is a reliability problem. The second is a breach, a churned enterprise account, and possibly a regulator's email. Both come from the same root cause: tenants sharing something they shouldn't.

I'm Ajay — I build PandaStack, a Firecracker microVM platform, so I spend most of my time thinking about exactly where the isolation boundary sits. This post is about the spectrum of choices you have, from "one shared process with a tenant_id column" up to "a hardware-isolated microVM per tenant," and how to pick the right rung for the workload in front of you. The honest answer is that most SaaS apps want a blend, not a single dogma.

Why a shared-kernel container is not a tenant boundary

The default instinct is to reach for containers. They're cheap, they start fast, and "one container per tenant" sounds isolated. It isn't, at least not at the security layer. Every container on a host shares that host's single Linux kernel. Namespaces give a container a private view of processes, mounts, and network; cgroups cap how much CPU and memory it can use. But the kernel itself, the largest piece of code on the machine and the one with the most attack surface, is shared by all of them.

That matters in two directions. For data isolation, a container escape (a known and recurring class of bug) or a kernel vulnerability lets one tenant's code reach the host and, from there, every neighbor. A shared kernel is one tenant's bad day away from being everyone's bad day. For the noisy-neighbor problem, cgroups help but leak: kernel-level contention — page cache thrash, lock contention, a fork storm, network buffer exhaustion — isn't cleanly partitioned, so a misbehaving tenant can still drag the shared kernel's performance down for everyone sharing it.

If the workload you're isolating is untrusted or runs tenant-supplied code — AI agents, code execution, plugins, custom build steps — a container is not a tenant boundary. You're trusting a shared kernel to hold the line against code you didn't write.

Giving each tenant its own microVM

A microVM moves the boundary down to hardware. Each Firecracker microVM boots its own guest kernel and is confined by KVM hardware virtualization; the only way out is a tiny set of emulated virtio devices, not the full Linux syscall surface a container sees. This is the same model AWS Lambda and Fargate use to run untrusted code from thousands of customers on shared fleets. Give each tenant — or, more precisely, each unit of untrusted or contention-prone work a tenant generates — its own microVM, and a crash, an exploit, or a resource hog is contained to that one VM. Its memory, filesystem, and network namespace are its own. The blast radius is one disposable machine.

The classic objection to per-tenant VMs is startup cost: nobody wants to wait three seconds to boot a VM on a tenant's first request. PandaStack sidesteps that with snapshot-restore — every create restores a baked snapshot of an already-booted machine rather than cold-booting. The restore step is around 49ms; an end-to-end create is p50 179ms and p99 about 203ms (a true cold boot, only on the very first spawn of a template, is around 3s). That's the trade-off that makes per-tenant VM isolation practical: you get hypervisor-grade isolation at roughly container-grade latency.

from pandastack import Sandbox

def run_for_tenant(tenant_id: str, code: str) -> dict:
    """Run a tenant's untrusted workload in its own microVM, then destroy it."""
    # One VM per tenant request. No shared kernel, no shared filesystem.
    with Sandbox.create(
        template="code-interpreter",
        ttl_seconds=600,                 # backstop so a leaked VM reaps itself
        metadata={"tenant_id": tenant_id},
    ) as sbx:
        sbx.filesystem.write("/workspace/job.py", code)
        result = sbx.exec("python3 /workspace/job.py", timeout_seconds=60)
        return {
            "tenant": tenant_id,
            "exit_code": result.exit_code,
            "stdout": result.stdout,
            "stderr": result.stderr,
        }
    # sandbox (and everything the tenant touched) is gone here

The shape that matters: one tenant, one VM, killed when the work is done. Tag the sandbox with `metadata={"tenant_id": ...}` so your audit logs and billing can attribute usage per tenant. For a long-running per-tenant workspace (a hosted notebook, a per-customer build environment), pass `persistent=True` and call `sbx.kill()` yourself when the tenant churns — but never reuse one persistent sandbox across two tenants, because the VM is the boundary and sharing it erases the boundary.
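
For the long-running case, one way to make "never reuse a persistent sandbox across two tenants" hard to get wrong is to key workspaces strictly by tenant. The sketch below is illustrative only: `TenantWorkspaces` and the `create` callable are hypothetical names, not part of the PandaStack SDK. In practice `create` would wrap a call along the lines of `Sandbox.create(..., persistent=True, metadata={"tenant_id": tenant_id})`, and the only method assumed on the returned object is `kill()`.

```python
from typing import Callable, Dict, Protocol


class Killable(Protocol):
    """Anything with a kill() method, e.g. a persistent sandbox handle."""

    def kill(self) -> None: ...


class TenantWorkspaces:
    """Hypothetical registry: at most one persistent workspace per tenant."""

    def __init__(self, create: Callable[[str], Killable]) -> None:
        # `create` receives the tenant_id and returns a fresh workspace.
        self._create = create
        self._by_tenant: Dict[str, Killable] = {}

    def get(self, tenant_id: str) -> Killable:
        # Lookup is keyed by tenant_id only, so a workspace created for one
        # tenant can never be handed to another.
        if tenant_id not in self._by_tenant:
            self._by_tenant[tenant_id] = self._create(tenant_id)
        return self._by_tenant[tenant_id]

    def release(self, tenant_id: str) -> None:
        # Call when the tenant churns: forget the workspace and kill it.
        workspace = self._by_tenant.pop(tenant_id, None)
        if workspace is not None:
            workspace.kill()
```

A released tenant that comes back gets a brand-new workspace rather than a recycled one, which keeps the VM boundary intact.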

Capacity isn't usually what limits per-tenant VMs. A PandaStack agent pre-allocates 16,384 /30 subnets, so each VM gets its own network namespace; the practical ceiling is host memory and CPU, not networking. Density math, below, makes that concrete.
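
To see where a number like 16,384 comes from, here is a small check using Python's standard `ipaddress` module. The `10.200.0.0/16` range is an assumption for illustration; the post states only the subnet count, not the actual address pool. A /16 split into /30s yields exactly that many subnets, each with two usable addresses.

```python
import ipaddress

# Assumption: the /30s are carved out of a single private /16.
# The real address range is not given in the post.
pool = ipaddress.ip_network("10.200.0.0/16")

# 2 ** (30 - 16) = 16,384 subnets of size /30.
subnets = list(pool.subnets(new_prefix=30))
print(len(subnets))  # 16384

# Each /30 holds 4 addresses: network, 2 usable hosts, broadcast.
first = subnets[0]
usable = [str(ip) for ip in first.hosts()]
print(first, usable)  # 10.200.0.0/30 ['10.200.0.1', '10.200.0.2']
```

Two usable addresses per subnet is enough for a point-to-point link, which is presumably why /30 is the unit here.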

The cost and density math at scale

"A VM per tenant" sounds expensive until you do the arithmetic, and the arithmetic depends entirely on whether your tenants' VMs are active all the time or only when they're doing something. There are two basic regimes, plus two techniques that cut idle and startup cost further:

  • Always-on per-tenant VMs — if every tenant needs a VM running 24/7, density is just (host RAM ÷ per-VM RAM). A 2 GiB tenant VM on a 256 GiB host gives ~128 always-on tenants per host, minus headroom. This is the expensive regime, and it's the one most cost comparisons assume.
  • On-demand per-tenant VMs — if a tenant only needs a VM while a request is in flight, you don't keep one running. You create on demand (p50 179ms), run the work, and kill it. A host that can hold 128 concurrent VMs now serves far more tenants, because most tenants are idle at any instant and pay nothing while idle.
  • Hibernate between bursts — for sessions that come in waves (an interactive workspace a tenant pokes at all day), snapshot memory + disk and stop the VM between bursts, then auto-wake on the next request. Idle cost drops toward zero while keeping warm-ish resume.
  • Fork from a warm parent — when many tenants start from the same configured baseline, snapshot one set-up VM and fork it: a same-host fork is 400–750ms (cross-host 1.2–3.5s) and shares memory copy-on-write, so a hundred forks don't cost a hundred full copies of RAM.
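
The first two regimes reduce to a few lines of arithmetic. This is a back-of-the-envelope sketch, not a capacity planner: the function names are made up for illustration, and the "one tenant in twenty is active at any instant" figure is an assumed duty cycle, not a measurement. It also ignores correlated bursts, where many tenants become active at once.

```python
def always_on_density(host_ram_gib: int, vm_ram_gib: int, reserved_gib: int = 0) -> int:
    """Tenants per host when every tenant keeps a VM running 24/7."""
    return (host_ram_gib - reserved_gib) // vm_ram_gib


def on_demand_tenants(concurrent_slots: int, active_one_in: int) -> int:
    """Tenants one host can serve if only 1 in `active_one_in` is busy at once."""
    if active_one_in < 1:
        raise ValueError("active_one_in must be >= 1")
    return concurrent_slots * active_one_in


slots = always_on_density(host_ram_gib=256, vm_ram_gib=2)
print(slots)                                       # 128
print(always_on_density(256, 2, reserved_gib=16))  # 120, with 16 GiB of headroom
print(on_demand_tenants(slots, active_one_in=20))  # 2560, under the assumed duty cycle
```

The same host goes from 128 tenants to a few thousand purely because idle tenants stop occupying memory; the exact multiplier depends on how bursty your tenants really are.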

The structural advantage of microVMs over always-on containers-per-tenant is that an idle microVM can cost essentially nothing: you delete or hibernate it and recreate it in about 180ms at p50 (around 203ms at p99). The economics flip from "pay for every tenant's peak, all the time" to "pay for the work actually happening." That's the same insight that makes serverless cheap, applied to the isolation boundary itself. (For a deeper treatment of the unit economics, see the microVM density and economics write-up.)

When row-level security is enough — and when it isn't

Per-tenant microVMs are not the right answer for every part of your app. For your core CRUD data — the rows behind your dashboards, the records your trusted application code reads and writes — a shared database with row-level security (RLS) is usually the correct, cheaper, simpler choice. Postgres RLS enforces tenant scoping in the database itself, so a forgotten `WHERE` clause in application code can't leak across tenants; the policy is the backstop. Shared-schema multi-tenancy with RLS scales to large tenant counts on a single database and is the workhorse of most SaaS.
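
As a rough illustration of what "the policy is the backstop" looks like, the sketch below generates the Postgres statements for a tenant-scoping policy. Several things here are assumptions rather than anything from this post: the `app.tenant_id` setting name, the `tenant_isolation` policy name, a `uuid`-typed `tenant_id` column, and the `%s` placeholder style (which matches drivers such as psycopg). The helper `rls_statements` is hypothetical.

```python
import re


def rls_statements(table: str, tenant_column: str = "tenant_id") -> list[str]:
    """Return DDL that scopes `table` to the tenant bound to the current transaction."""
    for ident in (table, tenant_column):
        # Identifiers are interpolated into DDL, so accept only plain names.
        if not re.fullmatch(r"[a-z_][a-z0-9_]*", ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # FORCE makes the policy apply to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        # With only USING given, the same check also guards written rows.
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column} = current_setting('app.tenant_id')::uuid)",
    ]


# Run once per request, inside the transaction, before any tenant query.
# The third argument (true) makes the setting transaction-local.
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', %s, true)"

for statement in rls_statements("invoices"):
    print(statement + ";")
```

One caveat worth knowing: superusers and roles with `BYPASSRLS` skip policies entirely, so the application should connect as an ordinary role for the backstop to hold.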

RLS is a boundary inside trusted code operating on trusted data. It stops application bugs from crossing tenants. What it does not do is contain untrusted execution, and it does not give a heavy tenant a separate machine. So the line is about trust and contention, not about how important the data is:

  • Use shared-DB row-level security when — your application code is trusted, the workload is ordinary CRUD/queries, tenants are numerous and mostly small, and you want one operational surface. RLS is the default; reach past it only with a reason.
  • Use VM-level isolation when — you execute tenant-supplied code (AI agents, plugins, code interpreters, custom builds), a tenant can run arbitrary queries or jobs that starve neighbors, compliance demands a hard physical-ish boundary, or a single enterprise tenant's load justifies dedicated capacity.
  • Use both, at different layers — shared-DB RLS for the core app, plus a per-tenant microVM for the untrusted-execution feature bolted onto it. Most SaaS that adds an "agent" or "run code" feature lands here: the boundary you need for the new feature is stricter than the one your CRUD data needs.

A useful test: if a tenant could, in principle, write the code that runs, you need a VM boundary, because RLS and your application logic are downstream of that code. If only your trusted code touches the data, RLS is enough.
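
If it helps to have the rule somewhere reviewable, the criteria above can be written down as a tiny decision helper. This is only an encoding of the guidance in this section; `Workload` and `isolation_for` are hypothetical names, and real decisions will involve cost and operational factors the function doesn't model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    runs_tenant_code: bool = False          # agents, plugins, interpreters, custom builds
    can_starve_neighbors: bool = False      # arbitrary heavy queries or jobs
    needs_hard_boundary: bool = False       # compliance requirement
    needs_dedicated_capacity: bool = False  # one tenant's load justifies its own machine


def isolation_for(w: Workload) -> str:
    """RLS is the default; any trust or contention trigger escalates to a VM."""
    if (
        w.runs_tenant_code
        or w.can_starve_neighbors
        or w.needs_hard_boundary
        or w.needs_dedicated_capacity
    ):
        return "microvm"
    return "shared-db-rls"


print(isolation_for(Workload()))                       # shared-db-rls
print(isolation_for(Workload(runs_tenant_code=True)))  # microvm
```

A product that uses both layers would simply evaluate this per feature: the CRUD dashboard and the "run code" feature get different answers.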

Per-tenant database isolation: the strongest data boundary

There's a rung above shared-DB RLS for data: give each tenant its own database. The strongest version of this is a dedicated database VM per tenant, where the tenant's data lives on a separate Firecracker microVM with its own durable volume, not just a separate schema on a shared engine. That gives you the cleanest possible blast radius (one tenant's database can't be queried by another tenant, full stop), per-tenant backup/restore and point-in-time recovery, the ability to relocate or delete a single tenant wholesale, and no noisy-neighbor contention at the storage engine. On PandaStack a managed database is exactly this: a microVM plus a durable volume, created in 30–90s (the bootstrap blocks until Postgres is genuinely ready).

# Provision a dedicated, isolated PostgreSQL VM for a single enterprise tenant.
# Each database is its own Firecracker microVM + durable volume.
curl -sS -X POST https://api.pandastack.ai/v1/databases \
  -H "Authorization: Bearer $PANDASTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label": "tenant-acme-prod"}'

# Response includes a TLS-only connection_url scoped to this tenant alone:
#   postgres://pandastack:<pw>@<id>.db.pandastack.ai:5432/pandastack
# Store it against the tenant; nothing else can route to this VM.

The trade-off is real: database-per-tenant multiplies your operational surface (migrations, monitoring, and upgrades now fan out across N databases) and costs more than a shared engine, so it's overkill for thousands of small tenants. It shines for a smaller set of high-value or compliance-bound tenants where "your data is on its own machine" is a feature you can sell. The common pattern is a hybrid: pooled shared-DB RLS for the long tail of small tenants, and a dedicated database VM for the enterprise accounts that demand (and pay for) physical separation. (There's a dedicated post on per-tenant database isolation that goes deeper on the pooled-vs-isolated spectrum.)
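
In application code, the hybrid usually comes down to a lookup: does this tenant have a dedicated database, or does it fall through to the shared pool? A minimal sketch, assuming you store each dedicated `connection_url` against the tenant as described above. `TenantDatabaseRouter` is a hypothetical name, and the URLs in the usage lines are placeholders.

```python
from typing import Dict, Optional


class TenantDatabaseRouter:
    """Hypothetical router: dedicated DB for some tenants, shared RLS pool for the rest."""

    def __init__(self, pooled_url: str, dedicated: Optional[Dict[str, str]] = None) -> None:
        self._pooled_url = pooled_url
        self._dedicated: Dict[str, str] = dict(dedicated or {})

    def promote(self, tenant_id: str, connection_url: str) -> None:
        """Record that a tenant now has its own database VM."""
        self._dedicated[tenant_id] = connection_url

    def is_isolated(self, tenant_id: str) -> bool:
        return tenant_id in self._dedicated

    def url_for(self, tenant_id: str) -> str:
        """Dedicated URL if the tenant has one, otherwise the shared pool."""
        return self._dedicated.get(tenant_id, self._pooled_url)


router = TenantDatabaseRouter(pooled_url="postgres://app@shared.example.invalid/app")
router.promote("acme", "postgres://app@acme.example.invalid/app")

print(router.url_for("acme"))        # the dedicated database
print(router.url_for("small-shop"))  # the shared, RLS-protected pool
```

Tenants on the shared pool still need RLS for scoping; the router only decides which database a request reaches, not what it can see there. Moving a tenant from pooled to dedicated also requires migrating its rows, which this sketch leaves out.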

Putting it together

Tenant isolation isn't one decision; it's a stack of them, each at its own layer. For trusted CRUD data, shared-DB row-level security is the right default. For untrusted execution — anything where a tenant supplies the code or can run resource-hungry jobs — push the boundary down to a Firecracker microVM, one per tenant unit of work, and let snapshot-restore keep the latency honest. For high-value tenants who need a hard data boundary, give them a dedicated database VM. The mistake to avoid is the comfortable middle: assuming a shared-kernel container drew the line for you. It didn't — it just made the line invisible until the day a tenant finds it.

Frequently asked questions

Is a container enough to isolate tenants in a multi-tenant SaaS?

For trusted first-party workloads, yes — containers are fine for packaging and running your own services. But containers share the host's single Linux kernel, so they are not a security boundary against untrusted or tenant-supplied code: a container escape or kernel bug can reach the host and every neighbor. If tenants can run their own code (agents, plugins, code execution), isolate them with a hardware-virtualized microVM instead, where each tenant gets its own guest kernel under KVM.

When is row-level security enough, and when do I need VM-level isolation?

Row-level security in a shared database is enough when your application code is trusted and the workload is ordinary CRUD — RLS enforces tenant scoping in the database so a forgotten WHERE clause can't leak data across tenants. It is not enough when tenants execute their own code or can run jobs that starve neighbors, because RLS and your app logic run downstream of that code. The test: if a tenant could write the code that runs, you need a VM boundary; if only your trusted code touches the data, RLS suffices. Many SaaS apps use both at different layers.

How do per-tenant microVMs handle the noisy-neighbor problem?

Each Firecracker microVM has its own guest kernel and its own memory, filesystem, and CPU allocation enforced by hardware virtualization, so kernel-level contention — page cache thrash, lock contention, fork storms — is contained to one tenant's VM instead of leaking across a shared kernel the way it can with containers. A tenant's runaway query or resource hog degrades only their own VM, not their neighbors'.

Isn't a microVM per tenant too expensive at scale?

Only if every tenant needs a VM running 24/7. The economics flip when you create VMs on demand: PandaStack restores a baked snapshot in around 49ms with a p50 create of 179ms, so you can spin up a tenant's VM per request, run the work, and kill it — idle tenants cost nothing. For bursty sessions you hibernate between bursts (idle cost near zero, auto-wake on the next request), and when many tenants share a baseline you fork from a warm parent in 400–750ms with copy-on-write memory. That turns per-tenant isolation from "pay for every tenant's peak" into "pay for the work actually happening."

Written by Ajay Kumar, Founder, PandaStack.