Cost Model
An estimate of what the per-customer-cluster architecture costs to run. The assumptions are placeholders — edit them live as we get real numbers from Azure, GCP, and AWS.
These numbers are assumptions, not quotes
Interactive estimate
Adjust any assumption to see the per-customer and fleet-wide impact update immediately. Org size and peak concurrency drive the LiveKit and sim-engine node counts; managed services are monthly estimates and egress is derived from concurrency.
Assumptions
- Org usage & demand
- Capacity per node (from load tests)
- Compute pricing
- Managed services (monthly)
- Network & commercial
Derived demand & node counts
- Infra / customer / mo: $1,037
- Infra / customer / yr: $12,439
- Gross margin / customer: $22,561 (64% of license)
- Annual license: $35,000
$35,000
Monthly breakdown / customer
Fleet projection (25 customers)
- Annual infra cost: $310,980
- Annual revenue: $875,000
- Annual gross margin: $564,020
How the model is built
The model is demand-driven: org size and peak concurrency determine how many concurrent sessions and participants a customer generates, and those two figures size two independent node pools (LiveKit by participants, sim/agent by sessions).
- Demand — peak concurrent users = total users × peak-concurrency %. Concurrent sessions = peak users ÷ avg users per session; concurrent participants also count the AI agents in each session.
- LiveKit nodes — concurrent participants ÷ participants-per-node capacity. Media (bandwidth/CPU) is the bottleneck, and host networking means one LiveKit pod per node, so capacity is added in whole nodes.
- Sim / agent nodes — concurrent sessions ÷ sessions-per-node capacity. The sim engine is light per session, so these pack far denser than LiveKit and scale independently.
- Managed services — the Kubernetes control plane, Redis, load balancer/ingress, storage and logs, and the observability stack, each as a monthly figure.
- Network egress — derived from concurrent participants × media bitrate × monthly usage hours × the cloud's per-GB price (the most usage-sensitive line, and the one to watch as concurrency grows).
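The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the page's actual calculator: every field name and default value is a hypothetical placeholder, one AI agent per session is assumed, node counts are rounded up because capacity is added in whole nodes, and "monthly usage hours" is read as hours per month at peak-equivalent load.

```python
import math
from dataclasses import dataclass


@dataclass
class Assumptions:
    # All defaults are hypothetical placeholders, not measured or quoted values.
    total_users: int = 2000
    peak_concurrency_pct: int = 10            # % of users online at peak
    users_per_session: int = 4                # average human users per session
    agents_per_session: int = 1               # AI agents count as participants
    participants_per_livekit_node: int = 200  # would come from load tests
    sessions_per_sim_node: int = 100          # would come from load tests
    media_mbps_per_participant: float = 1.5
    usage_hours_per_month: float = 40.0
    egress_price_per_gb: float = 0.08


def derive(a: Assumptions) -> dict:
    """Derive demand, node counts, and monthly egress from the assumptions."""
    peak_users = a.total_users * a.peak_concurrency_pct / 100
    sessions = math.ceil(peak_users / a.users_per_session)
    participants = math.ceil(peak_users) + sessions * a.agents_per_session

    # Capacity is added in whole nodes, so both pools round up.
    livekit_nodes = math.ceil(participants / a.participants_per_livekit_node)
    sim_nodes = math.ceil(sessions / a.sessions_per_sim_node)

    # Mbit/s -> GB/hour: /8 gives MB/s, *3600 gives MB/hour, /1000 gives GB/hour.
    gb_per_hour = participants * a.media_mbps_per_participant / 8 * 3600 / 1000
    egress_gb = gb_per_hour * a.usage_hours_per_month

    return {
        "sessions": sessions,
        "participants": participants,
        "livekit_nodes": livekit_nodes,
        "sim_nodes": sim_nodes,
        "egress_gb": egress_gb,
        "egress_cost": egress_gb * a.egress_price_per_gb,
    }


if __name__ == "__main__":
    for key, value in derive(Assumptions()).items():
        print(f"{key}: {value}")
```

Keeping the two pools as separate divisions mirrors the text: LiveKit scales with participants, the sim engine with sessions, so changing one capacity input leaves the other pool's node count untouched.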
Gross margin per customer is the annual license minus annual infra cost; the fleet projection multiplies the per-customer figures by the customer count. As real telemetry arrives, the per-node capacities, media bitrate, and peak concurrency are the inputs most worth revisiting, because they swing the node counts (and, for bitrate, the egress volume) that dominate cost.
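As a worked check of the margin arithmetic, the short Python sketch below plugs in the figures displayed in the estimate ($35,000 license, $12,439 annual infra, 25 customers). The function and variable names are illustrative, not taken from the page's implementation.

```python
ANNUAL_LICENSE = 35_000   # displayed annual license per customer
ANNUAL_INFRA = 12_439     # displayed (rounded) annual infra cost per customer
CUSTOMERS = 25            # fleet size used in the projection


def gross_margin(license_fee: int, infra_cost: int) -> tuple[int, float]:
    """Return (margin in dollars, margin as a fraction of the license)."""
    margin = license_fee - infra_cost
    return margin, margin / license_fee


margin, margin_share = gross_margin(ANNUAL_LICENSE, ANNUAL_INFRA)
fleet_revenue = ANNUAL_LICENSE * CUSTOMERS
fleet_infra = ANNUAL_INFRA * CUSTOMERS
fleet_margin = fleet_revenue - fleet_infra

print(f"margin / customer: ${margin:,} ({margin_share:.0%} of license)")
print(f"fleet revenue:     ${fleet_revenue:,}")
print(f"fleet infra:       ${fleet_infra:,}")
print(f"fleet margin:      ${fleet_margin:,}")
```

The per-customer margin and fleet revenue reproduce the displayed values exactly. Multiplying the rounded $12,439 gives $310,975 for fleet infra, $5 below the displayed $310,980; that gap is consistent with the projection multiplying an unrounded per-customer cost (310,980 ÷ 25 = 12,439.20), though the page does not state this.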
What drives cost up or down
- Org size & concurrency — a small medical school with low peak concurrency may need only the HA-floor nodes, while a large hospital's peaks add LiveKit and sim nodes. This is why the model is keyed to users and concurrency, not fixed node counts.
- Idle baseline — a dedicated cluster costs money even with no active sessions. Scale-to-low node pools and right-sized control planes keep the floor down between sessions.
- Concurrency peaks — more simultaneous sessions add LiveKit and worker nodes; autoscaling means you pay for peaks, not 24/7 maximums.
- Egress — audio/video and high-frequency transforms scale with participants and session hours; this is the line most likely to surprise at scale.
- Cloud choice — node, egress, and managed-service prices differ across Azure, GCP, and AWS; the model is per-customer, so a customer's required cloud sets their cost basis.
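The interaction between the HA floor, concurrency peaks, and autoscaling can be illustrated with a small Python sketch. Everything numeric here is an assumption made for the example: a floor of 2 nodes, 200 participants per node, a $0.40 hourly node price, 730 hours in a month, and the two org profiles. Billing burst nodes only for peak hours is a simplification of how autoscaling would behave.

```python
import math


def nodes_needed(demand: float, capacity_per_node: float, ha_floor: int = 2) -> int:
    """Whole nodes for one pool: demand-driven, but never below the HA floor."""
    return max(ha_floor, math.ceil(demand / capacity_per_node))


def monthly_node_cost(peak_nodes: int, floor_nodes: int, peak_hours: float,
                      price_per_hour: float, hours_in_month: float = 730) -> float:
    """Floor nodes run all month; nodes above the floor run only at peak."""
    base = floor_nodes * hours_in_month * price_per_hour
    burst = max(0, peak_nodes - floor_nodes) * peak_hours * price_per_hour
    return base + burst


# Hypothetical peak concurrent participants for two org profiles.
profiles = {"small medical school": 60, "large hospital": 900}

for name, participants in profiles.items():
    peak = nodes_needed(participants, capacity_per_node=200)
    autoscaled = monthly_node_cost(peak, floor_nodes=2, peak_hours=160,
                                   price_per_hour=0.40)
    always_on = monthly_node_cost(peak, floor_nodes=peak, peak_hours=0,
                                  price_per_hour=0.40)
    print(f"{name}: {peak} nodes at peak, "
          f"${autoscaled:,.0f}/mo autoscaled vs ${always_on:,.0f}/mo always-on")
```

Under these placeholder inputs the small org never leaves the floor, so autoscaling saves it nothing and its cost is the idle baseline; the large org pays for its extra nodes only during peak hours.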