Platform Design
Real-Time Sim Engine
The authoritative server that owns the simulation: patient physiology, 3D positions, and timed events. Clients are thin renderers; the engine is the single source of truth and runs alongside the LiveKit rooms.
Status: living draft (in progress)
What it is, and why it exists
Today this logic lives inside the Unreal Engine application: the game server holds the state, replicates it, and clients receive updates. A core goal of this project is to lift that logic out of the game engine and onto a real-time API service.
Authoritative
The server owns world state. Clients send inputs and render; they never hold the source of truth.
Engine-agnostic
Thin clients on Unreal, Unity, React-Three-Fiber, Expo + WebGPU, or even chat / voice-only all consume the same state.
AI-buildable
Once the logic is out of the game engine, the system is far easier to build with AI agents and ships without app-store friction.
How LiveKit scaling actually works
A common misconception is “one Docker container per room.” That is not how LiveKit works.
- A single livekit-server process hosts many rooms at once; a room is a lightweight in-memory construct, not a container.
- Nodes form a cluster coordinated through Redis; each node publishes its load and the rooms it owns.
- A room is pinned to one node (chosen by least load); all participants in that room connect there. One room does not span nodes in open-source LiveKit.
- You scale by adding more nodes, not more containers per room.
LiveKit does not provision infrastructure
It is a game server, not a physics engine
The instinct that the engine “must be blazing fast because it ticks many times per second” is only half right. Breaking down what the server is authoritative for changes the picture:
| Workload | Frequency / cost | Who drives it |
|---|---|---|
| Physiology (vitals, event scheduling) | Low frequency, low compute (1–10 Hz) | Server |
| Human avatar transforms | High frequency, but I/O-bound fan-out | Client-originated, server relays |
| AI-driven entities (nurse walking) | A few entities, light steering | Server |
So the engine is a modest-frequency state machine plus a relay plus a few lightly-simulated entities — not a rigid-body physics solver. That is what makes TypeScript a sound choice for the entity counts medical sim actually has.
Core architecture: ECS + a fixed-timestep tick
Entity-Component-System
World state is a set of entities (patient, avatars, equipment), each composed of data components (transform, vitals, animation-state). The pattern is legible to AI agents and is a standard choice for this kind of simulation.
Authoritative clock
A drift-corrected, fixed-timestep loop derives sim time from a monotonic source — decoupled from wall-clock to allow pause/resume and time-scaled debrief replay.
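A minimal sketch of such a clock, under stated assumptions: `SimClock`, `advance`, and `timeScale` are hypothetical names, and "drift-corrected" is interpreted here as carrying the leftover fraction of a step between samples and deriving sim time from the tick count. The caller supplies the monotonic timestamp (for example from `performance.now()`), so the clock itself never reads wall-clock time.

```typescript
// Sketch of a fixed-timestep sim clock. All names here are hypothetical.
class SimClock {
  paused = false;
  timeScale = 1; // 1 = real time; 2 = double speed, e.g. for debrief replay
  private readonly stepMs: number;
  private accumulatorMs = 0;
  private lastSampleMs: number | null = null;
  private ticks = 0;

  constructor(stepMs: number) {
    this.stepMs = stepMs;
  }

  // Sim time is derived from the tick count, so it cannot accumulate rounding drift.
  get simTimeMs(): number {
    return this.ticks * this.stepMs;
  }

  // Feed a monotonic timestamp; returns how many fixed steps the caller should run now.
  advance(monotonicNowMs: number): number {
    const elapsed = this.lastSampleMs === null ? 0 : monotonicNowMs - this.lastSampleMs;
    this.lastSampleMs = monotonicNowMs;
    if (this.paused) return 0; // real time passes, sim time does not
    this.accumulatorMs += elapsed * this.timeScale;
    const steps = Math.floor(this.accumulatorMs / this.stepMs);
    this.accumulatorMs -= steps * this.stepMs; // carry the remainder into the next sample
    this.ticks += steps;
    return steps;
  }
}

const demoClock = new SimClock(100); // 100 ms step = 10 Hz
demoClock.advance(0);   // first sample only sets the baseline: 0 steps
demoClock.advance(250); // 2 steps run, 50 ms carried over
demoClock.advance(300); // 1 step: the carried 50 ms plus 50 ms of new time
```

Because `advance` is a pure function of the timestamps it is given, the same class can be driven by a timer in production and by hand-written timestamps in tests.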
Event scheduler
Events fire against sim time (“the patient codes at t=95s”) with accuracy far tighter than medical sim needs.
Logging every state-changing event against sim time (event sourcing, ADR-0014) gives debrief replay for free, and the same stream projects into the xAPI tracking spine from the content model.
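The scheduler and the event log can be sketched together. This is an illustration only: `SimScheduler`, `ScheduledSimEvent`, and the event type strings are hypothetical, and the actual event-sourcing format is whatever ADR-0014 specifies.

```typescript
// Hypothetical shapes: a sim-time event scheduler plus an append-only log.
interface ScheduledSimEvent {
  atMs: number; // sim time, not wall-clock time
  type: string; // e.g. "patient.codes"
}

class SimScheduler {
  // Append-only record of what fired, in firing order; replay reads this back.
  readonly log: ScheduledSimEvent[] = [];
  private pending: ScheduledSimEvent[] = [];

  schedule(ev: ScheduledSimEvent): void {
    this.pending.push(ev);
    this.pending.sort((a, b) => a.atMs - b.atMs); // earliest first
  }

  // Called once per tick with the current sim time; fires everything that is due.
  runDue(simTimeMs: number): ScheduledSimEvent[] {
    const fired: ScheduledSimEvent[] = [];
    let next = this.pending[0];
    while (next !== undefined && next.atMs <= simTimeMs) {
      this.pending.shift();
      fired.push(next);
      this.log.push(next);
      next = this.pending[0];
    }
    return fired;
  }
}

const demoScheduler = new SimScheduler();
demoScheduler.schedule({ atMs: 95_000, type: "patient.codes" });
demoScheduler.schedule({ atMs: 30_000, type: "monitor.alarm" });
```

Since events are keyed to sim time, pausing or time-scaling the clock shifts when they fire in real time without changing their order or their logged timestamps.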
State sync over LiveKit data channels
The engine joins each room as a server-side participant and publishes state over the same WebRTC connection clients already hold — no parallel transport, one auth model, one connectivity story.
| State | Channel | Why |
|---|---|---|
| 3D transforms (positions, poses) | Lossy / unordered | High-frequency; newest wins, like UDP game netcode |
| Discrete events (code, med delivered, attach) | Reliable | Must not be dropped or reordered |
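The routing rule in the table can be sketched as follows. `SimDataPublisher` is an assumed minimal interface modeled on the general shape of LiveKit's `publishData(payload, { reliable, topic })`; check it against the SDK version in use. The message shapes, the topic names, and the sequence-number scheme for "newest wins" are hypothetical.

```typescript
// Assumed minimal publisher interface; not a verbatim copy of any SDK type.
interface SimDataPublisher {
  publishData(payload: Uint8Array, options: { reliable: boolean; topic?: string }): Promise<void>;
}

type SimMessage =
  | { kind: "transform"; entityId: string; position: [number, number, number]; seq: number }
  | { kind: "event"; name: string; simTimeMs: number };

// Transforms go out lossy; discrete events go out reliable.
async function publishSimMessage(pub: SimDataPublisher, msg: SimMessage): Promise<void> {
  const payload = new TextEncoder().encode(JSON.stringify(msg));
  const reliable = msg.kind === "event";
  // Topic names are hypothetical.
  await pub.publishData(payload, { reliable, topic: reliable ? "sim.events" : "sim.transforms" });
}

// Receiver side for lossy transforms: ignore anything older than what was already applied.
function isNewerTransform(lastSeq: number | undefined, incomingSeq: number): boolean {
  return lastSeq === undefined || incomingSeq > lastSeq;
}
```

JSON is used here only to keep the sketch short; a compact binary encoding would likely suit high-frequency transforms better.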
The same room hosts media, agents, and sim
One ECS graph — authored and ticked
The most important concept: authoring and running are the same operation on the same data. There is no separate editor format and runtime format — there is one entity/component graph. Designing a scenario, spawning a defib in VR, and the live simulation are all mutations of that graph.
└─ Entities (instances)      patients, staff, family, equipment, fixtures
   └─ Components (data)      transform · vitals · physiology · animation-state ·
                             interactable · attachment · authority · render-ref (prefab id)
+ Systems (logic)            physiology tick · steering · scheduler · rules · state-sync
+ Scenario definition        declarative initial graph + scheduled events + rules
Three properties make it customer-authorable with AI agents:
- Data-driven — Entities are instances of prefabs. 'Spawn a defib and set it on the cart' = instantiate the defib prefab, then write its transform (and an attachment to the cart). No code — just graph edits.
- Composable — An ICU with 2 patients, 3 nurses, a doctor, and a family member is just an initial set of entities. Adding a family member or ordering a medication mid-session is the same kind of edit.
- Scriptable, safely — Common logic is declarative rules / event graphs; anything needing real code runs in a sandbox with a capability-scoped API (ADR-0019), since customer- and AI-generated logic can't run unsandboxed in a multi-tenant healthcare service.
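To make the "just graph edits" claim concrete, here is a minimal sketch of the defib example from the list above. The component shapes, `SimWorld`, `spawn`, and `attach` are all hypothetical; the real schemas are not specified in this document.

```typescript
// Hypothetical component and world shapes, for illustration only.
type Vec3 = [number, number, number];

interface SimComponents {
  transform?: { position: Vec3 };
  attachment?: { parentId: string };
  renderRef?: { prefabId: string };
}

interface SimWorld {
  nextId: number;
  entities: Map<string, SimComponents>;
}

interface Prefab {
  id: string;
  defaults: SimComponents;
}

// Spawning is a graph edit: copy the prefab's default components into a new entity.
// The copy is shallow; a real implementation would deep-clone component data.
function spawn(world: SimWorld, prefab: Prefab, overrides: SimComponents = {}): string {
  const entityId = `${prefab.id}-${world.nextId++}`;
  world.entities.set(entityId, {
    ...prefab.defaults,
    renderRef: { prefabId: prefab.id },
    ...overrides,
  });
  return entityId;
}

// Attaching is another graph edit: write an attachment component on the child.
function attach(world: SimWorld, childId: string, parentId: string): void {
  const child = world.entities.get(childId);
  if (!child || !world.entities.has(parentId)) throw new Error("unknown entity");
  child.attachment = { parentId };
}

const icuWorld: SimWorld = { nextId: 1, entities: new Map() };
const cartId = spawn(icuWorld, { id: "crash-cart", defaults: {} }, { transform: { position: [2, 0, 1] } });
const defibId = spawn(icuWorld, { id: "defib", defaults: {} });
attach(icuWorld, defibId, cartId);
```

The same two functions would serve scenario authoring, a mid-session spawn in VR, and an AI agent's edit, which is the point of having one graph.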
How it snaps onto the content model
The authoring model reuses machinery already decided in the Content & Versioning Model:
- A scenario is structured content (the sim activity inside a lesson), so it inherits Wikipedia-style versioning, forking, and semantic diffs.
- Prefabs are versioned content packages with declared capability requirements, so a scenario only loads prefabs its runtime and clients can support — “spawn defib” degrades gracefully instead of breaking.
- Thin clients map a prefab id → local render asset and render whatever the components say; the authoritative graph stays server-side.
So a customer's AI agent authors a scenario by generating and editing the entity/component graph plus declarative rules — the exact artifacts the engine ticks at runtime.
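The capability gate on prefabs might look like the sketch below. `PrefabPackage`, `resolvePrefabs`, and the capability strings are hypothetical; the actual contract is defined by the Content & Versioning Model, and what "degrade" means (placeholder asset, skip, warn the author) is left to the caller.

```typescript
// Hypothetical capability model, for illustration only.
interface PrefabPackage {
  id: string;
  version: string;
  requires: string[]; // declared capability requirements
}

interface PrefabResolution {
  loadable: PrefabPackage[];
  degraded: { prefab: PrefabPackage; missing: string[] }[];
}

// Split a scenario's prefabs into those that can load and those that must degrade.
// `supported` is assumed to be what both the runtime and the connected clients support.
function resolvePrefabs(wanted: PrefabPackage[], supported: string[]): PrefabResolution {
  const result: PrefabResolution = { loadable: [], degraded: [] };
  for (const prefab of wanted) {
    const missing = prefab.requires.filter((cap) => supported.indexOf(cap) === -1);
    if (missing.length === 0) {
      result.loadable.push(prefab);
    } else {
      result.degraded.push({ prefab, missing }); // report why, so the client can fall back
    }
  }
  return result;
}

const demoResolution = resolvePrefabs(
  [
    { id: "defib", version: "1.2.0", requires: ["render.mesh", "interaction.grab"] },
    { id: "ultrasound", version: "0.3.0", requires: ["render.mesh", "render.video-texture"] },
  ],
  ["render.mesh", "interaction.grab"],
);
```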
Tech stack: TypeScript, with a native escape hatch
The engine is written in TypeScript/Node (ADR-0015) to:
- Unify the stack — web portal, R3F 3D client, Expo mobile, LiveKit Node Agents, and Pulumi infra are all TypeScript.
- Share types end to end — content model, capability contracts, and world/session schemas are one shared package consumed by server and clients.
- Maximize AI-codegen velocity: TypeScript has extensive training data and tooling behind it, and the engine is meant to be built and run by AI agents.
The escape hatch
Execution & orchestration
Like LiveKit's own nodes and agent workers, the engine runs as an autoscaled worker pool (ADR-0016): each worker hosts several sessions, sessions shard across workers, and the pool scales with demand — cheaper and faster than a container per session.
Control plane
On session start, an orchestrator ensures a LiveKit room, assigns a sim session to a worker, dispatches the needed AI agents, and mints scoped tokens.
Agents as participants
AI nurse / patient / evaluator are LiveKit Agents that join the room and subscribe to sim state and audio.
Sim workers
A pool of TypeScript workers ticking authoritative worlds, sharded by session and autoscaled.
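Sharding sessions across the worker pool could be sketched as below. Everything here is an assumption for illustration: the `SimWorkerSlot` shape, a fixed per-worker capacity, fewest-sessions placement, and the "pool-full" signal that a caller would forward to the autoscaler. ADR-0016 may specify something different.

```typescript
// Hypothetical worker-pool bookkeeping, for illustration only.
interface SimWorkerSlot {
  id: string;
  capacity: number;   // max concurrent sessions this worker will host
  sessions: string[]; // session ids currently ticking on this worker
}

type SessionAssignment =
  | { ok: true; workerId: string }
  | { ok: false; reason: "pool-full" }; // caller should request another worker

function assignSession(pool: SimWorkerSlot[], sessionId: string): SessionAssignment {
  // Idempotent: a session that is already placed stays on its worker.
  for (const w of pool) {
    if (w.sessions.indexOf(sessionId) !== -1) return { ok: true, workerId: w.id };
  }
  // Otherwise pick the worker with the fewest sessions that still has room.
  let best: SimWorkerSlot | undefined;
  for (const w of pool) {
    if (w.sessions.length >= w.capacity) continue;
    if (best === undefined || w.sessions.length < best.sessions.length) best = w;
  }
  if (best === undefined) return { ok: false, reason: "pool-full" };
  best.sessions.push(sessionId);
  return { ok: true, workerId: best.id };
}

const demoPool: SimWorkerSlot[] = [
  { id: "worker-1", capacity: 2, sessions: ["session-a"] },
  { id: "worker-2", capacity: 2, sessions: [] },
];
const demoAssignment = assignSession(demoPool, "session-b"); // lands on worker-2
```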
Open questions & next decisions
- Orchestration platform — Kubernetes (AKS/GKE) vs Pulumi + VM autoscaling for the node pool and worker pools.
- Session lifecycle — the exact choreography from a Sessions record to a running room + sim + agents, including open vs invite-only join.
- Tick & transform rates — concrete Hz targets for sim vs transform streams, and interpolation strategy on clients.
- Scripting surface — the shape of the sandbox API and rules format authors and AI agents target.