System Architecture

A simulation session is a LiveKit room. Participants connect over encrypted WebRTC; a self-hosted SFU fans media out to everyone in the room.

[Diagram: Session participants (clinician / facilitator, learner(s), and a future AI agent) resolve lk.<cloud> and turn.<cloud> through Cloudflare DNS (unproxied) and connect over wss / WebRTC to a cloud VM (Ubuntu 24.04, Docker). Behind a firewall that allows 443 and UDP media, the VM runs Caddy (caddyl4; TLS on :443, Let's Encrypt, SNI routing), livekit-server (SFU; WebRTC media, signaling on :7880, embedded TURN), and Redis (state). Pulumi (IaC) provisions the VM, network, DNS, and secrets; cloud-init bootstraps the Docker stack.]

The flow, end to end

Each participant's browser resolves the session hostname via Cloudflare DNS, then opens a secure WebSocket (wss://) to the server for signaling. Media (audio/video/data) is negotiated over WebRTC and flows through the LiveKit SFU, which selectively forwards each stream to the other participants.
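
The forwarding step can be illustrated with a toy model. This is a minimal sketch, not LiveKit's implementation: the participant names and track labels are made up, and it forwards every published track to every other participant, whereas a real SFU forwards only the tracks each receiver has subscribed to.

```python
from collections import defaultdict


def forward(room: dict[str, list[str]]) -> dict[str, list[tuple[str, str]]]:
    """Return, per receiver, the (sender, track) pairs a simple SFU would forward.

    `room` maps a participant name to the tracks that participant publishes.
    A sender never receives its own tracks back.
    """
    deliveries: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for sender, tracks in room.items():
        for receiver in room:
            if receiver == sender:
                continue
            for track in tracks:
                deliveries[receiver].append((sender, track))
    return dict(deliveries)


# Hypothetical session: one facilitator and two learners.
room = {
    "facilitator": ["audio", "video"],
    "learner-1": ["audio"],
    "learner-2": [],
}
deliveries = forward(room)
for receiver, items in deliveries.items():
    print(receiver, "receives", items)
```

Each publisher uploads a track once; the fan-out to the other participants happens on the server.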

Everything runs on a single cloud VM today, orchestrated by Docker Compose and bootstrapped automatically on first boot.

Components on the VM

  • Caddy (caddyl4): Terminates TLS on :443 with automatic Let's Encrypt certificates and routes by SNI to the right backend.
  • livekit-server: The WebRTC SFU and signaling server, with an embedded TURN server for clients behind restrictive firewalls.
  • Redis: Loopback-only store for LiveKit's internal state.
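
SNI routing amounts to a lookup from the hostname the client asks for to a backend address. The sketch below shows the idea in Python rather than in Caddy's own configuration. The hostnames are placeholders, the signaling port 7880 comes from the diagram, and the TURN port 5349 is an assumption, not a value confirmed by this document.

```python
from typing import Optional

# Hypothetical hostnames and upstream addresses, for illustration only.
ROUTES: dict[str, tuple[str, int]] = {
    "lk.example.com": ("127.0.0.1", 7880),    # LiveKit signaling (port from the diagram)
    "turn.example.com": ("127.0.0.1", 5349),  # embedded TURN (port is an assumption)
}


def route_by_sni(server_name: Optional[str]) -> Optional[tuple[str, int]]:
    """Pick a backend from the SNI hostname sent in the TLS ClientHello.

    Returns None when the client sent no SNI or asked for an unknown host,
    in which case a proxy would typically close the connection.
    """
    if not server_name:
        return None
    # Hostnames are case-insensitive and may carry a trailing dot.
    return ROUTES.get(server_name.rstrip(".").lower())


print(route_by_sni("lk.example.com"))
print(route_by_sni("unknown.example.com"))
```

Routing on SNI is what lets signaling and TURN share the single :443 listener.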

Why a self-hosted SFU

A Selective Forwarding Unit (SFU) scales multi-party sessions far better than a peer-to-peer mesh. In a full mesh of n participants, each one uploads a separate copy of its media to each of the other n - 1 peers; with an SFU, each participant sends their media once, and the server forwards it. Self-hosting keeps sensitive healthcare sessions on infrastructure we control, and keeps us cloud-agnostic.
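
The scaling argument can be made concrete with a little arithmetic. This sketch assumes every participant publishes exactly one stream and receives everyone else's; the function name is illustrative.

```python
def per_participant_streams(n: int, topology: str) -> tuple[int, int]:
    """Return (uploads, downloads) for one participant in an n-person session.

    Assumes each participant publishes one stream and receives all others.
    """
    if n < 2:
        raise ValueError("a session needs at least two participants")
    if topology == "mesh":
        # One separate upload to every other peer.
        return (n - 1, n - 1)
    if topology == "sfu":
        # One upload to the server, which fans it out.
        return (1, n - 1)
    raise ValueError(f"unknown topology: {topology!r}")


for n in (2, 4, 8):
    print(n, "mesh:", per_participant_streams(n, "mesh"),
          "sfu:", per_participant_streams(n, "sfu"))
```

Downloads are the same in both topologies; the saving is on the upload side, which is usually the scarcer direction on consumer connections.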

Why DNS is unproxied

Records point straight at the VM (no Cloudflare proxy), because the CDN proxy would break the UDP and TURN paths that WebRTC relies on. Cloudflare is used purely as a neutral DNS authority.
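
One way to confirm that a record is unproxied is to check that the hostname resolves to the VM's own address and nothing else. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the addresses come from the reserved documentation ranges rather than from a real deployment.

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Resolve a hostname to its IPv4 addresses (performs a live DNS lookup)."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def is_unproxied(resolved: list[str], vm_ip: str) -> bool:
    """True if the record resolves to the VM's address and nothing else."""
    expected = ipaddress.ip_address(vm_ip)
    return bool(resolved) and all(
        ipaddress.ip_address(addr) == expected for addr in resolved
    )


# Placeholder addresses from the documentation ranges, not real infrastructure.
VM_IP = "203.0.113.10"
print(is_unproxied(["203.0.113.10"], VM_IP))  # record points straight at the VM
print(is_unproxied(["198.51.100.7"], VM_IP))  # record points somewhere else
```

In practice the first argument would come from `resolve_ipv4("lk.<cloud>")`, with the real hostname substituted in.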