Platform Design
Content & Versioning Model
A lesson is the core, portable unit of the platform. This page captures how lessons are authored, versioned, distributed to tenants, and kept compatible across web, VR, and game clients — and across evolving AI services.
Status: living draft (in progress)
The core reframe: independently versioned planes
Today everything ships as one blob, so any change is a forced change. The “forced upgrade” pain is a coupling problem, not an isolation problem. The fix is to split “the product” into independently versioned axes, each with its own cadence and its own pinning:
- Platform / runtime version — The APIs, sim engine, LiveKit, and the web / VR / game clients.
- Content schema version — The format and contract a lesson is authored against.
- Content package version — The actual lesson bytes — both our upstream library and the tenant's customized fork.
- Tenant pin / channel — What each customer has chosen to take (pinned version or release channel).
Once these are separate, “the customer takes an upgrade” becomes an explicit, per-axis version bump — never a fleet-wide forced rollout. The reference experience is a self-updating desktop app: it ships often, you choose when to take it, and it doesn't break your existing work because the new runtime stays backward-compatible within a major version.
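As a rough illustration of a per-axis bump, the sketch below models one tenant's pins as a small immutable record. The field names, the `bump` helper, and the version strings are assumptions made for this example, not an existing schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TenantPins:
    """What one tenant has chosen to take, one field per versioned axis."""
    platform: str            # platform / runtime version
    content_schema: str      # schema version lessons are authored against
    content_package: str     # lesson package version (upstream or fork)
    channel: str = "stable"  # release channel: stable / beta / pinned


VERSIONED_AXES = ("platform", "content_schema", "content_package")


def bump(pins: TenantPins, axis: str, version: str) -> TenantPins:
    """Return new pins with exactly one axis changed; every other axis stays put."""
    if axis not in VERSIONED_AXES:
        raise ValueError(f"unknown axis: {axis}")
    return replace(pins, **{axis: version})


# Hypothetical tenant taking a content update without touching the runtime.
before = TenantPins(platform="5.2.0", content_schema="3.1.0", content_package="2.4.1")
after = bump(before, "content_package", "2.5.0")
```

The point of the frozen record is that an upgrade is always a new, explicit value for one axis, which is easy to audit and to roll back.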
Two planes, joined at a seam
Every change is one of two kinds, and they have very different rules.
Editorial / content plane
Text, objectives, assessment checklists, lab images, vital sets, patient prompts. Changes are reviewable and mergeable, and never break the runtime. Largely immutable once published; low / no PHI.
Capability / runtime plane
New features, bug fixes, breaking changes between content and clients / services. Governed by capability negotiation; can be breaking, so changes here are version-gated.
The seam
A lesson revision declares the capabilities it depends on. An edit that starts using a new capability is both a content revision and a bump in the lesson's capability requirement.
That declaration is what lets the runtime answer “yes / no, I can run this” instead of breaking silently in front of a learner.
Distribution: the package-manager model
Treat the default library exactly like an upstream open-source package, and the tenant's customization like a fork with a lockfile.
- Publish — We publish versioned content packages to a registry with semantic versions, e.g. scenario-sepsis@2.4.1.
- Fork — Tenants fork to customize; they now own a downstream branch of that scenario.
- Sync from upstream — When we ship 2.5.0, the tenant gets a 'sync from upstream' prompt — they choose to merge, with conflict resolution against their customizations.
- Pin & channel — Tenants pin versions and subscribe to a release channel (stable / beta / pinned). Our release cadence is decoupled from their adoption.
This single pattern solves both stated pains at once: customer-controlled upgrades and content versioning that preserves customization.
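The pin-and-channel step could be sketched roughly as follows. The in-memory `REGISTRY`, the 2.6.0 beta entry, and the rule that pinned tenants are never prompted are assumptions for illustration only; a real registry would also need full semantic-version parsing (pre-release tags and so on).

```python
from typing import Optional

# Hypothetical registry contents: (version, channel the release went out on).
REGISTRY = {
    "scenario-sepsis": [("2.4.1", "stable"), ("2.5.0", "stable"), ("2.6.0", "beta")],
}


def version_key(version: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(package: str, channel: str, pinned: Optional[str] = None) -> str:
    """Pick the version a tenant receives for its channel or explicit pin."""
    releases = REGISTRY[package]
    if channel == "pinned":
        if pinned not in {version for version, _ in releases}:
            raise ValueError(f"{package}@{pinned} is not in the registry")
        return pinned
    # Beta subscribers also see stable releases; stable subscribers see only stable.
    allowed = {"stable"} if channel == "stable" else {"stable", "beta"}
    return max((v for v, c in releases if c in allowed), key=version_key)


def update_available(package: str, channel: str, current: str) -> bool:
    """True when the tenant's channel offers something newer than what it runs."""
    if channel == "pinned":
        return False  # assumption: pinned tenants are never prompted
    return version_key(resolve(package, channel)) > version_key(current)
```

Under these assumptions, a tenant on the stable channel running 2.4.1 would see the "sync from upstream" prompt once 2.5.0 is published, while the beta-only 2.6.0 stays invisible to it.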
The compatibility contract
Borrowed from package.json engines / capability negotiation — this is what stops a lesson from breaking when services change:
- Each content package declares what it requires:
  requires: { schema: "^3", capabilities: ["sim.vitals.v2", "branching.v1"] }
- Each runtime / client advertises the capabilities it provides.
- A lesson only launches if the contract is satisfiable; otherwise the tenant sees a clear “this lesson needs runtime ≥ X” message instead of a silent failure.
- Breaking change = major version bump. Tenants on the old major keep working; upgrading is an explicit, migration-gated action.
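A minimal sketch of the launch check described above, assuming the `requires` shape shown in the list. The class names, the runtime's fields, and the simplified caret rule (major version must match, nothing more) are illustrative assumptions rather than a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class Requires:
    schema: str                                  # caret range such as "^3"
    capabilities: list = field(default_factory=list)


@dataclass
class Runtime:
    schema_version: str                          # e.g. "3.2.0"
    capabilities: set = field(default_factory=set)


def satisfies_caret(version: str, caret: str) -> bool:
    """Simplified caret check: '^3' accepts any 3.x.y. Real semver ranges have more cases."""
    required_major = int(caret.lstrip("^").split(".")[0])
    return int(version.split(".")[0]) == required_major


def check(requires: Requires, runtime: Runtime) -> tuple:
    """Return (ok, problems) so the caller can show a clear message instead of failing silently."""
    problems = []
    if not satisfies_caret(runtime.schema_version, requires.schema):
        problems.append(
            f"schema {requires.schema} required, runtime provides {runtime.schema_version}"
        )
    for capability in requires.capabilities:
        if capability not in runtime.capabilities:
            problems.append(f"missing capability {capability}")
    return (not problems, problems)


lesson = Requires(schema="^3", capabilities=["sim.vitals.v2", "branching.v1"])
new_client = Runtime("3.2.0", {"sim.vitals.v1", "sim.vitals.v2", "branching.v1"})
old_client = Runtime("3.0.0", {"sim.vitals.v1", "branching.v1"})
```

With these sample values, `check(lesson, new_client)` reports the lesson as launchable, and `check(lesson, old_client)` returns a single problem naming the missing `sim.vitals.v2` capability, which is the raw material for the tenant-facing message.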
Why this matters doubly for VR / game clients
Authoring UX: Wikipedia over git
The versioning must feel familiar to clinician-authors, not developers. The key is to separate storage semantics from presentation.
Underneath: git-like
Immutable revisions, content-addressing, branch / merge, full lineage. Never exposed to the user.
On top: Wikipedia-like
Page history, “who changed what and when,” side-by-side diffs, one-click revert, “an update is available.” Zero version-control literacy required.
Structured, not blobs
Because a lesson is structured fields, diffs are semantic (“Vitals: HR 92 → 110”), giving granularity and friendliness at once.
The upstream merge becomes a field-level three-way merge, presented gently: “We updated 3 sections of the library scenario. You've customized 1 of them — review that one; we'll auto-apply the other 2.”
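The field-level three-way merge can be sketched as below, assuming a lesson flattens to a dict of field name to value. The field names and values are made up for the example, and a missing field is treated the same as `None`, which a real implementation would likely need to distinguish.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict) -> tuple:
    """Merge field by field.

    base   = upstream revision the tenant forked from
    ours   = tenant's customized fork
    theirs = new upstream revision
    Returns (merged, auto_applied, needs_review).
    """
    merged, auto_applied, needs_review = {}, [], []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree, or neither changed
            value = o
        elif o == b:      # only upstream changed: apply automatically
            value = t
            auto_applied.append(key)
        elif t == b:      # only the tenant changed: keep the customization
            value = o
        else:             # both changed differently: keep ours, flag for review
            value = o
            needs_review.append(key)
        if value is not None:
            merged[key] = value
    return merged, auto_applied, needs_review


base = {"objectives": "v1", "vitals.hr": 92, "patient_prompt": "v1", "debrief": "v1"}
ours = {**base, "patient_prompt": "tenant wording"}
theirs = {**base, "objectives": "v2", "vitals.hr": 110, "patient_prompt": "v2"}

merged, auto_applied, needs_review = three_way_merge(base, ours, theirs)
summary = (
    f"We updated {len(auto_applied) + len(needs_review)} sections of the library scenario. "
    f"You've customized {len(needs_review)} of them."
)
```

Here upstream touched three fields and the tenant had customized one of them, so two changes auto-apply and one is held for review, matching the message quoted above.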
One store, three consumers
Standards posture: clean core, standards at the edges
The internal content model stays clean; interoperability standards live at the boundaries as adapters.
| Standard | Role | Where it lives |
|---|---|---|
| xAPI | Universal tracking spine (actor–verb–object → LRS) across every modality | Core |
| cmi5 | Launch & packaging contract; Assignable Units launchable in any client (web, VR, game) | Core |
| SCORM | Export / import interop with customers' existing LMSs | Edge adapter |
| LTI 1.3 / Advantage | Connect into a customer LMS: deep linking, roster, grade passback | Edge adapter |
SCORM's tracking model is dated and tightly coupled to a browser + LMS runtime, so it must not shape the core — cmi5 & xAPI fit the multi-client world far better.
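For orientation, this is roughly what one actor–verb–object statement looks like when built by any client. The helper name, the activity IRI, and the learner details are placeholders; the verb IRIs are the standard ADL ones, and a cmi5-conformant statement would carry additional required context (registration, session, category) not shown here.

```python
import json


def make_statement(name: str, email: str, verb: str, activity_id: str, client: str) -> dict:
    """Build a minimal xAPI statement; the same shape works for web, VR, or game clients."""
    return {
        "actor": {"objectType": "Agent", "name": name, "mbox": f"mailto:{email}"},
        "verb": {
            "id": f"http://adlnet.gov/expapi/verbs/{verb}",
            "display": {"en-US": verb},
        },
        "object": {"objectType": "Activity", "id": activity_id},
        # context.platform records which client the learner used.
        "context": {"platform": client},
    }


statement = make_statement(
    name="Example Learner",
    email="learner@example.com",
    verb="completed",
    activity_id="https://example.com/activities/scenario-sepsis",  # placeholder IRI
    client="vr",
)
payload = json.dumps(statement)  # body that would be POSTed to the LRS statements endpoint
```

Because the statement carries no LMS or browser assumptions, the tracking spine stays identical regardless of which client emitted it.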
Composition hierarchy
A multi-modal “lab” — videos → reading → quiz → sim scenario → debrief → quiz — is just a sequenced lesson of mixed activities. It maps onto cmi5's hierarchy almost 1:1:
└─ Lesson / Module (cmi5 "Block")
   ├─ Activity: video (cmi5 "AU")
   ├─ Activity: reading
   ├─ Activity: quiz
   ├─ Activity: sim scenario
   └─ Activity: debrief
Each Activity is independently versioned, capability-declaring, and launchable on the appropriate client. Sequencing / gating rules (must-pass-quiz-before-sim) live at the lesson level. No new primitive is needed for a “lab” — a good sign the model is right.
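One way to picture this hierarchy in code, with gating kept at the lesson level, is sketched below. The class names, field names, and the `gates` mapping are assumptions for illustration, not the platform's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One launchable unit (roughly a cmi5 AU)."""
    id: str
    kind: str                                   # video / reading / quiz / sim / debrief
    version: str                                # each activity is versioned on its own
    capabilities: list = field(default_factory=list)


@dataclass
class Lesson:
    """A sequenced set of activities (roughly a cmi5 Block)."""
    id: str
    activities: list
    # Gating lives on the lesson: activity id -> activities that must be passed first.
    gates: dict = field(default_factory=dict)

    def can_launch(self, activity_id: str, passed: set) -> bool:
        return all(prereq in passed for prereq in self.gates.get(activity_id, []))


lab = Lesson(
    id="sepsis-lab",
    activities=[
        Activity("video", "video", "1.0.0"),
        Activity("reading", "reading", "1.2.0"),
        Activity("quiz-1", "quiz", "2.0.0"),
        Activity("sim", "sim", "2.4.1", capabilities=["sim.vitals.v2", "branching.v1"]),
        Activity("debrief", "debrief", "1.0.0"),
    ],
    gates={"sim": ["quiz-1"]},                  # must pass the quiz before the sim
)
```

In this sketch, `lab.can_launch("sim", passed=set())` is false until `quiz-1` appears in the learner's passed set, while ungated activities such as the video are always launchable.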
AI provider abstraction (per-lesson, per-agent)
AI services change constantly, so a lesson must depend on capabilities, not model strings. The config tree is Lesson → Scenario → Agent (patient) → { llm, tts, stt }, each independently selectable.
The canonical case: ElevenLabs v2 → v3
ElevenLabs v3 adds inline emotion tags such as [whispers] but removes the old character-escaping behavior: an additive capability and a breaking input-contract change in one release. A lesson pinned to v2 keeps working forever; opting into emotion tags means opting into a capability (tts.emotion-tags.v1), which implies a v3-class provider and a new escaping contract. The rule flip lives in a versioned provider adapter, not scattered through content.

Selection should be constraint-driven to avoid runtime footguns (e.g. Deepgram strong at STT for one language, ElevenLabs better at TTS for another):
- Author picks language + desired features for the agent.
- System shows only compatible providers / models (capability profiles filtered by that constraint).
- Author pins one or accepts the default.
The LLM side can ride a provider-neutral SDK (e.g. Vercel AI SDK); TTS / STT use a thin equivalent with the same capability-profile shape (languages, streaming, latency class, features).
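The constraint-driven selection steps above might look something like this. The profile fields follow the capability-profile shape named in the text (languages, streaming, latency class, features); the provider names, models, and their capabilities are invented placeholders and say nothing about any real vendor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    provider: str
    model: str
    kind: str                 # "llm", "tts", or "stt"
    languages: frozenset
    features: frozenset
    streaming: bool
    latency_class: str        # e.g. "realtime" or "batch"


# Placeholder catalog; a real one would be maintained per provider adapter.
PROFILES = [
    Profile("provider-a", "tts-2", "tts", frozenset({"en", "fr"}), frozenset(), True, "realtime"),
    Profile("provider-a", "tts-3", "tts", frozenset({"en"}), frozenset({"emotion-tags"}), True, "realtime"),
    Profile("provider-b", "voice-1", "tts", frozenset({"en", "fr"}), frozenset({"emotion-tags"}), False, "batch"),
]


def compatible(profiles: list, kind: str, language: str, features: set) -> list:
    """Keep only the profiles that cover the agent's language and every requested feature."""
    return [
        p for p in profiles
        if p.kind == kind and language in p.languages and features <= p.features
    ]


def choose(profiles: list, kind: str, language: str, features: set,
           pinned_model: Optional[str] = None) -> Profile:
    """Return the author's pinned model if it is compatible, else the first compatible default."""
    options = compatible(profiles, kind, language, features)
    if not options:
        raise LookupError(f"no {kind} provider supports {language} with {sorted(features)}")
    if pinned_model is None:
        return options[0]
    for profile in options:
        if profile.model == pinned_model:
            return profile
    raise LookupError(f"{pinned_model} is not compatible with this agent's constraints")
```

Filtering before the author picks means an incompatible pin fails at authoring time rather than in front of a learner.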
BYOK: declare, supply, resolve
The rule that keeps lessons portable and credentials secure:
Lessons declare
A lesson references providers / models abstractly (provider: elevenlabs, capability: tts.emotion-tags.v1). It never contains keys, endpoints, or tenant specifics, so it stays shareable, exportable, and groundable.
Tenants supply
BYOK credentials (API tokens, custom endpoint URLs) live in a tenant-scoped, encrypted, strongly-isolated secret store — in the runtime / PHI plane, never in content packages.
Runtime resolves
At launch, the runtime binds abstract provider refs to concrete tenant credentials. Default to vendor-managed keys, with BYOK as an override; the two modes differ in billing, quota, and the HIPAA BAA chain.
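The declare / supply / resolve split could be sketched as follows. The plain dicts stand in for the encrypted, tenant-scoped secret store, and every name, key, and URL here is a placeholder invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderRef:
    """What a lesson declares: abstract, with no keys or endpoints."""
    provider: str
    capability: str


@dataclass(frozen=True)
class Credential:
    api_key: str
    endpoint: Optional[str]
    source: str               # "byok" or "vendor"; drives billing and quota attribution


# Stand-ins for the real secret stores (assumption: in-memory for illustration only).
VENDOR_KEYS = {"elevenlabs": Credential("vendor-managed-key", None, "vendor")}
TENANT_SECRETS = {
    ("tenant-a", "elevenlabs"): Credential("tenant-a-key", "https://tts.tenant-a.example", "byok"),
}


def resolve_credential(tenant_id: str, ref: ProviderRef) -> Credential:
    """Bind an abstract provider ref to a concrete credential at launch time."""
    byok = TENANT_SECRETS.get((tenant_id, ref.provider))
    if byok is not None:
        return byok                       # tenant override wins
    if ref.provider in VENDOR_KEYS:
        return VENDOR_KEYS[ref.provider]  # default: vendor-managed key
    raise LookupError(f"no credential available for provider {ref.provider}")


ref = ProviderRef(provider="elevenlabs", capability="tts.emotion-tags.v1")
```

The same `ref` resolves to a BYOK credential for `tenant-a` and to the vendor-managed key for any tenant that has not supplied its own, so the lesson itself never changes.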
Localization is a first-class axis
Localization runs through three distinct layers — don't collapse them:
- UI chrome — Portal labels; classic i18n (ICU message catalogs), per-user locale.
- Content — Per-lesson locale variants within the structured model; versioning works per-locale, so 'the English changed' can flag 'the French is now stale.'
- AI capability + prompts — Language constrains valid LLM / TTS / STT providers (the support gap problem) and prompts often need per-language tuning, not literal translation.
Language is therefore a first-class axis woven through content variants, provider selection, and prompt authoring — retrofitting locale-aware versioning later is brutal, so it's designed in from the start.
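Per-locale versioning with staleness detection might be modeled like this. It assumes each translated variant records which revision of the source-locale text it was translated from; the class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class LocaleVariant:
    text: str
    source_revision: int      # revision of the source text this translation was made from


@dataclass
class LocalizedField:
    source_locale: str
    source_text: str
    source_revision: int
    variants: dict            # locale code -> LocaleVariant

    def edit_source(self, new_text: str) -> None:
        """Editing the source bumps its revision; translations are left untouched."""
        self.source_text = new_text
        self.source_revision += 1

    def stale_locales(self) -> list:
        """Locales whose translation predates the current source revision."""
        return sorted(
            locale for locale, variant in self.variants.items()
            if variant.source_revision < self.source_revision
        )


prompt = LocalizedField(
    source_locale="en",
    source_text="You are a 54-year-old patient with a fever.",
    source_revision=1,
    variants={"fr": LocaleVariant("Vous êtes un patient de 54 ans avec de la fièvre.", 1)},
)
stale_before = prompt.stale_locales()
prompt.edit_source("You are a 54-year-old patient with a fever and chills.")
stale_after = prompt.stale_locales()
```

Before the English edit nothing is stale; afterwards the French variant is flagged, which is the "the English changed, the French is now stale" signal described above.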
The isolation payoff
This reframing answers the tenancy question by splitting data into two planes with different isolation needs:
| Plane | Contents | Isolation strategy |
|---|---|---|
| Content / catalog | Vendor upstream + tenant forks; versioned, largely immutable, low / no PHI | Pooled, CDN-distributable |
| Runtime / results | Learner identity, attempts, xAPI statements, sim recordings, BYOK secrets — PHI lives here | Strong (silo / bridge) isolation |
So we likely don't need one isolation model — we need a different one per plane.
Open questions
- Forking semantics — when a tenant customizes, is it a copy, an override layer, or a true branch? This drives how painful upstream-merge becomes.
- Standalone-ness — must a lesson be a fully offline-runnable bundle (important for VR / game clients), or may it call home at runtime?
- Authoring scope — do tenants author net-new content, only customize ours, or is there a co-author / marketplace ambition?
- Merge UX — exactly how “take the library update” looks to a non-technical author.
- Launch-resolution flow — capability check → provider / key binding → xAPI emission, end to end.