Event Protocol

Status

Draft · v0.2.0 · 2026-06-30

1. Design Principles

#	Principle	Rationale
E1	Every observable state change emits an event.	Auditors, marking, and replay all depend on a complete log.
E2	Events are immutable facts.	Once emitted, an event’s payload MUST NOT be mutated. Corrections are new events.
E3	Events are transport-agnostic.	The same envelope works over LiveKit data-channel, WebSocket, HTTP POST, or Kafka.
E4	Persistence, UI consumption, and marking consumption are separate concerns.	An event MAY be persisted but not shown to UI; another MAY be UI-only. The schema encodes which.
E5	Events carry enough context to be self-describing.	A consumer that missed earlier events MUST still be able to interpret any event in isolation.
E6	Commands are the inverse of events.	Commands are requests (may be rejected); events are facts (already happened).
E7	Idempotency is a first-class requirement.	Re-delivering the same event MUST NOT cause duplicate side-effects.
E8	Ordering is guaranteed within a session.	Events within a single exam session carry monotonically increasing sequence numbers.

2. Event Envelope Schema

Every event emitted by any component (bot, runtime controller, frontend, or system) conforms to this envelope.

/**
 * Universal event envelope.
 * T is a discriminated union of all concrete event types (see §4).
 */
interface EventEnvelope<T extends ExamEvent = ExamEvent> {
  /** Globally unique event ID (UUIDv7 for time-ordered uniqueness). */
  eventId: string;

  /** Exam session this event belongs to. */
  sessionId: string;

  /** Monotonically increasing within a session. Starts at 1. */
  seq: number;

  /** ISO-8601 UTC timestamp of when the event was *generated* (not received). */
  timestamp: string;

  /** The component that produced this event. */
  source: EventSource;

  /** Discriminator for the concrete event payload. */
  type: T["type"];

  /** Concrete event payload. */
  payload: T;

  /** Optional correlation ID for linking related events (e.g., a recovery sequence). */
  correlationId?: string;

  /** Schema version for forward-compatible parsing. */
  schemaVersion: "1";
}

type EventSource = "bot" | "runtime_controller" | "frontend" | "system";

/**
 * Discriminated union of all event types.
 * Each member has a unique `type` string and a specific payload shape.
 */
type ExamEvent =
  | BotReadyEvent
  | NodeEnteredEvent
  | NodeExitedEvent
  | TranscriptDeltaEvent
  | TranscriptFinalEvent
  | ExaminerUtteranceStartedEvent
  | ExaminerUtteranceFinalEvent
  | CandidateCommandReceivedEvent
  | EvidenceSignalEvent
  | FollowUpUsedEvent
  | TransitionDecisionEvent
  | GuardrailTriggeredEvent
  | RecoveryStartedEvent
  | RecoveryResolvedEvent
  | ExamCompletedEvent
  | HesitationDetectedEvent
  | SelfCorrectionDetectedEvent;
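Because the envelope carries the discriminator twice (once as `type`, once as `payload.type`), a consumer may want to confirm that the two agree before dispatching. The following is a minimal sketch of such a check, not part of the protocol itself. `checkEnvelope` and `isRecord` are hypothetical helper names, and treating `seq` as a finite integer is an assumption drawn from "Starts at 1".

```typescript
// Hypothetical structural check for an incoming EventEnvelope.
const KNOWN_SOURCES: readonly string[] = ["bot", "runtime_controller", "frontend", "system"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

/** Returns a list of problems; an empty list means the envelope looks well-formed. */
function checkEnvelope(raw: unknown): string[] {
  if (!isRecord(raw)) return ["envelope is not an object"];
  const problems: string[] = [];

  for (const key of ["eventId", "sessionId", "timestamp", "type"]) {
    if (typeof raw[key] !== "string") problems.push(key + " must be a string");
  }

  // Assumption: seq is a finite integer, since the envelope says it starts at 1.
  const seq = raw["seq"];
  if (typeof seq !== "number" || !isFinite(seq) || seq < 1 || Math.floor(seq) !== seq) {
    problems.push("seq must be an integer >= 1");
  }

  const source = raw["source"];
  if (typeof source !== "string" || KNOWN_SOURCES.indexOf(source) === -1) {
    problems.push("source is not a known EventSource");
  }

  if (raw["schemaVersion"] !== "1") problems.push("unsupported schemaVersion");

  const payload = raw["payload"];
  if (!isRecord(payload)) {
    problems.push("payload must be an object");
  } else if (payload["type"] !== raw["type"]) {
    // The envelope and payload discriminators must name the same event type.
    problems.push("envelope type and payload type differ");
  }
  return problems;
}
```

Returning a list rather than throwing lets a caller log every problem in one pass; a stricter consumer could reject on the first entry instead.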

Persistence / UI / Marking Flags

Each concrete event type declares its consumption scope:

Event	Used by UI	Persisted	Used by marking	Notes
`bot_ready`	✅	✅	❌	Signals to UI that bot is listening. Persisted for session start audit.
`node_entered`	✅	✅	✅	Core navigational fact. Marking uses it to scope evidence to rubric items.
`node_exited`	✅	✅	✅	Marks end of evidence window for a node.
`transcript_delta`	✅	❌	❌	Streaming STT partial. UI-only for live captions. Not persisted to reduce noise.
`transcript_final`	✅	✅	✅	Canonical candidate speech record. Primary marking input.
`examiner_utterance_started`	✅	✅	❌	UI shows “examiner is speaking” indicator. Persisted for timing analysis.
`examiner_utterance_final`	✅	✅	✅	What the examiner actually said. Marking checks for hint leaks, question fidelity.
`candidate_command_received`	✅	✅	❌	Logged for audit. Commands themselves don’t feed marking.
`evidence_signal`	✅	✅	✅	Core marking input. Proposed by LLM, approved by runtime.
`follow_up_used`	✅	✅	✅	Counts against budget. Marking sees follow-up count per node.
`transition_decision`	✅	✅	✅	Which path was taken and why. Marking uses for rubric routing.
`guardrail_triggered`	✅	✅	✅	Indicates policy violation. Marking may penalise or flag for review.
`recovery_started`	✅	✅	❌	UI shows recovery state. Persisted for ops monitoring.
`recovery_resolved`	✅	✅	❌	Completes recovery pair. Persisted for timing analysis.
`exam_completed`	✅	✅	✅	Terminal event. Triggers marking pipeline.
`hesitation_detected`	✅	✅	✅	Assessment-significant pause pattern. Marking uses for reasoning-process evidence.
`self_correction_detected`	✅	✅	✅	Candidate metacognitive signal. Marking uses for process-quality evidence.
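The table above can also be carried as data so that routing code does not hard-code per-event decisions. This is one possible encoding, offered as a sketch; the names `ConsumptionScope`, `CONSUMPTION_SCOPE`, `scope`, and `typesWith` are illustrative and do not appear in the protocol.

```typescript
interface ConsumptionScope {
  ui: boolean;
  persisted: boolean;
  marking: boolean;
}

type EventType =
  | "bot_ready" | "node_entered" | "node_exited"
  | "transcript_delta" | "transcript_final"
  | "examiner_utterance_started" | "examiner_utterance_final"
  | "candidate_command_received" | "evidence_signal" | "follow_up_used"
  | "transition_decision" | "guardrail_triggered"
  | "recovery_started" | "recovery_resolved" | "exam_completed"
  | "hesitation_detected" | "self_correction_detected";

const scope = (ui: boolean, persisted: boolean, marking: boolean): ConsumptionScope =>
  ({ ui, persisted, marking });

// One entry per table row, in the same order: (ui, persisted, marking).
const CONSUMPTION_SCOPE: Record<EventType, ConsumptionScope> = {
  bot_ready: scope(true, true, false),
  node_entered: scope(true, true, true),
  node_exited: scope(true, true, true),
  transcript_delta: scope(true, false, false),
  transcript_final: scope(true, true, true),
  examiner_utterance_started: scope(true, true, false),
  examiner_utterance_final: scope(true, true, true),
  candidate_command_received: scope(true, true, false),
  evidence_signal: scope(true, true, true),
  follow_up_used: scope(true, true, true),
  transition_decision: scope(true, true, true),
  guardrail_triggered: scope(true, true, true),
  recovery_started: scope(true, true, false),
  recovery_resolved: scope(true, true, false),
  exam_completed: scope(true, true, true),
  hesitation_detected: scope(true, true, true),
  self_correction_detected: scope(true, true, true)
};

/** Event types whose scope has the given flag set. */
function typesWith(flag: keyof ConsumptionScope): EventType[] {
  return (Object.keys(CONSUMPTION_SCOPE) as EventType[]).filter(
    (type) => CONSUMPTION_SCOPE[type][flag]
  );
}
```

Typing the map as `Record<EventType, ConsumptionScope>` means the compiler reports an error if a new event type is added to the union without a scope entry.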

3. Command Envelope Schema

Commands flow in the opposite direction: frontend → runtime controller → bot. They are requests that the runtime MAY accept, reject, or transform.

/**
 * Universal command envelope.
 * C is a discriminated union of all concrete command types (see §5).
 */
interface CommandEnvelope<C extends ExamCommand = ExamCommand> {
  /** Globally unique command ID (UUIDv7). */
  commandId: string;

  /** Exam session this command targets. */
  sessionId: string;

  /** ISO-8601 UTC timestamp of command creation. */
  timestamp: string;

  /** The component issuing the command. */
  source: CommandSource;

  /** Discriminator for the concrete command payload. */
  type: C["type"];

  /** Concrete command payload. */
  payload: C;

  /** Schema version. */
  schemaVersion: "1";
}

type CommandSource = "candidate" | "proctor" | "system" | "frontend";

type ExamCommand =
  | RepeatQuestionCommand
  | RequestClarificationCommand
  | RequestRephraseCommand
  | PauseCommand
  | ResumeCommand
  | ThinkingAloudCommand
  | RaiseHandCommand
  | ChallengePremiseCommand
  | ReviseEarlierAnswerCommand
  | ReportAudioIssueCommand
  | EndExamRequestedCommand
  | EmergencyStopCommand
  | SignalConfidenceCommand;

Command → Event Acknowledgement

Every accepted command MUST produce at least one event (typically candidate_command_received) so the audit trail is complete. Rejected commands produce a guardrail_triggered event whose description states the reason for the rejection.
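The acknowledgement rule can be sketched as a function from a command and a decision to an event payload. This is an illustration only: `acknowledge`, `CommandLike`, and `Decision` are hypothetical names, and the choice of `blocked_action`, `block`, `event_only`, and the `guard-cmd-` ID scheme for rejections are assumptions, since the protocol does not say which guardrail values a rejected command should use.

```typescript
interface CommandLike {
  commandId: string;
  type: string;
}

type Decision = { accepted: true } | { accepted: false; reason: string };

interface CommandReceivedPayload {
  type: "candidate_command_received";
  commandId: string;
  commandType: string;
  accepted: boolean;
}

interface GuardrailPayload {
  type: "guardrail_triggered";
  guardrailId: string;
  guardrailType: "blocked_action";
  severity: "block";
  description: string;
  actionTaken: "event_only";
}

/** Builds the payload of the event that acknowledges a command. */
function acknowledge(
  command: CommandLike,
  decision: Decision
): CommandReceivedPayload | GuardrailPayload {
  if (decision.accepted === false) {
    return {
      type: "guardrail_triggered",
      guardrailId: "guard-cmd-" + command.commandId, // hypothetical ID scheme
      guardrailType: "blocked_action",               // assumed mapping for rejections
      severity: "block",
      description: "Command " + command.type + " rejected: " + decision.reason,
      actionTaken: "event_only"
    };
  }
  return {
    type: "candidate_command_received",
    commandId: command.commandId,
    commandType: command.type, // mirrors CommandEnvelope.type
    accepted: true
  };
}
```

The returned payload would still need to be wrapped in an EventEnvelope by the runtime controller, which assigns `seq`.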

4. Event Definitions and JSON Examples

4.1 `bot_ready`

Emitted when the bot has loaded the exam specification, connected to the audio channel, and is ready to begin.

interface BotReadyEvent {
  type: "bot_ready";
  examId: string;
  examVersion: string;
  nodeCount: number;
  estimatedDurationSec: number;
}

{
  "eventId": "01924a6f-3c82-7a01-b5e0-44f1c2d3e4f5",
  "sessionId": "sess-2026-05-06-001",
  "seq": 1,
  "timestamp": "2026-05-06T02:00:01.123Z",
  "source": "bot",
  "type": "bot_ready",
  "schemaVersion": "1",
  "payload": {
    "type": "bot_ready",
    "examId": "exam-midterm-orals-cs201",
    "examVersion": "3.2.0",
    "nodeCount": 8,
    "estimatedDurationSec": 900
  }
}

4.2 `node_entered`

Emitted when the runtime controller transitions the session into a new node.

interface NodeEnteredEvent {
  type: "node_entered";
  nodeId: string;
  nodeKind: "question" | "scenario" | "task" | "discussion" | "warmup" | "wrapup" | "branch" | "identity_check";
  rubricItemIds: string[];
  maxFollowUps: number;
  timeBudgetSec: number;
}

{
  "eventId": "01924a6f-3c82-7a02-b5e1-44f1c2d3e4f6",
  "sessionId": "sess-2026-05-06-001",
  "seq": 3,
  "timestamp": "2026-05-06T02:00:15.456Z",
  "source": "runtime_controller",
  "type": "node_entered",
  "correlationId": "trans-001",
  "schemaVersion": "1",
  "payload": {
    "type": "node_entered",
    "nodeId": "q-explain-dijkstra",
    "nodeKind": "question",
    "rubricItemIds": ["rubric-algo-explain", "rubric-complexity-analysis"],
    "maxFollowUps": 2,
    "timeBudgetSec": 120
  }
}

4.3 `node_exited`

interface NodeExitedEvent {
  type: "node_exited";
  nodeId: string;
  reason: "completed" | "time_exhausted" | "follow_ups_exhausted" | "candidate_skip" | "candidate_skip_with_return" | "forced_transition";
  durationSec: number;
  followUpsUsed: number;
}

{
  "eventId": "01924a6f-3c82-7a03-b5e2-44f1c2d3e4f7",
  "sessionId": "sess-2026-05-06-001",
  "seq": 12,
  "timestamp": "2026-05-06T02:02:35.789Z",
  "source": "runtime_controller",
  "type": "node_exited",
  "correlationId": "trans-002",
  "schemaVersion": "1",
  "payload": {
    "type": "node_exited",
    "nodeId": "q-explain-dijkstra",
    "reason": "completed",
    "durationSec": 140,
    "followUpsUsed": 1
  }
}

4.4 `transcript_delta`

Streaming partial from STT. UI-only, not persisted.

interface TranscriptDeltaEvent {
  type: "transcript_delta";
  speaker: "candidate" | "examiner";
  text: string;
  isPartial: true;
  stability: number; // 0.0–1.0, STT confidence
}

{
  "eventId": "01924a6f-3c82-7a04-b5e3-44f1c2d3e4f8",
  "sessionId": "sess-2026-05-06-001",
  "seq": 5,
  "timestamp": "2026-05-06T02:00:20.100Z",
  "source": "bot",
  "type": "transcript_delta",
  "schemaVersion": "1",
  "payload": {
    "type": "transcript_delta",
    "speaker": "candidate",
    "text": "So the algorithm starts by, um, selecting the nearest",
    "isPartial": true,
    "stability": 0.72
  }
}

4.5 `transcript_final`

Canonical, persisted utterance. The single source of truth for what was said.

interface TranscriptFinalEvent {
  type: "transcript_final";
  turnId: string;
  speaker: "candidate" | "examiner";
  text: string;
  startTimeMs: number;
  endTimeMs: number;
  nodeId: string;
  confidence: number;
  language: string;
}

{
  "eventId": "01924a6f-3c82-7a05-b5e4-44f1c2d3e4f9",
  "sessionId": "sess-2026-05-06-001",
  "seq": 6,
  "timestamp": "2026-05-06T02:00:25.300Z",
  "source": "bot",
  "type": "transcript_final",
  "schemaVersion": "1",
  "payload": {
    "type": "transcript_final",
    "turnId": "turn-001",
    "speaker": "candidate",
    "text": "So the algorithm starts by selecting the nearest unvisited node and relaxing all its edges.",
    "startTimeMs": 18200,
    "endTimeMs": 24500,
    "nodeId": "q-explain-dijkstra",
    "confidence": 0.91,
    "language": "en"
  }
}

4.6 `examiner_utterance_started`

interface ExaminerUtteranceStartedEvent {
  type: "examiner_utterance_started";
  utteranceId: string;
  nodeId: string;
  purpose: "question" | "follow_up" | "prompt" | "bridge" | "recovery" | "closing";
}

{
  "eventId": "01924a6f-3c82-7a06-b5e5-44f1c2d3e4fa",
  "sessionId": "sess-2026-05-06-001",
  "seq": 7,
  "timestamp": "2026-05-06T02:00:26.000Z",
  "source": "bot",
  "type": "examiner_utterance_started",
  "schemaVersion": "1",
  "payload": {
    "type": "examiner_utterance_started",
    "utteranceId": "utt-002",
    "nodeId": "q-explain-dijkstra",
    "purpose": "follow_up"
  }
}

4.7 `examiner_utterance_final`

interface ExaminerUtteranceFinalEvent {
  type: "examiner_utterance_final";
  utteranceId: string;
  nodeId: string;
  text: string;
  purpose: "question" | "follow_up" | "prompt" | "bridge" | "recovery" | "closing";
  durationMs: number;
}

{
  "eventId": "01924a6f-3c82-7a07-b5e6-44f1c2d3e4fb",
  "sessionId": "sess-2026-05-06-001",
  "seq": 8,
  "timestamp": "2026-05-06T02:00:32.500Z",
  "source": "bot",
  "type": "examiner_utterance_final",
  "schemaVersion": "1",
  "payload": {
    "type": "examiner_utterance_final",
    "utteranceId": "utt-002",
    "nodeId": "q-explain-dijkstra",
    "text": "Can you explain what happens when there are negative edge weights?",
    "purpose": "follow_up",
    "durationMs": 6500
  }
}

4.8 `candidate_command_received`

interface CandidateCommandReceivedEvent {
  type: "candidate_command_received";
  commandId: string;
  commandType: string; // mirrors the CommandEnvelope.type
  accepted: boolean;
  rejectionReason?: string;
}

{
  "eventId": "01924a6f-3c82-7a08-b5e7-44f1c2d3e4fc",
  "sessionId": "sess-2026-05-06-001",
  "seq": 9,
  "timestamp": "2026-05-06T02:00:45.000Z",
  "source": "runtime_controller",
  "type": "candidate_command_received",
  "schemaVersion": "1",
  "payload": {
    "type": "candidate_command_received",
    "commandId": "cmd-001",
    "commandType": "request_clarification",
    "accepted": true
  }
}

4.9 `evidence_signal`

interface EvidenceSignalEvent {
  type: "evidence_signal";
  signalId: string;
  nodeId: string;
  turnIds: string[];        // transcript turns that support this signal
  targetIds: string[];      // evidence targets this signal addresses

  /**
   * The dimension of oral assessment this signal addresses.
   * Based on Joughin (1998) primary content type categories.
   * - knowledge_understanding: recall of facts, comprehension of meaning
   * - applied_problem_solving: "think on one's feet", clinical reasoning, critical thinking
   * - interpersonal_competence: communication skills exhibited in context
   * - intrapersonal_quality: confidence, self-awareness, reactions to stress
   * - metacognitive: self-correction, reasoning process, strategic thinking
   */
  evidenceDimension:
    | "knowledge_understanding"
    | "applied_problem_solving"
    | "interpersonal_competence"
    | "intrapersonal_quality"
    | "metacognitive";

  /**
   * Classification of the evidence.
   *
   * Extended taxonomy beyond knowledge-correctness to capture process quality.
   * Fenton (2025): oral assessments reveal "the process of learning rather than
   * the output" and allow students to "reflect on their choices and self-correct."
   *
   * - positive:           Correct and complete evidence
   * - partial:            Partially correct or incomplete
   * - absent:             No evidence for this target
   * - misconception:      Demonstrates a misunderstanding
   * - flawed_reasoning:   Right answer with incorrect justification
   * - process_positive:   Good reasoning process, regardless of final answer
   * - process_negative:   Poor reasoning process
   * - self_correction:    Candidate identified and corrected their own error
   */
  signalKind:
    | "positive"
    | "partial"
    | "absent"
    | "misconception"
    | "flawed_reasoning"
    | "process_positive"
    | "process_negative"
    | "self_correction";

  description: string;      // human-readable, e.g. "Correctly identified O(V²) complexity"
  confidence: number;       // 0.0–1.0

  /**
   * STT confidence summary for the underlying transcript turns.
   * Signal confidence is epistemically dependent on transcript quality.
   * A 0.85-confidence signal from 0.6-confidence transcripts is weaker
   * than one from 0.95-confidence transcripts.
   */
  sttConfidenceSummary: {
    min: number;
    max: number;
    mean: number;
    turnCount: number;
  };

  llmProposal: boolean;     // true = proposed by LLM, false = runtime-confirmed
}

{
  "eventId": "01924a6f-3c82-7a09-b5e8-44f1c2d3e4fd",
  "sessionId": "sess-2026-05-06-001",
  "seq": 10,
  "timestamp": "2026-05-06T02:00:50.000Z",
  "source": "bot",
  "type": "evidence_signal",
  "schemaVersion": "1",
  "payload": {
    "type": "evidence_signal",
    "signalId": "sig-001",
    "nodeId": "q-explain-dijkstra",
    "turnIds": ["turn-001"],
    "targetIds": ["tgt-algo-explain"],
    "evidenceDimension": "knowledge_understanding",
    "signalKind": "positive",
    "description": "Candidate correctly described the greedy selection strategy of Dijkstra's algorithm.",
    "confidence": 0.88,
    "sttConfidenceSummary": {
      "min": 0.91,
      "max": 0.91,
      "mean": 0.91,
      "turnCount": 1
    },
    "llmProposal": true
  }
}
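The `sttConfidenceSummary` block can be derived from the `confidence` values of the `transcript_final` turns listed in `turnIds`. A minimal sketch follows; the function name `summarizeSttConfidence` and the decision to reject an empty list are assumptions, not requirements of the protocol.

```typescript
interface SttConfidenceSummary {
  min: number;
  max: number;
  mean: number;
  turnCount: number;
}

/** Summarizes the STT confidence of the transcript turns behind a signal. */
function summarizeSttConfidence(turnConfidences: readonly number[]): SttConfidenceSummary {
  if (turnConfidences.length === 0) {
    // Assumption: a signal always cites at least one transcript turn.
    throw new Error("at least one transcript turn is required");
  }
  let min = Infinity;
  let max = -Infinity;
  let sum = 0;
  for (const confidence of turnConfidences) {
    if (confidence < min) min = confidence;
    if (confidence > max) max = confidence;
    sum += confidence;
  }
  return { min, max, mean: sum / turnConfidences.length, turnCount: turnConfidences.length };
}
```

For the single turn in the example above (confidence 0.91), all three statistics collapse to the same value, which matches the JSON shown.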

4.10 `follow_up_used`

interface FollowUpUsedEvent {
  type: "follow_up_used";
  nodeId: string;
  followUpIndex: number; // 1-based
  maxFollowUps: number;
  reason: "evidence_gap" | "depth_probe" | "clarification" | "misconception_probe";
  triggerTurnId: string;
}

{
  "eventId": "01924a6f-3c82-7a0a-b5e9-44f1c2d3e4fe",
  "sessionId": "sess-2026-05-06-001",
  "seq": 11,
  "timestamp": "2026-05-06T02:00:52.000Z",
  "source": "runtime_controller",
  "type": "follow_up_used",
  "schemaVersion": "1",
  "payload": {
    "type": "follow_up_used",
    "nodeId": "q-explain-dijkstra",
    "followUpIndex": 1,
    "maxFollowUps": 2,
    "reason": "depth_probe",
    "triggerTurnId": "turn-001"
  }
}

4.11 `transition_decision`

interface TransitionDecisionEvent {
  type: "transition_decision";
  fromNodeId: string;
  toNodeId: string;
  edgeId: string;
  reason: "natural_completion" | "follow_ups_exhausted" | "time_exhausted" | "condition_met" | "candidate_skip" | "guardrail_override";
  conditionEvaluated?: string; // the condition expression that was evaluated
}

{
  "eventId": "01924a6f-3c82-7a0b-b5ea-44f1c2d3e4ff",
  "sessionId": "sess-2026-05-06-001",
  "seq": 13,
  "timestamp": "2026-05-06T02:02:36.000Z",
  "source": "runtime_controller",
  "type": "transition_decision",
  "correlationId": "trans-002",
  "schemaVersion": "1",
  "payload": {
    "type": "transition_decision",
    "fromNodeId": "q-explain-dijkstra",
    "toNodeId": "q-graph-scenario",
    "edgeId": "edge-q1-to-q2",
    "reason": "natural_completion",
    "conditionEvaluated": "evidence_coverage >= 0.7 && follow_ups_used <= max_follow_ups"
  }
}

4.12 `guardrail_triggered`

interface GuardrailTriggeredEvent {
  type: "guardrail_triggered";
  guardrailId: string;
  guardrailType: "max_follow_ups" | "forbidden_hint" | "topic_drift" | "unauthorized_scoring" | "time_budget_exceeded" | "blocked_action";
  severity: "warning" | "block";
  description: string;
  actionTaken: "event_only" | "forced_transition" | "recovery_initiated" | "exam_terminated";
  contextNodeId?: string;
}

{
  "eventId": "01924a6f-3c82-7a0c-b5eb-44f1c2d3e500",
  "sessionId": "sess-2026-05-06-001",
  "seq": 14,
  "timestamp": "2026-05-06T02:02:40.000Z",
  "source": "runtime_controller",
  "type": "guardrail_triggered",
  "schemaVersion": "1",
  "payload": {
    "type": "guardrail_triggered",
    "guardrailId": "guard-max-followup-q1",
    "guardrailType": "max_follow_ups",
    "severity": "block",
    "description": "Follow-up limit (2) reached for node q-explain-dijkstra. Forcing transition.",
    "actionTaken": "forced_transition",
    "contextNodeId": "q-explain-dijkstra"
  }
}

4.13 `recovery_started`

interface RecoveryStartedEvent {
  type: "recovery_started";
  recoveryId: string;
  recoveryType: "silence" | "unclear_answer" | "off_topic" | "anxiety" | "interruption" | "network_issue" | "repetition_loop" | "candidate_distress";
  nodeId: string;
  triggerDescription: string;
}

{
  "eventId": "01924a6f-3c82-7a0d-b5ec-44f1c2d3e501",
  "sessionId": "sess-2026-05-06-001",
  "seq": 15,
  "timestamp": "2026-05-06T02:03:00.000Z",
  "source": "runtime_controller",
  "type": "recovery_started",
  "schemaVersion": "1",
  "payload": {
    "type": "recovery_started",
    "recoveryId": "rec-001",
    "recoveryType": "silence",
    "nodeId": "q-explain-dijkstra",
    "triggerDescription": "No candidate speech detected for 15 seconds after examiner question."
  }
}

4.14 `recovery_resolved`

interface RecoveryResolvedEvent {
  type: "recovery_resolved";
  recoveryId: string;
  resolution: "candidate_resumed" | "re_prompted" | "skipped_to_next" | "exam_terminated";
  durationSec: number;
}

{
  "eventId": "01924a6f-3c82-7a0e-b5ed-44f1c2d3e502",
  "sessionId": "sess-2026-05-06-001",
  "seq": 16,
  "timestamp": "2026-05-06T02:03:08.000Z",
  "source": "runtime_controller",
  "type": "recovery_resolved",
  "schemaVersion": "1",
  "payload": {
    "type": "recovery_resolved",
    "recoveryId": "rec-001",
    "resolution": "candidate_resumed",
    "durationSec": 8
  }
}

4.15 `exam_completed`

interface ExamCompletedEvent {
  type: "exam_completed";
  reason: "all_nodes_visited" | "time_total_exhausted" | "candidate_ended" | "proctor_ended" | "system_error";
  totalDurationSec: number;
  nodesVisited: string[];
  totalEvidenceSignals: number;
  totalFollowUps: number;
  guardrailTriggerCount: number;

  /**
   * Interaction quality metrics for post-hoc fairness and consistency analysis.
   * Joughin (1998) warns that reliability is threatened when interaction tends
   * toward dialogue and structure tends toward open. These metrics enable the
   * marking pipeline to assess whether the interaction was conducted fairly
   * and consistently across candidates (Akimov & Malin, 2020).
   */
  interactionMetrics: {
    candidateTurnCount: number;
    examinerTurnCount: number;
    averageCandidateResponseLatencyMs: number;
    averageExaminerFollowUpDepth: number;  // avg follow-ups per node
    probingConsistencyScore: number;        // 0.0–1.0, derived from the variance in follow-up count across nodes
    longestCandidateMonologueSec: number;   // longest uninterrupted candidate speech
  };
}

{
  "eventId": "01924a6f-3c82-7a0f-b5ee-44f1c2d3e503",
  "sessionId": "sess-2026-05-06-001",
  "seq": 50,
  "timestamp": "2026-05-06T02:15:00.000Z",
  "source": "runtime_controller",
  "type": "exam_completed",
  "schemaVersion": "1",
  "payload": {
    "type": "exam_completed",
    "reason": "all_nodes_visited",
    "totalDurationSec": 900,
    "nodesVisited": ["q-warm-up", "q-explain-dijkstra", "q-graph-scenario", "q-closing"],
    "totalEvidenceSignals": 14,
    "totalFollowUps": 5,
    "guardrailTriggerCount": 1,
    "interactionMetrics": {
      "candidateTurnCount": 12,
      "examinerTurnCount": 8,
      "averageCandidateResponseLatencyMs": 2800,
      "averageExaminerFollowUpDepth": 1.25,
      "probingConsistencyScore": 0.85,
      "longestCandidateMonologueSec": 45
    }
  }
}
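Some of the `interactionMetrics` fields can be computed directly from earlier events in the session. The sketch below covers only the turn counts, the follow-up depth, and the longest monologue. It assumes that "per node" means per exited node and that the longest monologue can be approximated by the longest single candidate turn; neither definition is stated in the protocol, and `summarizeInteraction` is a hypothetical name.

```typescript
interface NodeExitSummary {
  nodeId: string;
  followUpsUsed: number; // from node_exited payloads
}

interface TurnSummary {
  speaker: "candidate" | "examiner"; // from transcript_final payloads
  startTimeMs: number;
  endTimeMs: number;
}

interface InteractionSummary {
  candidateTurnCount: number;
  examinerTurnCount: number;
  averageExaminerFollowUpDepth: number;
  longestCandidateMonologueSec: number;
}

function summarizeInteraction(
  nodeExits: readonly NodeExitSummary[],
  turns: readonly TurnSummary[]
): InteractionSummary {
  let candidateTurnCount = 0;
  let examinerTurnCount = 0;
  let longestCandidateTurnMs = 0;
  for (const turn of turns) {
    if (turn.speaker === "examiner") {
      examinerTurnCount += 1;
      continue;
    }
    candidateTurnCount += 1;
    longestCandidateTurnMs = Math.max(longestCandidateTurnMs, turn.endTimeMs - turn.startTimeMs);
  }

  let totalFollowUps = 0;
  for (const exit of nodeExits) totalFollowUps += exit.followUpsUsed;

  return {
    candidateTurnCount,
    examinerTurnCount,
    // Assumption: average over exited nodes; 0 when no node has been exited.
    averageExaminerFollowUpDepth: nodeExits.length === 0 ? 0 : totalFollowUps / nodeExits.length,
    longestCandidateMonologueSec: longestCandidateTurnMs / 1000
  };
}
```

Under these assumptions, five follow-ups spread over four exited nodes give a depth of 1.25, the value used in the example above.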

4.16 `hesitation_detected`

Emitted when the runtime detects a significant candidate hesitation pattern. Unlike a pause command (candidate-initiated), this is system-detected from STT timing data. Fenton (2025) notes that oral assessments reveal “the process of learning rather than the output” — hesitation patterns are evidence of reasoning, not just operational pauses.

/**
 * Detected candidate hesitation.
 * Fenton (2025): oral assessments reveal "the process of learning rather than the output."
 * Hesitation patterns are evidence of reasoning processes.
 */
interface HesitationDetectedEvent {
  type: "hesitation_detected";
  nodeId: string;
  turnId: string;
  durationMs: number;            // how long the hesitation lasted
  context: "after_question" | "mid_response" | "before_conclusion";
  /** Whether the candidate eventually produced a response after the hesitation. */
  followedByResponse: boolean;
}

{
  "eventId": "01924a6f-3c82-7a10-b5ef-44f1c2d3e504",
  "sessionId": "sess-2026-05-06-001",
  "seq": 17,
  "timestamp": "2026-05-06T02:03:15.000Z",
  "source": "runtime_controller",
  "type": "hesitation_detected",
  "schemaVersion": "1",
  "payload": {
    "type": "hesitation_detected",
    "nodeId": "q-explain-dijkstra",
    "turnId": "turn-004",
    "durationMs": 8500,
    "context": "mid_response",
    "followedByResponse": true
  }
}
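As a rough illustration of detection "from STT timing data", the sketch below scans the silences between candidate speech segments and reports those at or above a threshold. Everything here is hypothetical: the protocol does not define the detection algorithm, the threshold, or the `SpeechSegment` shape, and this timing-only approach cannot identify the `before_conclusion` context.

```typescript
interface SpeechSegment {
  turnId: string;
  startTimeMs: number;
  endTimeMs: number;
}

interface HesitationCandidate {
  turnId: string; // the turn that follows the silence
  durationMs: number;
  context: "after_question" | "mid_response";
  followedByResponse: boolean;
}

/**
 * Finds silences of at least thresholdMs before or between candidate segments.
 * questionEndMs is when the examiner finished speaking (session-relative).
 */
function detectHesitations(
  questionEndMs: number,
  segments: readonly SpeechSegment[],
  thresholdMs: number
): HesitationCandidate[] {
  const ordered = segments.slice().sort((a, b) => a.startTimeMs - b.startTimeMs);
  const found: HesitationCandidate[] = [];
  let previousEndMs = questionEndMs;
  ordered.forEach((segment, index) => {
    const silenceMs = segment.startTimeMs - previousEndMs;
    if (silenceMs >= thresholdMs) {
      found.push({
        turnId: segment.turnId,
        durationMs: silenceMs,
        context: index === 0 ? "after_question" : "mid_response",
        // Every silence measured here ends with speech, so this is always true.
        followedByResponse: true
      });
    }
    previousEndMs = Math.max(previousEndMs, segment.endTimeMs);
  });
  return found;
}
```

A silence that never ends in speech is not covered here; in this protocol that case is closer to a `recovery_started` event with `recoveryType` `silence`.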

4.17 `self_correction_detected`

Emitted when the runtime detects that a candidate has corrected a prior claim. Fenton (2025) specifically identifies self-correction as a distinctive benefit of oral assessment: “in the process of explaining their reasoning and deduction, students can reflect on their choices and have the chance to self-correct.” Self-correction is a metacognitive signal that demonstrates self-monitoring and deeper understanding.

/**
 * Candidate self-correction detected.
 * Fenton (2025): students can "reflect on their choices and have the chance to self-correct."
 * Self-correction is a metacognitive signal distinct from simple error correction.
 */
interface SelfCorrectionDetectedEvent {
  type: "self_correction_detected";
  nodeId: string;
  turnIds: string[];             // the turns containing the correction
  originalClaim: string;         // what the candidate first said
  correctedClaim: string;        // what they changed it to
  confidence: number;            // 0.0–1.0
}

{
  "eventId": "01924a6f-3c82-7a11-b5f0-44f1c2d3e505",
  "sessionId": "sess-2026-05-06-001",
  "seq": 18,
  "timestamp": "2026-05-06T02:03:20.000Z",
  "source": "bot",
  "type": "self_correction_detected",
  "schemaVersion": "1",
  "payload": {
    "type": "self_correction_detected",
    "nodeId": "q-explain-dijkstra",
    "turnIds": ["turn-003", "turn-005"],
    "originalClaim": "Dijkstra's handles negative weights fine.",
    "correctedClaim": "Wait, actually Dijkstra's doesn't handle negative weights — you need Bellman-Ford for that.",
    "confidence": 0.92
  }
}

5. Command Definitions and JSON Examples

5.1 `repeat_question`

Candidate asks the examiner to repeat the current question.

interface RepeatQuestionCommand {
  type: "repeat_question";
  nodeId: string;
}

{
  "commandId": "cmd-rpt-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:00:40.000Z",
  "source": "candidate",
  "type": "repeat_question",
  "schemaVersion": "1",
  "payload": {
    "type": "repeat_question",
    "nodeId": "q-explain-dijkstra"
  }
}

5.2 `request_clarification`

Candidate asks for clarification on a term or instruction.

interface RequestClarificationCommand {
  type: "request_clarification";
  nodeId: string;
  text?: string; // optional free-text from candidate
}

{
  "commandId": "cmd-clr-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:00:45.000Z",
  "source": "candidate",
  "type": "request_clarification",
  "schemaVersion": "1",
  "payload": {
    "type": "request_clarification",
    "nodeId": "q-explain-dijkstra",
    "text": "What do you mean by 'relaxing edges'?"
  }
}

5.3 `pause`

Candidate requests a temporary pause (e.g., to think).

interface PauseCommand {
  type: "pause";
  reason?: "thinking" | "personal" | "other";
}

{
  "commandId": "cmd-pause-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:01:00.000Z",
  "source": "candidate",
  "type": "pause",
  "schemaVersion": "1",
  "payload": {
    "type": "pause",
    "reason": "thinking"
  }
}

5.4 `resume`

Candidate signals readiness to continue after a pause.

interface ResumeCommand {
  type: "resume";
}

{
  "commandId": "cmd-resume-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:01:30.000Z",
  "source": "candidate",
  "type": "resume",
  "schemaVersion": "1",
  "payload": {
    "type": "resume"
  }
}

5.5 `raise_hand`

Candidate raises hand for any reason (generic attention request).

interface RaiseHandCommand {
  type: "raise_hand";
  reason?: string;
}

{
  "commandId": "cmd-hand-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:01:45.000Z",
  "source": "candidate",
  "type": "raise_hand",
  "schemaVersion": "1",
  "payload": {
    "type": "raise_hand",
    "reason": "I think my microphone is cutting out."
  }
}

5.6 `report_audio_issue`

Candidate or system reports audio degradation.

interface ReportAudioIssueCommand {
  type: "report_audio_issue";
  issueType: "no_input" | "echo" | "noise" | "dropout" | "latency";
  severity: "minor" | "major";
}

{
  "commandId": "cmd-audio-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:02:00.000Z",
  "source": "candidate",
  "type": "report_audio_issue",
  "schemaVersion": "1",
  "payload": {
    "type": "report_audio_issue",
    "issueType": "echo",
    "severity": "minor"
  }
}

5.7 `end_exam_requested`

Candidate or proctor requests early termination.

interface EndExamRequestedCommand {
  type: "end_exam_requested";
  requestedBy: "candidate" | "proctor";
  reason?: string;
}

{
  "commandId": "cmd-end-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:14:50.000Z",
  "source": "candidate",
  "type": "end_exam_requested",
  "schemaVersion": "1",
  "payload": {
    "type": "end_exam_requested",
    "requestedBy": "candidate",
    "reason": "I believe I have answered all questions."
  }
}

5.8 `request_rephrase`

Candidate asks the examiner to rephrase the question (not just repeat it). A candidate who asks “can you rephrase that?” exhibits a different cognitive strategy than one who asks “can you repeat that?” — the former suggests active engagement with the question’s framing, the latter suggests attentional failure. Joughin (1998) and Akimov & Malin (2020) describe oral assessment as dialogic; rephrase requests are a natural dialogue move.

interface RequestRephraseCommand {
  type: "request_rephrase";
  nodeId: string;
  reason?: "unclear_terminology" | "ambiguous_question" | "language_barrier";
}

{
  "commandId": "cmd-reph-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:00:42.000Z",
  "source": "candidate",
  "type": "request_rephrase",
  "schemaVersion": "1",
  "payload": {
    "type": "request_rephrase",
    "nodeId": "q-explain-dijkstra",
    "reason": "unclear_terminology"
  }
}

5.9 `challenge_premise`

Candidate questions or disagrees with a premise in the examiner’s question. In a genuine dialogue (Joughin’s interaction dimension), candidates may push back on a premise, question a framing, or disagree with a follow-up’s implication. This is assessment-significant: it demonstrates critical thinking and confidence.

interface ChallengePremiseCommand {
  type: "challenge_premise";
  nodeId: string;
  text: string;  // the candidate's challenge or disagreement
}

{
  "commandId": "cmd-chal-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:01:10.000Z",
  "source": "candidate",
  "type": "challenge_premise",
  "schemaVersion": "1",
  "payload": {
    "type": "challenge_premise",
    "nodeId": "q-graph-scenario",
    "text": "I don't think the assumption that all edges are non-negative holds in this scenario."
  }
}

5.10 `revise_earlier_answer`

Candidate requests to revisit and amend a previous node’s answer. Fenton (2025) notes that oral assessments allow “self-correction” — this command extends that affordance across nodes. The runtime MAY accept or reject this based on exam policy (e.g., disallowing revision of nodes where follow-ups have already been exhausted).

interface ReviseEarlierAnswerCommand {
  type: "revise_earlier_answer";
  targetNodeId: string;  // which previous node to revisit
  reason?: string;       // optional explanation
}

{
  "commandId": "cmd-rev-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:05:00.000Z",
  "source": "candidate",
  "type": "revise_earlier_answer",
  "schemaVersion": "1",
  "payload": {
    "type": "revise_earlier_answer",
    "targetNodeId": "q-explain-dijkstra",
    "reason": "I want to add something about the time complexity I missed earlier."
  }
}

5.11 `thinking_aloud`

Candidate explicitly signals they are thinking through a problem aloud. Joughin (1998) notes that oral assessment probes “understanding” — thinking time and process are evidence, not just operational delay. A candidate who says “let me think through this step by step” is signaling metacognitive awareness.

interface ThinkingAloudCommand {
  type: "thinking_aloud";
  nodeId: string;
}

{
  "commandId": "cmd-think-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:01:05.000Z",
  "source": "candidate",
  "type": "thinking_aloud",
  "schemaVersion": "1",
  "payload": {
    "type": "thinking_aloud",
    "nodeId": "q-graph-scenario"
  }
}

5.12 `emergency_stop`

Candidate requests immediate exam termination due to distress or emergency. Unlike end_exam_requested (a considered decision), this signals urgent distress. The runtime MUST halt the exam immediately and emit a recovery_started event whose recoveryType is candidate_distress.

interface EmergencyStopCommand {
  type: "emergency_stop";
  reason?: "distress" | "medical" | "environmental" | "other";
}

{
  "commandId": "cmd-estop-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:10:00.000Z",
  "source": "candidate",
  "type": "emergency_stop",
  "schemaVersion": "1",
  "payload": {
    "type": "emergency_stop",
    "reason": "distress"
  }
}

5.13 `signal_confidence`

Candidate signals their confidence level in their own answer. Some oral exam traditions allow candidates to indicate self-assessment (Joughin, 1998, intrapersonal qualities dimension). This provides the marking pipeline with a self-assessment dimension that can be compared against the LLM’s evidence signals.

interface SignalConfidenceCommand {
  type: "signal_confidence";
  nodeId: string;
  confidenceLevel: "very_confident" | "confident" | "uncertain" | "guessing";
}

{
  "commandId": "cmd-conf-001",
  "sessionId": "sess-2026-05-06-001",
  "timestamp": "2026-05-06T02:01:15.000Z",
  "source": "candidate",
  "type": "signal_confidence",
  "schemaVersion": "1",
  "payload": {
    "type": "signal_confidence",
    "nodeId": "q-explain-dijkstra",
    "confidenceLevel": "confident"
  }
}

6. Ordering, Idempotency, and Replay Rules

6.1 Ordering

Every event within a session carries a seq field that MUST be monotonically increasing, starting from 1.
The runtime controller is the sole authority for assigning seq numbers. Bots and frontends MUST NOT self-assign seq.
Events with the same seq within a session are a protocol violation; consumers MUST reject the duplicate.
Consumers SHOULD tolerate gaps in seq (indicating filtered or dropped events) but MUST NOT assume contiguous sequences.
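A consumer-side check for these ordering rules might look like the sketch below. `SeqTracker` and its verdict names are hypothetical. The `late` verdict (an unseen `seq` lower than one already observed) is an addition of this sketch; the rules above do not say how a consumer should treat that case.

```typescript
type SeqVerdict = "ok" | "gap" | "duplicate" | "late";

interface SessionSeqState {
  highest: number;   // highest seq observed so far (0 = none yet)
  seen: Set<number>; // every seq observed for this session
}

class SeqTracker {
  private sessions = new Map<string, SessionSeqState>();

  /** Classifies an incoming seq for a session and records it. */
  check(sessionId: string, seq: number): SeqVerdict {
    let state = this.sessions.get(sessionId);
    if (state === undefined) {
      state = { highest: 0, seen: new Set<number>() };
      this.sessions.set(sessionId, state);
    }
    if (state.seen.has(seq)) return "duplicate"; // protocol violation: reject
    state.seen.add(seq);
    if (seq < state.highest) return "late";      // policy for this case is unspecified
    const verdict: SeqVerdict = seq > state.highest + 1 ? "gap" : "ok";
    state.highest = seq;
    return verdict;
  }
}
```

A caller would drop events classified as `duplicate` and continue normally on `gap`. The `seen` set grows with the session, so a long-running consumer may prefer to discard a session's state once `exam_completed` arrives.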

6.2 Idempotency

Each event carries a unique eventId (UUIDv7). Re-delivering an event with the same eventId MUST be treated as a no-op by all consumers.
Each command carries a unique commandId. The runtime controller MUST deduplicate commands by commandId within a 5-minute window.
Persistence layers MUST enforce a unique constraint on (sessionId, eventId).
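The two deduplication rules can be sketched in memory as follows. The class names are hypothetical, the clock is passed in explicitly so the behaviour is deterministic, and measuring the 5-minute window from the first time a `commandId` is seen is an assumption; the rule above does not say where the window starts.

```typescript
const COMMAND_DEDUP_WINDOW_MS = 5 * 60 * 1000;

class CommandDeduplicator {
  private firstSeenAt = new Map<string, number>();

  /** True if the command should be processed; false if it repeats inside the window. */
  shouldProcess(commandId: string, nowMs: number): boolean {
    // Forget IDs whose window has elapsed so the map does not grow without bound.
    this.firstSeenAt.forEach((seenAt, id) => {
      if (nowMs - seenAt >= COMMAND_DEDUP_WINDOW_MS) this.firstSeenAt.delete(id);
    });
    if (this.firstSeenAt.has(commandId)) return false;
    this.firstSeenAt.set(commandId, nowMs);
    return true;
  }
}

class IdempotentConsumer<E extends { eventId: string }> {
  private seenEventIds = new Set<string>();
  private handler: (event: E) => void;

  constructor(handler: (event: E) => void) {
    this.handler = handler;
  }

  /** Runs the handler at most once per eventId; returns whether it ran. */
  deliver(event: E): boolean {
    if (this.seenEventIds.has(event.eventId)) return false; // re-delivery is a no-op
    this.seenEventIds.add(event.eventId);
    this.handler(event);
    return true;
  }
}
```

A persistent consumer would typically rely on the `(sessionId, eventId)` unique constraint instead of an in-memory set, so that deduplication survives restarts.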

6.3 Replay

A full session event log (all persisted events ordered by seq) MUST be sufficient to reconstruct the complete session timeline for audit and marking.
transcript_delta events are intentionally excluded from replay (UI-only, high volume).
Replay consumers MUST be able to process events without network access to the original bot or runtime — events are self-contained.
For marking replay, the minimum viable event stream is: node_entered, node_exited, transcript_final, examiner_utterance_final, evidence_signal, follow_up_used, transition_decision, guardrail_triggered, hesitation_detected, self_correction_detected, exam_completed.
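Selecting the marking stream from a persisted log amounts to a filter followed by a sort on `seq`. The sketch below assumes the log is already loaded as an array; `markingReplay`, `isReplayComplete`, and `PersistedEvent` are hypothetical names. Treating a stream as complete only when it ends with `exam_completed` follows from that event being described as terminal.

```typescript
interface PersistedEvent {
  sessionId: string;
  seq: number;
  type: string;
}

// The minimum viable event stream for marking, as listed above.
const MARKING_REPLAY_TYPES: readonly string[] = [
  "node_entered", "node_exited", "transcript_final", "examiner_utterance_final",
  "evidence_signal", "follow_up_used", "transition_decision", "guardrail_triggered",
  "hesitation_detected", "self_correction_detected", "exam_completed"
];

/** Returns one session's marking-relevant events ordered by seq. */
function markingReplay<E extends PersistedEvent>(log: readonly E[], sessionId: string): E[] {
  return log
    .filter((e) => e.sessionId === sessionId && MARKING_REPLAY_TYPES.indexOf(e.type) !== -1)
    .sort((a, b) => a.seq - b.seq); // filter() returned a copy, so the input is not mutated
}

/** A replay stream is complete once its last event is the terminal exam_completed. */
function isReplayComplete(stream: readonly PersistedEvent[]): boolean {
  const last = stream[stream.length - 1];
  return last !== undefined && last.type === "exam_completed";
}
```

Gaps in `seq` are expected in the result, because UI-only and ops-only events have been filtered out.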

6.4 Correlation

Related events (e.g., a transition sequence: node_exited → transition_decision → node_entered) share a correlationId.
Recovery sequences (recovery_started → recovery_resolved) share a recoveryId in their payloads AND a correlationId in the envelope.
Consumers SHOULD group events by correlationId when rendering timelines.
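Grouping for a timeline view could be done as sketched below. Events without a `correlationId` are kept as single-event groups, and groups appear in order of their earliest `seq`; both choices are assumptions of this sketch, and `groupByCorrelation` is a hypothetical name.

```typescript
interface TimelineEvent {
  seq: number;
  type: string;
  correlationId?: string;
}

interface TimelineGroup<E> {
  correlationId: string | null; // null for uncorrelated events
  events: E[];
}

function groupByCorrelation<E extends TimelineEvent>(events: readonly E[]): TimelineGroup<E>[] {
  const ordered = events.slice().sort((a, b) => a.seq - b.seq);
  const groups: TimelineGroup<E>[] = [];
  const byId = new Map<string, TimelineGroup<E>>();

  for (const event of ordered) {
    const id = event.correlationId;
    if (id === undefined) {
      groups.push({ correlationId: null, events: [event] });
      continue;
    }
    let group = byId.get(id);
    if (group === undefined) {
      // First event of this correlation: the group takes its place in the timeline here.
      group = { correlationId: id, events: [] };
      byId.set(id, group);
      groups.push(group);
    }
    group.events.push(event);
  }
  return groups;
}
```

With this shape a UI can render a transition or a recovery as one collapsible row while still showing standalone events in sequence.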

7. Transport

Channel	Direction	Typical Events	Notes
LiveKit Data Channel	Bot → Frontend	`transcript_delta`, `examiner_utterance_started`, `bot_ready`	Low-latency, unreliable (UDP). Used for real-time UI only.
WebSocket	Runtime → Frontend	All events	Reliable, ordered. Primary UI channel.
HTTP POST (webhook)	Runtime → Event Store	All persisted events	Batched, retried. Primary persistence channel.
Message Queue (Kafka/Redis Streams)	Runtime → Marking Pipeline	Marking-relevant events	Decouples marking from exam execution.

Events MAY be delivered over multiple transports simultaneously. The eventId ensures idempotency across all channels.
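The fan-out described in the table can be expressed as a small routing function. This is a sketch under assumptions: the channel identifiers and the `channelsFor` name are hypothetical, and the `persisted` and `marking` flags are expected to come from the consumption scope declared for each event type.

```typescript
type Channel = "livekit_data" | "websocket" | "webhook" | "message_queue";

interface DeliveryFlags {
  persisted: boolean;
  marking: boolean;
}

// Event types the table lists as typical for the LiveKit data channel.
const LIVEKIT_TYPES: readonly string[] = [
  "transcript_delta",
  "examiner_utterance_started",
  "bot_ready"
];

/** Returns every channel an event of the given type should be sent over. */
function channelsFor(type: string, flags: DeliveryFlags): Channel[] {
  const channels: Channel[] = ["websocket"]; // all events reach the primary UI channel
  if (LIVEKIT_TYPES.indexOf(type) !== -1) channels.unshift("livekit_data");
  if (flags.persisted) channels.push("webhook");
  if (flags.marking) channels.push("message_queue");
  return channels;
}
```

Because the same event can travel over several channels, each receiver still needs to deduplicate on `eventId`.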

Revision History

Version	Date	Changes
v0.2.0	2026-06-30	Added anxiety-related event types. Updated terminology from ‘Exam Runtime IR’ to ‘IOA-ORM’.
v0.1.0	2026-05-06	Initial release.
