Worked Examples

Draft · v0.2.0 · 2026-06-30

This chapter presents a complete, minimal but realistic example that exercises every major feature of the IOA-ORM. It is designed to be self-contained: a reader who has only skimmed the preceding chapters should be able to follow it end-to-end.


| Dimension | Value |
| --- | --- |
| Course | CS301 — Operating Systems (Year 3 undergraduate) |
| Duration | 10 minutes |
| Language | English |
| Assessment type | Interactive oral — 2 structured questions with follow-ups |
| Examiner persona | Supportive, encouraging; asks for clarification when answers are vague |
| Max follow-ups per question | 2 |
| Candidate commands available | repeat, clarification, raise_hand |

┌─────────┐   ┌──────────────┐   ┌──────────────┐   ┌─────────┐   ┌─────┐
│ OPENING  │──▶│  QUESTION_1  │──▶│  QUESTION_2  │──▶│ CLOSING │──▶│ END │
└─────────┘   └──────────────┘   └──────────────┘   └─────────┘   └─────┘
                  │ (≤2 follow-ups)   │ (≤2 follow-ups)
                  └───────────────────┘
  • LO-1: Explain the role of process scheduling in an OS (Knowledge)
  • LO-2: Compare scheduling algorithms and justify a choice (Analysis / Evaluation)

Lecturer’s view — what the examiner writes in the Assessment Studio.

“Welcome to your oral assessment for CS301. I’ll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let’s begin.”

Question 1 — Process Scheduling (target 3 min, max 4 min)

Stem: “Can you explain what process scheduling means in the context of an operating system, and why it matters?”

Rubric signal: LO-1 — explanation of scheduling concept, mention of preemptive vs cooperative, context switch cost.

Follow-up 1 (if answer is vague or misses preemptive/cooperative): “You mentioned scheduling. Can you elaborate on the difference between preemptive and cooperative scheduling?”

Follow-up 2 (if still incomplete on context switch): “How does context switching fit into this picture?”

Transition condition: The candidate has addressed the scheduling concept and at least one of {preemptive/cooperative, context switch}, or the maximum number of follow-ups is exhausted, or the time budget is exceeded.
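The transition condition above can be read as a small predicate. The sketch below is an illustration only: the `Q1State` shape and the `q1MayAdvance` name are hypothetical, and it follows the authored wording (scheduling concept plus at least one of the two other signals) rather than any compiled expression.

```typescript
// Hypothetical snapshot of what the runtime knows about Question 1.
interface Q1State {
  coveredSchedulingConcept: boolean;
  coveredPreemptiveCooperative: boolean;
  coveredContextSwitch: boolean;
  followUpsUsed: number;
  maxFollowUps: number;
  timeBudgetExceeded: boolean;
}

// True when any of the three authored exit conditions holds.
function q1MayAdvance(state: Q1State): boolean {
  const sufficientEvidence =
    state.coveredSchedulingConcept &&
    (state.coveredPreemptiveCooperative || state.coveredContextSwitch);
  const followUpsExhausted = state.followUpsUsed >= state.maxFollowUps;
  return sufficientEvidence || followUpsExhausted || state.timeBudgetExceeded;
}
```

Keeping the three exit conditions as separate named values makes it easy to log which one actually triggered the transition.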

Question 2 — Scheduling Algorithm Comparison (target 4 min, max 5 min)

Stem: “Suppose you’re designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?”

Rubric signal: LO-2 — comparison of algorithms, awareness of starvation risk, response-time argument.

Follow-up 1 (if candidate doesn’t address starvation): “What problem could arise with the algorithm you didn’t choose?”

Follow-up 2 (if candidate still hasn’t mentioned response time): “From the user’s perspective, how would response time be affected?”

Transition condition: The candidate has compared at least one trade-off, or the maximum number of follow-ups is exhausted, or the time budget is exceeded.

“Thank you. That concludes your oral assessment. Your responses will be reviewed and you’ll receive your results within 5 working days.”


10.3 Compiled Domain Specification (ExamRuntimeIR)

Note on schema conformance: The examples in this section use a simplified authoring-level representation for readability. The canonical TypeScript schema is defined in 02-schema.md. Key field name mappings:

| Example field | Canonical schema field | Section |
| --- | --- | --- |
| type (on nodes) | kind | §3 ExamRuntimeNodeKind |
| questionStem | promptSeed | §4 ExamRuntimeNode |
| timeBudgetSeconds | timeBudgetMs (milliseconds) | §4 |
| followUps (inline array) | followUpPolicy + candidateCommands | §10, §13 |
| guardrails.forbidden | candidateCommands.forbidden | §13 |
| transitionPolicy.allowedTargets | transitions array | §11 |
| irVersion: "1.0.0" | "exam-runtime-ir/0.1" | §09 |

The event stream (§10.5) and evidence ledger (§10.6) use canonical field names.
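To make the renaming concrete, here is a minimal sketch of how an authoring-level node might be converted to the canonical field names. It covers only the three scalar renames (`type`, `questionStem`, `timeBudgetSeconds`). The `AuthoringNode` and `CanonicalNode` interfaces and the `toCanonicalNode` function are hypothetical and are not taken from 02-schema.md.

```typescript
// Hypothetical shapes: only the renamed scalar fields are modelled.
interface AuthoringNode {
  nodeId: string;
  type: string;
  questionStem?: string;
  timeBudgetSeconds?: number;
}

interface CanonicalNode {
  nodeId: string;
  kind: string;
  promptSeed?: string;
  timeBudgetMs?: number;
}

// Renames fields and converts the time budget from seconds to milliseconds.
function toCanonicalNode(node: AuthoringNode): CanonicalNode {
  const canonical: CanonicalNode = { nodeId: node.nodeId, kind: node.type };
  if (node.questionStem !== undefined) {
    canonical.promptSeed = node.questionStem;
  }
  if (node.timeBudgetSeconds !== undefined) {
    canonical.timeBudgetMs = node.timeBudgetSeconds * 1000;
  }
  return canonical;
}

const q1Canonical = toCanonicalNode({
  nodeId: "q1",
  type: "question",
  questionStem: "Can you explain what process scheduling means in the context of an operating system, and why it matters?",
  timeBudgetSeconds: 240,
});
```

The structural mappings (follow-ups, guardrails, transitions) are left out because their canonical shapes are defined elsewhere.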

{
  "irVersion": "exam-runtime-ir/0.1",
  "examId": "cs301-oral-2026s1-001",
  "version": "3",
  "metadata": {
    "courseCode": "CS301",
    "courseName": "Operating Systems",
    "assessmentType": "interactive_oral",
    "durationMinutes": 10,
    "language": "en-GB",
    "examinerPersona": {
      "tone": "supportive_encouraging",
      "style": "asks_for_clarification_when_vague"
    }
  },
  "timeBudget": {
    "totalSeconds": 600,
    "nodeBudgets": {
      "opening": 60,
      "q1": 180,
      "q2": 240,
      "closing": 60,
      "end": 0
    },
    "overrunPolicy": "warn_at_80pct_hard_at_100pct"
  },
  "nodes": [
    {
      "nodeId": "opening",
      "type": "opening",
      "prompt": "Welcome to your oral assessment for CS301. I'll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let's begin.",
      "transitions": [
        {
          "target": "q1",
          "condition": "always",
          "trigger": "examiner_action"
        }
      ]
    },
    {
      "nodeId": "q1",
      "type": "question",
      "questionStem": "Can you explain what process scheduling means in the context of an operating system, and why it matters?",
      "maxFollowUps": 2,
      "timeBudgetSeconds": 240,
      "learningOutcomes": ["LO-1"],
      "evidenceTargets": [
        {
          "id": "ev-q1-scheduling-concept",
          "description": "Explains what process scheduling is",
          "rubric": "Mentions CPU allocation, multiprogramming context",
          "level": "required"
        },
        {
          "id": "ev-q1-preemptive-cooperative",
          "description": "Distinguishes preemptive vs cooperative scheduling",
          "rubric": "Defines both, gives example or explains trade-off",
          "level": "expected"
        },
        {
          "id": "ev-q1-context-switch",
          "description": "Mentions context switch cost or mechanism",
          "rubric": "Explains save/restore state, overhead awareness",
          "level": "expected"
        }
      ],
      "followUps": [
        {
          "followUpId": "q1-fu1",
          "ordinal": 1,
          "prompt": "You mentioned scheduling. Can you elaborate on the difference between preemptive and cooperative scheduling?",
          "triggerCondition": "answer_is_vague OR missing_preemptive_cooperative",
          "evidenceTargets": ["ev-q1-preemptive-cooperative"]
        },
        {
          "followUpId": "q1-fu2",
          "ordinal": 2,
          "prompt": "How does context switching fit into this picture?",
          "triggerCondition": "missing_context_switch",
          "evidenceTargets": ["ev-q1-context-switch"]
        }
      ],
      "transitionPolicy": {
        "allowedTargets": ["q2"],
        "decisionMode": "runtime_approval",
        "conditions": [
          {
            "id": "q1-sufficient",
            "description": "Sufficient evidence collected OR follow-ups exhausted OR time budget exceeded",
            "expression": "(evidence_covered(['ev-q1-scheduling-concept']) AND evidence_covered(['ev-q1-preemptive-cooperative','ev-q1-context-switch'])) OR follow_up_count >= maxFollowUps OR time_budget_exceeded"
          }
        ]
      },
      "guardrails": {
        "forbidden": ["reveal_rubric", "reveal_score", "suggest_answer"],
        "forbidden_topics": ["exam_format_policy", "grading_threshold"],
        "maxCandidateSilenceSeconds": 15,
        "silenceAction": "gentle_prompt"
      }
    },
    {
      "nodeId": "q2",
      "type": "question",
      "questionStem": "Suppose you're designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?",
      "maxFollowUps": 2,
      "timeBudgetSeconds": 300,
      "learningOutcomes": ["LO-2"],
      "evidenceTargets": [
        {
          "id": "ev-q2-algorithm-choice",
          "description": "Chooses an algorithm and provides a rationale",
          "rubric": "Names algorithm, links to interactive desktop context",
          "level": "required"
        },
        {
          "id": "ev-q2-starvation",
          "description": "Addresses starvation risk of the other algorithm",
          "rubric": "Explains SJF starvation scenario or RR fairness trade-off",
          "level": "expected"
        },
        {
          "id": "ev-q2-response-time",
          "description": "Considers response time from user perspective",
          "rubric": "Mentions interactive responsiveness, jitter, or latency",
          "level": "expected"
        }
      ],
      "followUps": [
        {
          "followUpId": "q2-fu1",
          "ordinal": 1,
          "prompt": "What problem could arise with the algorithm you didn't choose?",
          "triggerCondition": "missing_starvation",
          "evidenceTargets": ["ev-q2-starvation"]
        },
        {
          "followUpId": "q2-fu2",
          "ordinal": 2,
          "prompt": "From the user's perspective, how would response time be affected?",
          "triggerCondition": "missing_response_time",
          "evidenceTargets": ["ev-q2-response-time"]
        }
      ],
      "transitionPolicy": {
        "allowedTargets": ["closing"],
        "decisionMode": "runtime_approval",
        "conditions": [
          {
            "id": "q2-sufficient",
            "description": "Sufficient evidence collected OR follow-ups exhausted OR time budget exceeded",
            "expression": "(evidence_covered(['ev-q2-algorithm-choice']) AND evidence_covered(['ev-q2-starvation','ev-q2-response-time'])) OR follow_up_count >= maxFollowUps OR time_budget_exceeded"
          }
        ]
      },
      "guardrails": {
        "forbidden": ["reveal_rubric", "reveal_score", "suggest_answer"],
        "forbidden_topics": ["exam_format_policy", "grading_threshold"],
        "maxCandidateSilenceSeconds": 15,
        "silenceAction": "gentle_prompt"
      }
    },
    {
      "nodeId": "closing",
      "type": "closing",
      "prompt": "Thank you. That concludes your oral assessment. Your responses will be reviewed and you'll receive your results within 5 working days.",
      "transitions": [
        {
          "target": "end",
          "condition": "always",
          "trigger": "examiner_action"
        }
      ]
    },
    {
      "nodeId": "end",
      "type": "end",
      "postExamAction": "trigger_marking_runtime"
    }
  ],
  "candidateCommands": {
    "repeat": {
      "description": "Candidate asks examiner to repeat the current question or last statement",
      "runtimeAction": "re_prompt_current",
      "costsFollowUp": false,
      "maxPerNode": 3
    },
    "clarification": {
      "description": "Candidate asks for clarification of a term or concept",
      "runtimeAction": "llm_provides_clarification_within_guardrails",
      "costsFollowUp": false,
      "maxPerNode": 2,
      "guardrail": "must_not_reveal_answer_or_rubric"
    },
    "raise_hand": {
      "description": "Candidate signals they need a moment (pause timer)",
      "runtimeAction": "pause_time_budget",
      "costsFollowUp": false,
      "maxPerNode": 2,
      "pauseDurationSeconds": 10
    }
  }
}
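A specification like the one above lends itself to simple structural checks before it is handed to an adapter. The sketch below is a hypothetical validator, not part of the IOA-ORM: it confirms that every transition target names an existing node and that the per-node budgets fit within `totalSeconds`. The `SpecSlice` type models only the fields those two checks need.

```typescript
interface SpecNode {
  nodeId: string;
  transitions?: { target: string }[];
  transitionPolicy?: { allowedTargets: string[] };
}

interface SpecSlice {
  timeBudget: { totalSeconds: number; nodeBudgets: Record<string, number> };
  nodes: SpecNode[];
}

// Returns a list of human-readable problems; an empty list means both checks passed.
function validateSpec(spec: SpecSlice): string[] {
  const found: string[] = [];
  const ids = new Set(spec.nodes.map((n) => n.nodeId));
  for (const node of spec.nodes) {
    const targets = [
      ...(node.transitions ?? []).map((t) => t.target),
      ...(node.transitionPolicy?.allowedTargets ?? []),
    ];
    for (const target of targets) {
      if (!ids.has(target)) found.push(`${node.nodeId}: unknown target "${target}"`);
    }
  }
  let budgeted = 0;
  for (const key of Object.keys(spec.timeBudget.nodeBudgets)) {
    budgeted += spec.timeBudget.nodeBudgets[key] ?? 0;
  }
  if (budgeted > spec.timeBudget.totalSeconds) {
    found.push(`node budgets (${budgeted}s) exceed total (${spec.timeBudget.totalSeconds}s)`);
  }
  return found;
}

// The graph and budgets of the CS301 example, reduced to the checked fields.
const cs301: SpecSlice = {
  timeBudget: {
    totalSeconds: 600,
    nodeBudgets: { opening: 60, q1: 180, q2: 240, closing: 60, end: 0 },
  },
  nodes: [
    { nodeId: "opening", transitions: [{ target: "q1" }] },
    { nodeId: "q1", transitionPolicy: { allowedTargets: ["q2"] } },
    { nodeId: "q2", transitionPolicy: { allowedTargets: ["closing"] } },
    { nodeId: "closing", transitions: [{ target: "end" }] },
    { nodeId: "end" },
  ],
};
const cs301Problems = validateSpec(cs301);
```

For this example the node budgets sum to 540 seconds against a total of 600, so both checks pass.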

The adapter compiles the domain specification into a Pipecat FlowManager-compatible configuration. The runtime controller layer sits between them, so this is an intermediate representation — not the source of truth.

{
  "flow": {
    "initial_node": "opening",
    "nodes": {
      "opening": {
        "task_messages": [
          {
            "role": "system",
            "content": "You are an oral exam examiner. Be supportive and encouraging. ..."
          },
          {
            "role": "assistant",
            "content": "Welcome to your oral assessment for CS301. I'll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let's begin."
          }
        ],
        "pre_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_entered", "nodeId": "opening" }
          }
        ],
        "post_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_exited", "nodeId": "opening" }
          }
        ],
        "edges": [
          {
            "target": "q1",
            "transition_to": "q1",
            "interruptible": false
          }
        ]
      },
      "q1": {
        "task_messages": [
          {
            "role": "system",
            "content": "You are an oral exam examiner assessing LO-1: Process scheduling. Ask the stem question, then listen carefully. You may ask up to 2 follow-ups if the answer is vague or incomplete. Never reveal the rubric or suggest answers. If the candidate says 'repeat', re-ask the question. If they say 'clarification', explain a term without giving the answer. If they say 'raise_hand', wait 10 seconds. ..."
          },
          {
            "role": "assistant",
            "content": "Can you explain what process scheduling means in the context of an operating system, and why it matters?"
          }
        ],
        "pre_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_entered", "nodeId": "q1" }
          },
          {
            "type": "set_runtime_state",
            "key": "currentNode.followUpCount",
            "value": 0
          }
        ],
        "post_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_exited", "nodeId": "q1" }
          }
        ],
        "edges": [
          {
            "target": "q2",
            "transition_to": "q2",
            "interruptible": false,
            "condition": "runtime_approves_transition"
          }
        ],
        "runtime_config": {
          "maxFollowUps": 2,
          "timeBudgetSeconds": 240,
          "evidenceTargets": [
            "ev-q1-scheduling-concept",
            "ev-q1-preemptive-cooperative",
            "ev-q1-context-switch"
          ]
        }
      },
      "q2": {
        "task_messages": [
          {
            "role": "system",
            "content": "You are an oral exam examiner assessing LO-2: Scheduling algorithm comparison. Ask the stem question. You may ask up to 2 follow-ups. ..."
          },
          {
            "role": "assistant",
            "content": "Suppose you're designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?"
          }
        ],
        "pre_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_entered", "nodeId": "q2" }
          },
          {
            "type": "set_runtime_state",
            "key": "currentNode.followUpCount",
            "value": 0
          }
        ],
        "post_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_exited", "nodeId": "q2" }
          }
        ],
        "edges": [
          {
            "target": "closing",
            "transition_to": "closing",
            "interruptible": false,
            "condition": "runtime_approves_transition"
          }
        ],
        "runtime_config": {
          "maxFollowUps": 2,
          "timeBudgetSeconds": 300,
          "evidenceTargets": [
            "ev-q2-algorithm-choice",
            "ev-q2-starvation",
            "ev-q2-response-time"
          ]
        }
      },
      "closing": {
        "task_messages": [
          {
            "role": "system",
            "content": "You are concluding the oral exam. Deliver the closing statement. Do not discuss performance."
          },
          {
            "role": "assistant",
            "content": "Thank you. That concludes your oral assessment. Your responses will be reviewed and you'll receive your results within 5 working days."
          }
        ],
        "pre_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_entered", "nodeId": "closing" }
          }
        ],
        "post_actions": [
          {
            "type": "emit_event",
            "event": { "type": "node_exited", "nodeId": "closing" }
          }
        ],
        "edges": [
          {
            "target": "end",
            "transition_to": "end",
            "interruptible": false
          }
        ]
      },
      "end": {
        "task_messages": [],
        "post_actions": [
          {
            "type": "emit_event",
            "event": { "type": "exam_completed" }
          },
          {
            "type": "trigger_marking_runtime",
            "payload": { "source": "exam_runtime" }
          }
        ]
      }
    }
  }
}
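The question-node entries above follow a regular pattern, so the adapter step can be pictured as a pure function from a domain node to a flow node. The sketch below is an assumption-laden illustration: `QuestionNodeSpec` and `compileQuestionNode` are hypothetical names, the output simply mirrors the JSON shape shown above, and nothing here calls or reproduces the actual Pipecat API.

```typescript
// Hypothetical slice of a question node from the domain specification.
interface QuestionNodeSpec {
  nodeId: string;
  questionStem: string;
  maxFollowUps: number;
  timeBudgetSeconds: number;
  evidenceTargets: { id: string }[];
  transitionPolicy: { allowedTargets: string[] };
}

// Builds a flow-node object shaped like the "q1" and "q2" entries above.
function compileQuestionNode(spec: QuestionNodeSpec, systemPrompt: string) {
  return {
    task_messages: [
      { role: "system", content: systemPrompt },
      { role: "assistant", content: spec.questionStem },
    ],
    pre_actions: [
      { type: "emit_event", event: { type: "node_entered", nodeId: spec.nodeId } },
      { type: "set_runtime_state", key: "currentNode.followUpCount", value: 0 },
    ],
    post_actions: [
      { type: "emit_event", event: { type: "node_exited", nodeId: spec.nodeId } },
    ],
    // Every edge is gated on runtime approval; the LLM never decides alone.
    edges: spec.transitionPolicy.allowedTargets.map((target) => ({
      target,
      transition_to: target,
      interruptible: false,
      condition: "runtime_approves_transition",
    })),
    runtime_config: {
      maxFollowUps: spec.maxFollowUps,
      timeBudgetSeconds: spec.timeBudgetSeconds,
      evidenceTargets: spec.evidenceTargets.map((t) => t.id),
    },
  };
}

const compiledQ1 = compileQuestionNode(
  {
    nodeId: "q1",
    questionStem: "Can you explain what process scheduling means in the context of an operating system, and why it matters?",
    maxFollowUps: 2,
    timeBudgetSeconds: 240,
    evidenceTargets: [
      { id: "ev-q1-scheduling-concept" },
      { id: "ev-q1-preemptive-cooperative" },
      { id: "ev-q1-context-switch" },
    ],
    transitionPolicy: { allowedTargets: ["q2"] },
  },
  "You are an oral exam examiner assessing LO-1: Process scheduling."
);
```

Because the function is pure, the compiled configuration can be regenerated from the domain specification at any time, which fits the statement that the flow configuration is not the source of truth.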

The following is the chronological event stream for a realistic exam session. Timestamps are relative to exam start (T=0). Some events are omitted for brevity; each omission shows up as a jump in the timestamps.

T+0.000s  bot_ready
            { "examId": "cs301-oral-2026s1-001", "sessionId": "sess-7a3f" }

T+0.120s  node_entered
            { "nodeId": "opening", "nodeType": "opening" }

T+0.200s  transcript_delta
            { "nodeId": "opening", "speaker": "examiner",
              "text": "Welcome to your oral assessment for CS301.", "isFinal": false }

T+2.400s  transcript_final
            { "nodeId": "opening", "speaker": "examiner",
              "text": "Welcome to your oral assessment for CS301. I'll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let's begin.",
              "spanId": "sp-001" }

T+2.401s  node_exited
            { "nodeId": "opening" }

T+2.402s  node_entered
            { "nodeId": "q1", "nodeType": "question" }

T+2.500s  node_progress
            { "nodeId": "q1", "followUpCount": 0, "maxFollowUps": 2,
              "evidenceCovered": [], "timeBudgetRemainingSeconds": 240 }

T+2.600s  transcript_delta
            { "nodeId": "q1", "speaker": "examiner",
              "text": "Can you explain what process scheduling means", "isFinal": false }

T+5.800s  transcript_final
            { "nodeId": "q1", "speaker": "examiner",
              "text": "Can you explain what process scheduling means in the context of an operating system, and why it matters?",
              "spanId": "sp-002" }

T+8.200s  transcript_delta
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Process scheduling is when the OS decides", "isFinal": false }

T+14.600s transcript_final
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Process scheduling is when the operating system decides which process gets to use the CPU at any given time. It's important because there are usually more processes than CPUs, so the OS has to manage sharing. It also helps with making sure important tasks get done first.",
              "spanId": "sp-003" }

T+14.700s evidence_signal
            { "nodeId": "q1", "evidenceTargetId": "ev-q1-scheduling-concept",
              "transcriptSpanId": "sp-003",
              "signal": "covered",
              "confidence": 0.92,
              "rationale": "Candidate described CPU allocation and multiprogramming context." }

T+14.701s node_progress
            { "nodeId": "q1", "followUpCount": 0, "maxFollowUps": 2,
              "evidenceCovered": ["ev-q1-scheduling-concept"],
              "timeBudgetRemainingSeconds": 225 }

T+14.800s transition_decision
            { "nodeId": "q1", "decision": "follow_up",
              "reason": "Evidence for ev-q1-preemptive-cooperative and ev-q1-context-switch not yet covered. Follow-up 1 trigger condition 'missing_preemptive_cooperative' met.",
              "followUpOrdinal": 1 }

T+15.000s transcript_delta
            { "nodeId": "q1", "speaker": "examiner",
              "text": "You mentioned scheduling. Can you elaborate", "isFinal": false }

T+17.200s transcript_final
            { "nodeId": "q1", "speaker": "examiner",
              "text": "You mentioned scheduling. Can you elaborate on the difference between preemptive and cooperative scheduling?",
              "spanId": "sp-004" }

T+17.300s node_progress
            { "nodeId": "q1", "followUpCount": 1, "maxFollowUps": 2,
              "evidenceCovered": ["ev-q1-scheduling-concept"],
              "timeBudgetRemainingSeconds": 223 }

T+19.800s transcript_delta
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Sorry, can you repeat the question?", "isFinal": false }

T+21.400s transcript_final
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Sorry, can you repeat the question?",
              "spanId": "sp-005" }

T+21.500s candidate_command
            { "nodeId": "q1", "command": "repeat",
              "triggeredBy": "candidate_utterance",
              "rawText": "Sorry, can you repeat the question?",
              "costsFollowUp": false,
              "followUpCountAfter": 1 }

T+21.600s transcript_delta
            { "nodeId": "q1", "speaker": "examiner",
              "text": "Of course. I'm asking about the difference", "isFinal": false }

T+23.800s transcript_final
            { "nodeId": "q1", "speaker": "examiner",
              "text": "Of course. I'm asking about the difference between preemptive and cooperative scheduling. In preemptive scheduling, the OS can interrupt a running process. In cooperative scheduling, the process must voluntarily yield. Can you tell me more about that?",
              "spanId": "sp-006" }

T+23.900s node_progress
            { "nodeId": "q1", "followUpCount": 1, "maxFollowUps": 2,
              "evidenceCovered": ["ev-q1-scheduling-concept"],
              "timeBudgetRemainingSeconds": 216 }

T+27.100s transcript_delta
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Right, so in preemptive the OS can stop", "isFinal": false }

T+36.500s transcript_final
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Right, so in preemptive scheduling the OS can stop a process at any time and switch to another one. This is what most modern OSes like Linux and Windows use. Cooperative scheduling is where the process has to give up control itself, like in older versions of Windows where if a program froze, the whole system could hang.",
              "spanId": "sp-007" }

T+36.600s evidence_signal
            { "nodeId": "q1", "evidenceTargetId": "ev-q1-preemptive-cooperative",
              "transcriptSpanId": "sp-007",
              "signal": "covered",
              "confidence": 0.95,
              "rationale": "Defined both types with real-world OS examples." }

T+36.601s node_progress
            { "nodeId": "q1", "followUpCount": 1, "maxFollowUps": 2,
              "evidenceCovered": ["ev-q1-scheduling-concept", "ev-q1-preemptive-cooperative"],
              "timeBudgetRemainingSeconds": 203 }

T+36.700s transition_decision
            { "nodeId": "q1", "decision": "follow_up",
              "reason": "ev-q1-context-switch still not covered. Follow-up 2 trigger condition 'missing_context_switch' met.",
              "followUpOrdinal": 2 }

T+37.000s transcript_final
            { "nodeId": "q1", "speaker": "examiner",
              "text": "Great examples. How does context switching fit into this picture?",
              "spanId": "sp-008" }

T+37.100s node_progress
            { "nodeId": "q1", "followUpCount": 2, "maxFollowUps": 2,
              "evidenceCovered": ["ev-q1-scheduling-concept", "ev-q1-preemptive-cooperative"],
              "timeBudgetRemainingSeconds": 203 }

T+42.300s transcript_final
            { "nodeId": "q1", "speaker": "candidate",
              "text": "Context switching is when the OS saves the state of the current process and loads the state of the next one. It's the mechanism that makes preemptive scheduling possible. But it has overhead — saving registers, updating memory maps — so you don't want to do it too frequently.",
              "spanId": "sp-009" }

T+42.400s evidence_signal
            { "nodeId": "q1", "evidenceTargetId": "ev-q1-context-switch",
              "transcriptSpanId": "sp-009",
              "signal": "covered",
              "confidence": 0.90,
              "rationale": "Described save/restore mechanism and overhead awareness." }

T+42.401s node_progress
            { "nodeId": "q1", "followUpCount": 2, "maxFollowUps": 2,
              "evidenceCovered": ["ev-q1-scheduling-concept", "ev-q1-preemptive-cooperative", "ev-q1-context-switch"],
              "timeBudgetRemainingSeconds": 197 }

T+42.500s transition_decision
            { "nodeId": "q1", "decision": "move_to_next_node",
              "reason": "All expected evidence covered. Transition condition 'q1-sufficient' satisfied.",
              "targetNodeId": "q2" }

T+42.600s node_exited
            { "nodeId": "q1" }

T+42.700s node_entered
            { "nodeId": "q2", "nodeType": "question" }

T+42.800s node_progress
            { "nodeId": "q2", "followUpCount": 0, "maxFollowUps": 2,
              "evidenceCovered": [], "timeBudgetRemainingSeconds": 240 }

T+43.000s transcript_final
            { "nodeId": "q2", "speaker": "examiner",
              "text": "Suppose you're designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?",
              "spanId": "sp-010" }



T+120.400s candidate_command
            { "nodeId": "q2", "command": "raise_hand",
              "triggeredBy": "candidate_utterance",
              "rawText": "Can I have a moment to think?",
              "costsFollowUp": false,
              "followUpCountAfter": 0,
              "pauseDurationSeconds": 10 }

T+120.500s time_budget_paused
            { "nodeId": "q2", "pauseUntil": "T+130.400" }

T+130.400s time_budget_resumed
            { "nodeId": "q2", "timeBudgetRemainingSeconds": 170 }



T+240.000s time_budget_exceeded
            { "nodeId": "q2" }

T+240.100s transition_decision
            { "nodeId": "q2", "decision": "move_to_next_node",
              "reason": "Time budget exceeded. Evidence collected: ev-q2-algorithm-choice (covered), ev-q2-starvation (covered), ev-q2-response-time (not covered). Hard move enforced by overrun policy.",
              "targetNodeId": "closing" }

T+240.200s node_exited
            { "nodeId": "q2" }

T+240.300s node_entered
            { "nodeId": "closing", "nodeType": "closing" }

T+241.000s transcript_final
            { "nodeId": "closing", "speaker": "examiner",
              "text": "Thank you. That concludes your oral assessment. Your responses will be reviewed and you'll receive your results within 5 working days.",
              "spanId": "sp-020" }

T+243.000s node_exited
            { "nodeId": "closing" }

T+243.100s node_entered
            { "nodeId": "end", "nodeType": "end" }

T+243.200s exam_completed
            { "examId": "cs301-oral-2026s1-001", "sessionId": "sess-7a3f",
              "totalDurationSeconds": 243.2,
              "nodesVisited": ["opening", "q1", "q2", "closing", "end"],
              "totalFollowUpsUsed": 3 }

The evidence ledger is populated from evidence_signal events and persisted by the event store. It is the primary input to the marking runtime.

{
  "examId": "cs301-oral-2026s1-001",
  "sessionId": "sess-7a3f",
  "candidateId": "stu-202400042",
  "generatedAt": "2026-05-06T02:07:43.200Z",
  "entries": [
    {
      "evidenceTargetId": "ev-q1-scheduling-concept",
      "nodeId": "q1",
      "learningOutcome": "LO-1",
      "signal": "covered",
      "confidence": 0.92,
      "transcriptSpanIds": ["sp-003"],
      "transcriptExcerpt": "Process scheduling is when the operating system decides which process gets to use the CPU at any given time. It's important because there are usually more processes than CPUs, so the OS has to manage sharing.",
      "rationale": "Candidate described CPU allocation and multiprogramming context.",
      "timestamp": "T+14.700s"
    },
    {
      "evidenceTargetId": "ev-q1-preemptive-cooperative",
      "nodeId": "q1",
      "learningOutcome": "LO-1",
      "signal": "covered",
      "confidence": 0.95,
      "transcriptSpanIds": ["sp-007"],
      "transcriptExcerpt": "In preemptive scheduling the OS can stop a process at any time and switch to another one. This is what most modern OSes like Linux and Windows use. Cooperative scheduling is where the process has to give up control itself.",
      "rationale": "Defined both types with real-world OS examples.",
      "timestamp": "T+36.600s"
    },
    {
      "evidenceTargetId": "ev-q1-context-switch",
      "nodeId": "q1",
      "learningOutcome": "LO-1",
      "signal": "covered",
      "confidence": 0.90,
      "transcriptSpanIds": ["sp-009"],
      "transcriptExcerpt": "Context switching is when the OS saves the state of the current process and loads the state of the next one. It's the mechanism that makes preemptive scheduling possible. But it has overhead.",
      "rationale": "Described save/restore mechanism and overhead awareness.",
      "timestamp": "T+42.400s"
    },
    {
      "evidenceTargetId": "ev-q2-algorithm-choice",
      "nodeId": "q2",
      "learningOutcome": "LO-2",
      "signal": "covered",
      "confidence": 0.88,
      "transcriptSpanIds": ["sp-012"],
      "transcriptExcerpt": "I would choose Round Robin because it gives each process a fair time slice, which means the system stays responsive to user input even under load.",
      "rationale": "Named algorithm, linked to interactive context.",
      "timestamp": "T+62.300s"
    },
    {
      "evidenceTargetId": "ev-q2-starvation",
      "nodeId": "q2",
      "learningOutcome": "LO-2",
      "signal": "covered",
      "confidence": 0.85,
      "transcriptSpanIds": ["sp-015"],
      "transcriptExcerpt": "SJF can lead to starvation if short jobs keep arriving — a long job might never get scheduled. That's why it's not great for interactive systems where any task could need attention.",
      "rationale": "Explained starvation scenario linked to interactive context.",
      "timestamp": "T+98.700s"
    },
    {
      "evidenceTargetId": "ev-q2-response-time",
      "nodeId": "q2",
      "learningOutcome": "LO-2",
      "signal": "not_covered",
      "confidence": null,
      "transcriptSpanIds": [],
      "transcriptExcerpt": null,
      "rationale": "Time budget exhausted before evidence could be collected.",
      "timestamp": "T+240.100s"
    }
  ],
  "summary": {
    "totalTargets": 6,
    "covered": 5,
    "notCovered": 1,
    "coverageRate": 0.833
  }
}
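The `summary` block above can be derived mechanically from the entries. The following sketch shows one way to do that; the `LedgerEntry` type is reduced to the two fields the calculation needs, the `summarise` name is hypothetical, and rounding the rate to three decimal places is an assumption inferred from the value 0.833 shown above.

```typescript
type Signal = "covered" | "not_covered";

// Reduced, hypothetical view of a ledger entry.
interface LedgerEntry {
  evidenceTargetId: string;
  signal: Signal;
}

// Counts covered targets and computes the coverage rate (three decimals, assumed).
function summarise(entries: LedgerEntry[]) {
  const totalTargets = entries.length;
  const covered = entries.filter((e) => e.signal === "covered").length;
  const coverageRate =
    totalTargets === 0 ? 0 : Math.round((covered / totalTargets) * 1000) / 1000;
  return { totalTargets, covered, notCovered: totalTargets - covered, coverageRate };
}

const ledgerSummary = summarise([
  { evidenceTargetId: "ev-q1-scheduling-concept", signal: "covered" },
  { evidenceTargetId: "ev-q1-preemptive-cooperative", signal: "covered" },
  { evidenceTargetId: "ev-q1-context-switch", signal: "covered" },
  { evidenceTargetId: "ev-q2-algorithm-choice", signal: "covered" },
  { evidenceTargetId: "ev-q2-starvation", signal: "covered" },
  { evidenceTargetId: "ev-q2-response-time", signal: "not_covered" },
]);
```

With five of six targets covered, this yields the same counts and rate as the ledger's own summary.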

Scenario: Candidate asks to repeat the question

The candidate says: “Sorry, can you repeat the question?”

Runtime behaviour:

  1. STT produces transcript_final with the candidate’s utterance (span sp-005).
  2. The runtime controller’s command classifier detects the repeat intent.
  3. A candidate_command event is emitted (see §10.5, T+21.500s).
  4. The runtime does NOT increment followUpCount: a repeat request is not a substantive answer attempt.
  5. The LLM re-asks the current follow-up prompt with slight rephrasing.
  6. A new transcript_final is emitted for the examiner’s re-prompt (span sp-006).

Key invariant: followUpCount remains at 1 after the repeat. The candidate’s ability to answer is not penalised by a repeat request.
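The invariant can be checked by replaying a node's events and counting only those that consume a follow-up. The sketch below assumes a simplified event model: the `NodeEvent` union and the `followUpCount` function are hypothetical, and only the fields needed for counting are included.

```typescript
// Simplified, hypothetical view of the two event types that matter for counting.
type NodeEvent =
  | { type: "transition_decision"; decision: "follow_up" | "move_to_next_node" }
  | {
      type: "candidate_command";
      command: "repeat" | "clarification" | "raise_hand";
      costsFollowUp: boolean;
    };

// A follow-up is consumed by a follow_up decision, or by a command flagged as costing one.
function followUpCount(events: NodeEvent[]): number {
  let count = 0;
  for (const ev of events) {
    if (ev.type === "transition_decision" && ev.decision === "follow_up") count += 1;
    if (ev.type === "candidate_command" && ev.costsFollowUp) count += 1;
  }
  return count;
}

// Q1 up to the repeat request: one follow-up decision, then the repeat command.
const q1UpToRepeat: NodeEvent[] = [
  { type: "transition_decision", decision: "follow_up" },
  { type: "candidate_command", command: "repeat", costsFollowUp: false },
];
const countAfterRepeat = followUpCount(q1UpToRepeat);
```

Here `countAfterRepeat` is 1, matching `followUpCountAfter` in the `candidate_command` event.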

Scenario: Candidate asks for clarification of a term

The candidate says: “What do you mean by starvation?”

Runtime behaviour:

  1. Command classifier detects clarification intent.
  2. candidate_command event emitted with command: "clarification".
  3. LLM provides a brief, rubric-safe explanation — e.g., “Starvation means a process waits indefinitely because shorter jobs keep getting priority.”
  4. The LLM must NOT say “That’s exactly what I’m looking for” or hint at the rubric (guardrail enforced).
  5. followUpCount is NOT incremented.
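
The text does not say how the command classifier works. As a rough illustration only, a rule-based version might look like the sketch below. The phrase patterns and the function name `classify_command` are assumptions, and a production classifier would likely be model-based.

```python
import re

# Hypothetical phrase patterns, chosen to match the two example utterances.
COMMAND_PATTERNS = [
    ("repeat", re.compile(r"\b(repeat|say that again)\b", re.IGNORECASE)),
    ("clarification", re.compile(r"\bwhat do you mean\b", re.IGNORECASE)),
]


def classify_command(utterance: str):
    """Return the first matching command name, or None for a normal answer."""
    for command, pattern in COMMAND_PATTERNS:
        if pattern.search(utterance):
            return command
    return None


classify_command("Sorry, can you repeat the question?")   # "repeat"
classify_command("What do you mean by starvation?")       # "clarification"
```

An utterance that matches no pattern returns None and is treated as a normal answer attempt.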

Scenario: LLM attempts to reveal the rubric

During Q1 follow-up, the LLM’s generated response includes:

“That’s a good point! You’ve actually covered the preemptive vs cooperative distinction, which is one of the key rubric items for this question.”

Runtime controller response:

  1. The LLM output is intercepted by the guardrail layer before TTS delivery.
  2. The guardrail checks against forbidden: ["reveal_rubric"].
  3. The response is blocked — TTS does not play it to the candidate.
  4. A guardrail_violation event is emitted:
{
  "type": "guardrail_violation",
  "nodeId": "q1",
  "rule": "reveal_rubric",
  "severity": "blocked",
  "originalText": "That's a good point! You've actually covered the preemptive vs cooperative distinction, which is one of the key rubric items for this question.",
  "replacementAction": "regenerate_response",
  "timestamp": "T+38.200s"
}
  5. The LLM is prompted to regenerate without rubric references.
  6. The regenerated response plays to the candidate.
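
The interception step can be sketched as a check that runs before any text reaches TTS. How the guardrail layer detects a rubric reveal is not specified, so the keyword pattern below is purely an assumption, as is the function name `check_guardrails`. The event fields mirror the guardrail_violation event shown in this scenario.

```python
import re

# Hypothetical detection patterns per forbidden rule; real detection is unspecified.
RULE_PATTERNS = {
    "reveal_rubric": re.compile(r"\brubric\b|\bmarking criteria\b", re.IGNORECASE),
}


def check_guardrails(node_id, text, forbidden, timestamp):
    """Return a guardrail_violation event for the first rule hit, else None."""
    for rule in forbidden:
        pattern = RULE_PATTERNS.get(rule)
        if pattern and pattern.search(text):
            return {
                "type": "guardrail_violation",
                "nodeId": node_id,
                "rule": rule,
                "severity": "blocked",
                "originalText": text,
                "replacementAction": "regenerate_response",
                "timestamp": timestamp,
            }
    return None


event = check_guardrails(
    "q1",
    "That's one of the key rubric items for this question.",
    ["reveal_rubric"],
    "T+38.200s",
)
# event is a dict here, so the response would be blocked and regenerated
```

A None result means the text may proceed to TTS.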

Scenario: LLM attempts to transition to an unauthorised node

The LLM, after Q1, attempts to generate text that would skip Q2 and go to closing:

“Excellent work! Let’s wrap up the assessment.”

Runtime controller response:

  1. The runtime detects the LLM is attempting to exit Q1.
  2. Q1’s transitionPolicy.allowedTargets is ["q2"].
  3. The LLM does not have authority to decide the transition — only the runtime can approve move_to_next_node.
  4. The runtime blocks the premature closing and injects the Q2 stem:
{
  "type": "guardrail_violation",
  "nodeId": "q1",
  "rule": "unauthorized_transition",
  "severity": "blocked",
  "originalIntent": "move_to_closing",
  "allowedTargets": ["q2"],
  "replacementAction": "inject_next_question",
  "timestamp": "T+42.800s"
}
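
The authority check in this scenario reduces to a membership test against allowedTargets. The sketch below is a hypothetical illustration. The function name `review_transition`, the `target` field on the approval event, and the rule that the intent name is "move_to_" plus the target are assumptions inferred from the event above.

```python
def review_transition(node_id, requested_target, allowed_targets, timestamp):
    """Approve a requested transition or return a guardrail_violation event."""
    if requested_target in allowed_targets:
        return {
            "type": "transition_decision",
            "nodeId": node_id,
            "decision": "move_to_next_node",
            "target": requested_target,  # hypothetical field
            "timestamp": timestamp,
        }
    return {
        "type": "guardrail_violation",
        "nodeId": node_id,
        "rule": "unauthorized_transition",
        "severity": "blocked",
        "originalIntent": f"move_to_{requested_target}",
        "allowedTargets": list(allowed_targets),
        "replacementAction": "inject_next_question",
        "timestamp": timestamp,
    }


blocked = review_transition("q1", "closing", ["q2"], "T+42.800s")
# blocked["rule"] is "unauthorized_transition" because "closing" is not in ["q2"]
```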

After exam_completed fires, the marking runtime receives a structured input package. Below is a simplified excerpt.

{
  "inputVersion": "1.0.0",
  "examId": "cs301-oral-2026s1-001",
  "sessionId": "sess-7a3f",
  "candidateId": "stu-202400042",
  "examRuntimeVersion": "1.0.0",
  "evidenceLedger": {
    // ... full ledger as shown in §10.6 ...
  },
  "transcript": {
    "totalSpans": 20,
    "fullText": "…", // concatenated transcript with span IDs
    "spans": [
      // Each span: { spanId, nodeId, speaker, text, startTime, endTime }
    ]
  },
  "runtimeAudit": {
    "nodesVisited": ["opening", "q1", "q2", "closing", "end"],
    "totalDurationSeconds": 243.2,
    "followUpsUsed": {
      "q1": 2,
      "q2": 1
    },
    "transitionDecisions": [
      {
        "nodeId": "q1",
        "decision": "move_to_next_node",
        "reason": "All expected evidence covered.",
        "timestamp": "T+42.500s"
      },
      {
        "nodeId": "q2",
        "decision": "move_to_next_node",
        "reason": "Time budget exceeded.",
        "timestamp": "T+240.100s"
      }
    ],
    "candidateCommandsUsed": [
      { "nodeId": "q1", "command": "repeat", "timestamp": "T+21.500s" },
      { "nodeId": "q2", "command": "raise_hand", "timestamp": "T+120.400s" }
    ],
    "guardrailViolations": []
  },
  "irSnapshot": {
    // Frozen copy of the ExamRuntimeIR used for this session
    // Enables marking to reference the exact rubric/evidence targets
    // that were active during the exam
  }
}

Key properties of the marking input:

  • Evidence ledger is pre-populated with LLM confidence scores and transcript excerpts. The marking runtime may confirm, override, or supplement these.
  • Transcript spans are linked to evidence targets, enabling human markers to verify the AI’s evidence assessment.
  • Runtime audit provides context: why transitions happened, which commands the candidate used, and whether any guardrail violations occurred.
  • IR snapshot freezes the assessment definition so that marking always references the exact rubric that was in effect during the exam.
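
Because spans are linked to evidence targets, a marking runtime could sanity-check the package before use. The sketch below flags evidence entries that cite span IDs missing from the transcript. It assumes the ledger exposes its entries under an "entries" key (that key name is not visible in this excerpt). `find_unlinked_evidence` and the ID "ev-example" are hypothetical.

```python
def find_unlinked_evidence(marking_input):
    """Return evidenceTargetIds that cite spans absent from the transcript.

    Assumes evidenceLedger holds an "entries" list (hypothetical key name).
    """
    known = {span["spanId"] for span in marking_input["transcript"]["spans"]}
    problems = []
    for entry in marking_input["evidenceLedger"]["entries"]:
        missing = [s for s in entry["transcriptSpanIds"] if s not in known]
        if missing:
            problems.append(entry["evidenceTargetId"])
    return problems


package = {
    "transcript": {"spans": [{"spanId": "sp-005"}]},
    "evidenceLedger": {
        "entries": [
            # A not_covered target has no spans and is never flagged.
            {"evidenceTargetId": "ev-q2-response-time", "transcriptSpanIds": []},
            # Illustrative entry citing a span that does not exist.
            {"evidenceTargetId": "ev-example", "transcriptSpanIds": ["sp-005", "sp-999"]},
        ]
    },
}
unlinked = find_unlinked_evidence(package)
```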

| Feature | Where Demonstrated |
| --- | --- |
| Node progress updates | node_progress events throughout §10.5 |
| Follow-up count runtime control | Q1 follow-ups 1→2, never exceeding maxFollowUps: 2 |
| Candidate repeat ≠ follow-up | T+21.500s candidate_command with costsFollowUp: false |
| Evidence signal → transcript span | evidence_signal with transcriptSpanId references |
| Transition decision with reason | transition_decision events at T+14.800s, T+36.700s, T+42.500s, T+240.100s |
| exam_completed → marking runtime | §10.9 structured input package |
| Guardrail blocking rubric reveal | §10.8 first scenario |
| Guardrail blocking unauthorised transition | §10.8 second scenario |
| Time budget enforcement | Q2 time budget exceeded at T+240.000s |

10.11 INFOSYS110 — Full IOA Example (University of Auckland)

Based on the INFOSYS110 Interactive Oral Assessment Handbook. This example demonstrates a multi-segment, scenario-based IOA for a first-year Business Information Systems course. It exercises persona consistency, rubric-level nudging, transversal skills, scaffolding, and the report_observation protocol.

| Dimension | Value |
| --- | --- |
| Course | INFOSYS 110 — Business Information Systems (Stage I) |
| Duration | 20 minutes (4 segments × ~5 min each) |
| Language | English |
| Assessment type | Interactive oral — scenario-based conversation across 4 segments |
| Examiner persona | Hotel General Manager (professionally-focused scenario) |
| Core case | A hotel chain considering digital transformation of its operations |
| Max follow-ups per segment | 3 |
| Candidate commands | repeat, clarify, slow_down, pause, help |
| Scaffolding | Enabled — 2-minute practice conversation before exam starts |
| Transversal skills | critical_thinking, professional_communication, problem_solving |

┌────────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌──────────┐   ┌─────┐
│ SCAFFOLDING│──▶│ SEGMENT_1 │──▶│ SEGMENT_2 │──▶│ SEGMENT_3 │──▶│ SEGMENT_4 │──▶│ CLOSING  │──▶│ END │
│ (practice) │   │ Digital   │   │ IS Roles  │   │ Data      │   │ Change &  │   │          │   │     │
│            │   │ Foundat.  │   │ & BI      │   │ Govern.   │   │ Loyalty   │   │          │   │     │
└────────────┘   └───────────┘   └───────────┘   └───────────┘   └───────────┘   └──────────┘   └─────┘
                   │ ≤3 FU         │ ≤3 FU         │ ≤3 FU         │ ≤3 FU

10.11.3 Compiled Specification — Key Nodes

{
  "irVersion": "exam-runtime-ir/0.1",
  "examId": "infosys110-ioa-2026s1-001",
  "examVersion": 1,
  "metadata": {
    "courseCode": "INFOSYS110",
    "courseName": "Business Information Systems",
    "institution": "University of Auckland",
    "assessmentType": "interactive_oral",
    "durationMinutes": 20,
    "language": "en-NZ",
    "communicationStyleIsLearningOutcome": false,
    "equivalentWrittenWordCount": 3000
  },
  "transversalSkills": [
    {
      "skillId": "critical_thinking",
      "description": "Analyses information, evaluates alternatives, forms reasoned judgements"
    },
    {
      "skillId": "professional_communication",
      "description": "Communicates ideas clearly in a professional context"
    },
    {
      "skillId": "problem_solving",
      "description": "Identifies problems and proposes practical solutions"
    }
  ],
  "scaffolding": {
    "enabled": true,
    "scenario": "A brief practice conversation to familiarise you with the assessment format. This does NOT count toward your score.",
    "conversationPrompt": "Let's do a quick practice. Imagine you're telling a friend about a new app you've been using. What does it do and why do you like it?",
    "maxDurationSeconds": 120,
    "feedbackEnabled": true
  },
  "nodes": [
    {
      "nodeId": "segment_1_digital_foundations",
      "type": "scenario_segment",
      "persona": "You are the General Manager of a mid-range hotel chain in New Zealand. You are meeting with a junior team member to discuss the hotel's digital transformation strategy.",
      "scenario": "The hotel chain is considering upgrading its booking system, implementing a mobile check-in app, and introducing IoT sensors for room management. You want to understand the candidate's grasp of digital foundations and operational trade-offs.",
      "conversationPrompt": "Thanks for coming in. We're looking at some big technology changes for the chain. Can you walk me through what you think are the key digital foundations we need to get right before we invest in new systems?",
      "evidenceSignals": [
        {
          "signalId": "ev-digital-transformation-understanding",
          "description": "Demonstrates understanding of what digital transformation means in a business context",
          "levels": ["basic_awareness", "applied_understanding", "strategic_insight"],
          "weight": 0.25
        },
        {
          "signalId": "ev-operational-trade-offs",
          "description": "Identifies trade-offs in technology adoption (cost vs benefit, disruption vs efficiency)",
          "levels": ["lists_factors", "analyses_trade_offs", "evaluates_with_evidence"],
          "weight": 0.25
        },
        {
          "signalId": "ev-infrastructure-awareness",
          "description": "Recognises infrastructure prerequisites (network, integration, training)",
          "levels": ["mentions_awareness", "explains_dependencies", "proposes_implementation_plan"],
          "weight": 0.25
        },
        {
          "signalId": "ev-customer-impact",
          "description": "Considers how technology changes affect the customer experience",
          "levels": ["acknowledges_impact", "analyses_customer_journey", "proposes_customer_centric_approach"],
          "weight": 0.25
        }
      ],
      "maxFollowUps": 3,
      "timeBudgetSeconds": 300,
      "followUpBank": [
        {
          "followUpId": "s1-fu1",
          "type": "nudge",
          "prompt": "That's a good overview. Can you tell me more about why you think those specific foundations matter — what could go wrong if we skip them?",
          "triggerCondition": "basic_awareness_level AND missing_trade_offs"
        },
        {
          "followUpId": "s1-fu2",
          "type": "probe",
          "prompt": "How would you prioritise these? If we could only do one thing first, what would it be and why?",
          "triggerCondition": "lists_factors BUT no prioritisation"
        },
        {
          "followUpId": "s1-fu3",
          "type": "challenge",
          "prompt": "Some staff might resist these changes. How does that factor into your thinking?",
          "triggerCondition": "missing_change_management_awareness"
        }
      ],
      "transitionConditions": [
        {
          "id": "s1-sufficient",
          "expression": "signal_count >= 3 AND any_signal_level >= 'applied_understanding'"
        },
        {
          "id": "s1-time-exhausted",
          "expression": "time_budget_exceeded"
        },
        {
          "id": "s1-followups-exhausted",
          "expression": "follow_up_count >= 3"
        }
      ],
      "guardrails": {
        "forbidden": ["reveal_rubric", "reveal_score", "suggest_answer", "mention_other_segments"],
        "personaBreakPatterns": ["As your examiner", "In this assessment", "Let me ask you another question"]
      }
    },
    {
      "nodeId": "segment_2_is_roles_bi",
      "type": "scenario_segment",
      "persona": "You are the General Manager continuing the meeting. Now focusing on information systems roles and how business intelligence can support decision-making.",
      "scenario": "After discussing digital foundations, you want to explore how different IS roles (database admin, systems analyst, CIO) contribute to the hotel's success, and how business intelligence dashboards could help manage operations.",
      "conversationPrompt": "Good. Now, we're also thinking about building a business intelligence dashboard for our regional managers. Can you explain what roles in an IS team would be involved in making that happen, and what kind of insights the dashboard should provide?",
      "evidenceSignals": [
        {
          "signalId": "ev-is-roles-knowledge",
          "description": "Identifies and explains key IS roles (DBA, systems analyst, CIO, etc.)",
          "levels": ["names_roles", "explains_responsibilities", "maps_roles_to_outcomes"],
          "weight": 0.3
        },
        {
          "signalId": "ev-bi-understanding",
          "description": "Demonstrates understanding of business intelligence concepts",
          "levels": ["defines_bi", "explains_bi_value", "proposes_bi_use_case"],
          "weight": 0.35
        },
        {
          "signalId": "ev-data-driven-decision",
          "description": "Connects data/information to business decision-making",
          "levels": ["mentions_data", "explains_decision_process", "proposes_metrics_framework"],
          "weight": 0.35
        }
      ],
      "maxFollowUps": 3,
      "timeBudgetSeconds": 300,
      "followUpBank": [
        {
          "followUpId": "s2-fu1",
          "type": "nudge",
          "prompt": "You've mentioned the roles. How would these people work together day-to-day on the dashboard project?",
          "triggerCondition": "names_roles BUT no collaboration_explanation"
        },
        {
          "followUpId": "s2-fu2",
          "type": "probe",
          "prompt": "What specific metrics would a regional manager find most useful on that dashboard?",
          "triggerCondition": "defines_bi BUT no specific_metrics"
        },
        {
          "followUpId": "s2-fu3",
          "type": "challenge",
          "prompt": "What if the data in the dashboard is wrong or outdated? How does that affect decision-making?",
          "triggerCondition": "missing_data_quality_awareness"
        }
      ],
      "transitionConditions": [
        {
          "id": "s2-sufficient",
          "expression": "signal_count >= 2 AND has_signal('ev-bi-understanding') AND any_signal_level >= 'explains_bi_value'"
        }
      ]
    },
    {
      "nodeId": "segment_3_data_governance",
      "type": "scenario_segment",
      "persona": "You are the General Manager. The conversation has turned to the responsibilities that come with collecting and analysing guest data.",
      "scenario": "The hotel collects guest preferences, booking patterns, and feedback data. You need to understand the candidate's awareness of data governance, privacy, and responsible analytics practices.",
      "conversationPrompt": "One more thing — we collect a lot of guest data for the loyalty programme and the BI dashboard. What should we be thinking about in terms of data governance and responsible use of that data?",
      "evidenceSignals": [
        {
          "signalId": "ev-data-governance-awareness",
          "description": "Demonstrates understanding of data governance principles",
          "levels": ["mentions_governance", "explains_frameworks", "proposes_governance_policy"],
          "weight": 0.3
        },
        {
          "signalId": "ev-privacy-ethics",
          "description": "Considers privacy, consent, and ethical use of data",
          "levels": ["mentions_privacy", "explains_consent_model", "proposes_ethical_framework"],
          "weight": 0.35
        },
        {
          "signalId": "ev-responsible-analytics",
          "description": "Understands responsible analytics practices (bias, transparency, accountability)",
          "levels": ["mentions_awareness", "explains_risks", "proposes_mitigation_strategy"],
          "weight": 0.35
        }
      ],
      "maxFollowUps": 3,
      "timeBudgetSeconds": 300,
      "followUpBank": [
        {
          "followUpId": "s3-fu1",
          "type": "nudge",
          "prompt": "That's an important point. Can you tell me more about how we'd actually implement that in practice?",
          "triggerCondition": "mentions_privacy BUT no implementation_detail"
        },
        {
          "followUpId": "s3-fu2",
          "type": "probe",
          "prompt": "What about the ethical side — are there things we could do with the data that we probably shouldn't?",
          "triggerCondition": "missing_ethical_consideration"
        },
        {
          "followUpId": "s3-fu3",
          "type": "challenge",
          "prompt": "If a guest asked to see all the data we hold about them, could we do it? What would that involve?",
          "triggerCondition": "missing_data_subject_rights"
        }
      ]
    },
    {
      "nodeId": "segment_4_change_loyalty",
      "type": "scenario_segment",
      "persona": "You are the General Manager. Final segment covering project change management and the loyalty programme logic.",
      "scenario": "The hotel is rolling out a new loyalty programme alongside the technology changes. You want to assess the candidate's understanding of change management, human factors, and how loyalty programme logic works.",
      "conversationPrompt": "Last topic. We're launching a new loyalty programme alongside all these tech changes. How would you approach the change management side of this, and can you walk me through how you think the loyalty programme logic should work?",
      "evidenceSignals": [
        {
          "signalId": "ev-change-management",
          "description": "Demonstrates understanding of change management principles in IT projects",
          "levels": ["mentions_change", "explains_approach", "proposes_change_plan"],
          "weight": 0.25
        },
        {
          "signalId": "ev-human-factors",
          "description": "Considers human factors (training, resistance, user adoption)",
          "levels": ["mentions_people", "analyses_barriers", "proposes_adoption_strategy"],
          "weight": 0.25
        },
        {
          "signalId": "ev-loyalty-programme-logic",
          "description": "Explains loyalty programme mechanics (points, tiers, rewards, data capture)",
          "levels": ["describes_basics", "explains_logic", "proposes_optimisation"],
          "weight": 0.25
        },
        {
          "signalId": "ev-integration-thinking",
          "description": "Connects loyalty programme to broader digital strategy",
          "levels": ["mentions_connection", "explains_integration", "proposes_synergies"],
          "weight": 0.25
        }
      ],
      "maxFollowUps": 3,
      "timeBudgetSeconds": 300,
      "followUpBank": [
        {
          "followUpId": "s4-fu1",
          "type": "nudge",
          "prompt": "Good thinking. How would you handle staff who are resistant to the new system?",
          "triggerCondition": "mentions_change BUT no resistance_handling"
        },
        {
          "followUpId": "s4-fu2",
          "type": "probe",
          "prompt": "Walk me through how a guest would earn and redeem points in this programme.",
          "triggerCondition": "describes_basics BUT no detailed_logic"
        },
        {
          "followUpId": "s4-fu3",
          "type": "challenge",
          "prompt": "How does the loyalty programme data feed back into the BI dashboard we discussed earlier?",
          "triggerCondition": "missing_integration_with_segment_2"
        }
      ]
    },
    {
      "nodeId": "closing",
      "type": "closing",
      "persona": "You are the General Manager wrapping up the meeting.",
      "conversationPrompt": "That's great, thank you for your time today. You've given me a lot to think about. We'll be in touch with feedback soon.",
      "timeBudgetSeconds": 30,
      "transitions": [{ "target": "end", "condition": "always" }]
    },
    {
      "nodeId": "end",
      "type": "end"
    }
  ],
  "transitions": [
    { "from": "scaffolding", "to": "segment_1_digital_foundations", "condition": "always" },
    { "from": "segment_1_digital_foundations", "to": "segment_2_is_roles_bi", "condition": "node_complete" },
    { "from": "segment_2_is_roles_bi", "to": "segment_3_data_governance", "condition": "node_complete" },
    { "from": "segment_3_data_governance", "to": "segment_4_change_loyalty", "condition": "node_complete" },
    { "from": "segment_4_change_loyalty", "to": "closing", "condition": "node_complete" },
    { "from": "closing", "to": "end", "condition": "always" }
  ]
}
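
In every segment above the evidence signal weights add up to 1.0 (four at 0.25, or 0.3 + 0.35 + 0.35). The excerpt does not state this as a rule, so the check below rests on that assumption, and `check_signal_weights` is a hypothetical helper. It is meant for scenario_segment nodes only, since closing and end nodes carry no signals.

```python
import math


def check_signal_weights(node):
    """Return True if the node's evidence signal weights sum to 1.0."""
    weights = [s["weight"] for s in node.get("evidenceSignals", [])]
    # isclose tolerates floating-point error such as 0.3 + 0.35 + 0.35.
    return math.isclose(sum(weights), 1.0, abs_tol=1e-9)


segment_2 = {
    "nodeId": "segment_2_is_roles_bi",
    "evidenceSignals": [{"weight": 0.3}, {"weight": 0.35}, {"weight": 0.35}],
}
ok = check_signal_weights(segment_2)
```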

10.11.4 Compiled Pipecat NodeConfig (Segment 1)

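# Compiled from specification node: segment_1_digital_foundations
# Assumes the pipecat-flows package; report_observation, emit_node_entered and
# finalize_node are expected to be defined elsewhere in the runtime.
from pipecat_flows import ContextStrategy, ContextStrategyConfig
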
segment_1_config = {
    "name": "segment_1_digital_foundations",

    "role_message": (
        "You are the General Manager of a mid-range hotel chain in New Zealand. "
        "You are meeting with a junior team member to discuss the hotel's digital "
        "transformation strategy. Be professional but approachable. Ask follow-up "
        "questions naturally, as a manager would in a real meeting."
    ),

    "task_messages": [
        {
            "role": "developer",
            "content": (
                "SCENARIO: The hotel chain is considering upgrading its booking system, "
                "implementing a mobile check-in app, and introducing IoT sensors for room "
                "management. You want to understand the candidate's grasp of digital "
                "foundations and operational trade-offs.\n\n"
                "OPENING: Thanks for coming in. We're looking at some big technology "
                "changes for the chain. Can you walk me through what you think are the "
                "key digital foundations we need to get right before we invest in new systems?\n\n"
                "EVIDENCE TO LISTEN FOR:\n"
                "- digital_transformation_understanding: Demonstrates understanding of what "
                "digital transformation means in a business context (levels: basic_awareness, "
                "applied_understanding, strategic_insight)\n"
                "- operational_trade_offs: Identifies trade-offs in technology adoption "
                "(levels: lists_factors, analyses_trade_offs, evaluates_with_evidence)\n"
                "- infrastructure_awareness: Recognises infrastructure prerequisites "
                "(levels: mentions_awareness, explains_dependencies, proposes_implementation_plan)\n"
                "- customer_impact: Considers how technology changes affect the customer "
                "experience (levels: acknowledges_impact, analyses_customer_journey, "
                "proposes_customer_centric_approach)\n\n"
                "TRANSVERSAL SKILLS TO OBSERVE:\n"
                "- critical_thinking: Does the candidate analyse information and form reasoned judgements?\n"
                "- professional_communication: Does the candidate communicate clearly in a professional context?\n"
                "- problem_solving: Does the candidate identify problems and propose practical solutions?\n\n"
                "CONSTRAINTS:\n"
                "- Maximum 3 follow-up questions\n"
                "- Time budget: 300 seconds\n"
                "- NEVER: reveal rubric, reveal score, suggest answer, mention other segments\n"
                "- If the candidate demonstrates basic_awareness, nudge toward applied_understanding\n\n"
                "After every candidate response, call report_observation with your "
                "assessment of the response, any evidence signals you detected, "
                "and what you want to say next."
            )
        }
    ],

    "functions": [report_observation],  # The ONE function

    "pre_actions": [
        {"type": "function", "handler": emit_node_entered},
        {"type": "tts_say", "text": None}  # Opening is in task_messages
    ],

    "post_actions": [
        {"type": "function", "handler": finalize_node}
    ],

    "context_strategy": ContextStrategyConfig(strategy=ContextStrategy.RESET),
}
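
Comparing this config with the specification node suggests how the "EVIDENCE TO LISTEN FOR" section might be generated: each signalId loses its "ev-" prefix and its hyphens become underscores. That naming rule is inferred from the two listings, not documented. The function names below are hypothetical, and the real compiler may also shorten descriptions, as it appears to do for operational_trade_offs. Requires Python 3.9+ for `str.removeprefix`.

```python
def signal_name(signal_id):
    """Turn 'ev-customer-impact' into 'customer_impact' (inferred naming rule)."""
    return signal_id.removeprefix("ev-").replace("-", "_")


def evidence_section(signals):
    """Render the EVIDENCE TO LISTEN FOR block of the task message."""
    lines = ["EVIDENCE TO LISTEN FOR:"]
    for s in signals:
        levels = ", ".join(s["levels"])
        lines.append(
            f"- {signal_name(s['signalId'])}: {s['description']} (levels: {levels})"
        )
    return "\n".join(lines)


section = evidence_section([
    {
        "signalId": "ev-customer-impact",
        "description": "Considers how technology changes affect the customer experience",
        "levels": [
            "acknowledges_impact",
            "analyses_customer_journey",
            "proposes_customer_centric_approach",
        ],
    }
])
```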

10.11.5 Runtime Execution Trace (Segment 1, excerpt)

T+00.000s  exam_state → in_progress
T+00.100s  node_entered: segment_1_digital_foundations
T+00.200s  LLM generates opening via report_observation(spokenText="Thanks for coming in...")
T+00.300s  TTS speaks opening
T+15.000s  Candidate responds: "I think the main thing is getting the network infrastructure
           right, and making sure the booking system can talk to the other systems..."
T+15.100s  STT transcript segment → Runtime Controller
T+16.000s  LLM calls report_observation({
             signals: [
               {signalType: "digital_transformation_understanding", rubricLevel: "basic_awareness",
                excerpt: "getting the network infrastructure right", confidence: 0.7},
               {signalType: "infrastructure_awareness", rubricLevel: "mentions_awareness",
                excerpt: "booking system can talk to the other systems", confidence: 0.8}
             ],
             answerQuality: "substantive",
             needsFollowUp: true, followUpType: "nudge",
             evidenceSufficient: false,
             anxietyDetected: false,
             spokenText: "That's a good starting point. Can you tell me more about why
                         those specific foundations matter — what could go wrong if we skip them?"
           })
T+16.100s  Runtime Controller:
           - Validates spokenText → PASS (no rubric leak, in persona, on topic)
           - Writes 2 evidence signals to ledger
           - Checks guardrails: followUpCount=0 < maxFollowUps=3 → ALLOW follow-up
           - followUpCount → 1
           - Returns {status: "followup_injected"}
T+16.200s  TTS speaks follow-up
T+35.000s  Candidate responds: "If we skip the infrastructure, the systems won't integrate
           properly. And from the customer side, if the app is slow or crashes during
           check-in, that's a terrible first impression..."
T+36.000s  LLM calls report_observation({
             signals: [
               {signalType: "operational_trade_offs", rubricLevel: "analyses_trade_offs",
                excerpt: "systems won't integrate properly", confidence: 0.85},
               {signalType: "customer_impact", rubricLevel: "analyses_customer_journey",
                excerpt: "app is slow or crashes during check-in", confidence: 0.9}
             ],
             answerQuality: "substantive",
             needsFollowUp: false,
             evidenceSufficient: true,
             anxietyDetected: false,
             spokenText: "Excellent point about the customer experience. That's exactly
                         the kind of thinking we need. Let's move on to the next topic."
           })
T+36.100s  Runtime Controller:
           - Validates spokenText → PASS
           - Writes 2 more evidence signals (total: 4)
           - Checks: evidenceSufficient=true AND signalTypes cover 4/4 targets → TRANSITION
           - Builds segment_2 config → flow_manager.set_node_from_config(segment_2_config)
T+36.200s  node_exit: segment_1_digital_foundations (reason: evidence_sufficient)
T+36.300s  node_entered: segment_2_is_roles_bi
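
The two Runtime Controller steps in this trace follow a simple decision rule, sketched below. This is an interpretation of the trace, not the actual controller. `decide_next_step` and the "transition" status are hypothetical names, spokenText validation is omitted, and the final fallback (leaving the node when no follow-up is possible) is an assumption.

```python
def decide_next_step(observation, covered, targets, follow_up_count, max_follow_ups):
    """Return (status, new_follow_up_count) for one report_observation call.

    `covered` is the set of signal types already in the ledger; it is
    updated in place with the signals from this observation.
    """
    covered.update(s["signalType"] for s in observation["signals"])
    if observation["evidenceSufficient"] and covered >= set(targets):
        return "transition", follow_up_count
    if observation["needsFollowUp"] and follow_up_count < max_follow_ups:
        return "followup_injected", follow_up_count + 1
    # Assumed fallback: no follow-up possible or requested, so leave the node.
    return "transition", follow_up_count


targets = [
    "digital_transformation_understanding",
    "operational_trade_offs",
    "infrastructure_awareness",
    "customer_impact",
]
covered = set()
first = {
    "signals": [
        {"signalType": "digital_transformation_understanding"},
        {"signalType": "infrastructure_awareness"},
    ],
    "needsFollowUp": True,
    "evidenceSufficient": False,
}
status, count = decide_next_step(first, covered, targets, 0, 3)
```

Replaying the first observation from the trace gives "followup_injected" with the counter at 1. The second observation then covers all four targets and triggers the transition.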

10.11.6 Evidence Ledger Output (Segment 1)

{
  "nodeId": "segment_1_digital_foundations",
  "completionStatus": "completed",
  "durationSeconds": 36,
  "followUpsUsed": 1,
  "signals": [
    {
      "signalId": "sig-001",
      "signalType": "digital_transformation_understanding",
      "rubricLevel": "basic_awareness",
      "excerpt": "getting the network infrastructure right",
      "confidence": 0.7,
      "turnId": "turn-002",
      "timestamp": "2026-05-06T02:10:15Z",
      "transversalSkills": ["critical_thinking"]
    },
    {
      "signalId": "sig-002",
      "signalType": "infrastructure_awareness",
      "rubricLevel": "mentions_awareness",
      "excerpt": "booking system can talk to the other systems",
      "confidence": 0.8,
      "turnId": "turn-002",
      "timestamp": "2026-05-06T02:10:15Z",
      "transversalSkills": ["problem_solving"]
    },
    {
      "signalId": "sig-003",
      "signalType": "operational_trade_offs",
      "rubricLevel": "analyses_trade_offs",
      "excerpt": "systems won't integrate properly",
      "confidence": 0.85,
      "turnId": "turn-004",
      "timestamp": "2026-05-06T02:10:35Z",
      "transversalSkills": ["critical_thinking"]
    },
    {
      "signalId": "sig-004",
      "signalType": "customer_impact",
      "rubricLevel": "analyses_customer_journey",
      "excerpt": "app is slow or crashes during check-in",
      "confidence": 0.9,
      "turnId": "turn-004",
      "timestamp": "2026-05-06T02:10:35Z",
      "transversalSkills": ["critical_thinking", "problem_solving"]
    }
  ]
}
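
Since each signal carries a transversalSkills list, per-skill tallies can be read straight off the ledger. The sketch below is a hypothetical helper (`skill_counts`), not a documented marking step.

```python
from collections import Counter


def skill_counts(ledger):
    """Count how many signals were tagged with each transversal skill."""
    counts = Counter()
    for signal in ledger["signals"]:
        counts.update(signal["transversalSkills"])
    return dict(counts)


# Only the field used by the helper is reproduced from the ledger above.
ledger = {
    "signals": [
        {"signalId": "sig-001", "transversalSkills": ["critical_thinking"]},
        {"signalId": "sig-002", "transversalSkills": ["problem_solving"]},
        {"signalId": "sig-003", "transversalSkills": ["critical_thinking"]},
        {"signalId": "sig-004", "transversalSkills": ["critical_thinking", "problem_solving"]},
    ]
}
counts = skill_counts(ledger)
```

For Segment 1 this gives three signals tagged critical_thinking and two tagged problem_solving. No signal in this segment is tagged professional_communication.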

10.11.7 Key Differences from CS301 Example

| Dimension | CS301 (§10.1–10.10) | INFOSYS110 (§10.11) |
| --- | --- | --- |
| Format | 2 questions with follow-ups | 4 scenario segments in one conversation |
| Persona | Neutral examiner | Hotel General Manager (role-play) |
| Scenario | No scenario | Full professional scenario with context |
| Rubric levels | Binary (present/absent) | Multi-level (basic → applied → strategic) |
| Transversal skills | Not tracked | Tracked across all segments |
| Scaffolding | Not included | 2-minute practice conversation |
| Follow-up strategy | Probe/redirect | Probe/redirect/scaffold/challenge/nudge |
| Conversation style | Q&A | Free-flowing professional dialogue |
| Equity | Not addressed | communicationStyleIsLearningOutcome: false |
| Context strategy | Not specified | RESET between segments |
| Time budget | Per question | Per segment (5 min each) |

T-120.000s  exam_state → scaffolding
T-120.100s  node_entered: scaffolding
T-120.200s  LLM: "Let's do a quick practice. Imagine you're telling a friend about
            a new app you've been using. What does it do and why do you like it?"
T-100.000s  Candidate: "Well, I use this app called Notion for organising my notes..."
T-080.000s  LLM: "Great, that's perfect. You're speaking clearly and giving good
            detail. In the real assessment, just keep doing what you're doing.
            We'll start now."
T-080.100s  exam_state → in_progress
T-080.200s  node_entered: segment_1_digital_foundations

Note: Scaffolding transcript is NOT included in the MarkingPackage. It exists only for candidate familiarisation and QA purposes.

| raise_hand pauses timer | T+120.400s pause, T+130.400s resume |


Joughin (1998) identifies six dimensions of oral assessment. The following table maps the worked examples against these dimensions to demonstrate that the specification can represent the full oral assessment design space.

| Dimension | Range | CS301 (§10.1) | INFOSYS110 (§10.11) | Viva Voce (§10.13) | OSCE (§10.14) | ConVOE (§10.15) |
| --- | --- | --- | --- | --- | --- | --- |
| 1. Content Type | Knowledge; Applied Problem Solving; Interpersonal; Intrapersonal | Knowledge | Knowledge + Applied | Applied + Knowledge | Interpersonal + Applied | Knowledge |
| 2. Interaction | Presentation ↔ Dialogue | Structured Q&A | Scenario dialogue | Defence dialogue | Station-based dialogue | Presentation (recorded) |
| 3. Authenticity | Contextualised ↔ Decontextualised | Decontextualised | Semi-contextualised | Semi-contextualised | Authentic (clinical) | Decontextualised |
| 4. Structure | Closed ↔ Open | Closed | Moderately closed | Moderately open | Closed (timed stations) | Fully closed |
| 5. Examiners | Self; Peer; Authority | Single authority | Single authority (role-play) | Single authority | Authority + SP | AI examiner (automated) |
| 6. Orality | Purely oral ↔ Secondary | Purely oral | Purely oral | Oral secondary (defends written work) | Oral + physical demo | Purely oral (recorded) |

This coverage map demonstrates that the specification’s node-graph architecture, policy system, and evidence model can represent assessments spanning all six of Joughin’s dimensions.


10.13 Viva Voce Example — Oral Defence of Written Work

Modality: Oral defence of a prior written submission.

Joughin dimension coverage: “Orality as secondary” — the oral component supplements a written artifact. This is the modality Akimov & Malin (2020) implemented: students defended a written bond analysis project orally. Also exercises the “applied problem solving” content type and “moderately open” structure.

| Dimension | Value |
|---|---|
| Course | RES501 — Research Methods (Postgraduate) |
| Duration | 20 minutes |
| Language | English |
| Assessment type | Viva voce — oral defence of a written research proposal |
| Prior work | Candidate submits a 3,000-word research proposal 1 week before the exam |
| Examiner persona | Academic supervisor — supportive but rigorous |
| Max follow-ups per section | 3 |
| Orality role | Secondary — oral component supplements the written proposal |

┌──────────┐   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐   ┌──────────┐   ┌─────┐
│ OPENING  │──▶│ METHODOLOGY   │──▶│ LIT REVIEW    │──▶│ FEASIBILITY   │──▶│ CLOSING  │──▶│ END │
│          │   │ DEFENCE       │   │ DEFENCE       │   │ & ETHICS      │   │          │   │     │
└──────────┘   └───────────────┘   └───────────────┘   └───────────────┘   └──────────┘   └─────┘
                 │ ≤3 FU              │ ≤3 FU              │ ≤3 FU
                 │ references         │ references         │ references
                 │ prior_work         │ prior_work         │ prior_work
{
  "irVersion": "exam-runtime-ir/0.1",
  "examId": "res501-viva-2026s1-001",
  "metadata": {
    "courseCode": "RES501",
    "assessmentType": "viva_voce",
    "durationMinutes": 20,
    "oralityRole": "secondary",
    "priorWorkRequired": true,
    "assessmentProfile": {
      "interactionMode": "structured_dialogue",
      "contentTypes": ["applied_problem_solving", "knowledge_understanding"],
      "structureLevel": "semi-structured",
      "authenticityLevel": "simulated",
      "assessmentPurpose": "summative"
    }
  },
  "priorWork": {
    "artifactId": "research-proposal-2026",
    "type": "written_paper",
    "title": "Research Proposal: Impact of AI on Assessment Design",
    "submissionDeadline": "2026-05-20T23:59:00Z",
    "maxWordCount": 3000,
    "availableToExaminer": true
  },
  "nodes": [
    {
      "nodeId": "methodology_defence",
      "type": "question",
      "questionStem": "Your proposal uses a mixed-methods design. Can you walk me through why you chose this approach over a purely quantitative or purely qualitative study?",
      "maxFollowUps": 3,
      "timeBudgetSeconds": 420,
      "evidenceTargets": [
        {
          "id": "ev-methodology-rationale",
          "description": "Articulates a clear rationale for mixed-methods design",
          "rubric": "References research questions, discusses complementarity of methods",
          "level": "required"
        },
        {
          "id": "ev-methodology-alternatives",
          "description": "Demonstrates awareness of alternative methodological approaches",
          "rubric": "Names at least one alternative and explains why it was not chosen",
          "level": "expected"
        },
        {
          "id": "ev-methodology-limitations",
          "description": "Acknowledges limitations of chosen approach",
          "rubric": "Identifies at least one limitation and discusses mitigation",
          "level": "expected"
        }
      ],
      "followUpBank": [
        {
          "followUpId": "meth-fu1",
          "type": "probe",
          "prompt": "In your proposal, you mention using thematic analysis for the qualitative data. Can you explain why thematic analysis rather than, say, grounded theory?",
          "referencesPriorWork": true
        },
        {
          "followUpId": "meth-fu2",
          "type": "challenge",
          "prompt": "A reviewer might argue that your sample size of 15 interviews is too small for meaningful qualitative analysis. How would you respond?",
          "referencesPriorWork": true
        }
      ],
      "guardrails": {
        "forbidden": ["reveal_rubric", "suggest_answer"],
        "mustReferencePriorWork": true
      }
    }
    // ... additional nodes for lit review defence, feasibility & ethics ...
  ]
}
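
The IR above sets `mustReferencePriorWork` on the node and `referencesPriorWork` on each follow-up, but it does not say how the two relate. The sketch below assumes one plausible reading: when the guardrail is on, every entry in `followUpBank` should set `referencesPriorWork` to true. The function name `follow_ups_missing_prior_work` is hypothetical and not part of the specification.

```python
from typing import Any


def follow_ups_missing_prior_work(node: dict[str, Any]) -> list[str]:
    """Return followUpIds that break the node's mustReferencePriorWork guardrail.

    Assumption: the guardrail applies to every follow-up in the bank.
    """
    guardrails = node.get("guardrails", {})
    if not guardrails.get("mustReferencePriorWork", False):
        return []  # guardrail not active, nothing to check
    return [
        follow_up["followUpId"]
        for follow_up in node.get("followUpBank", [])
        if not follow_up.get("referencesPriorWork", False)
    ]


# Trimmed copy of the methodology_defence node from the IR above.
methodology_defence = {
    "nodeId": "methodology_defence",
    "followUpBank": [
        {"followUpId": "meth-fu1", "type": "probe", "referencesPriorWork": True},
        {"followUpId": "meth-fu2", "type": "challenge", "referencesPriorWork": True},
    ],
    "guardrails": {
        "forbidden": ["reveal_rubric", "suggest_answer"],
        "mustReferencePriorWork": True,
    },
}

print(follow_ups_missing_prior_work(methodology_defence))  # []
```

Both follow-ups in the example reference the proposal, so the check returns an empty list. A follow-up without the flag would be reported by its `followUpId`.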

10.13.4 Key Differences from CS301/INFOSYS110

| Feature | CS301 / INFOSYS110 | Viva Voce |
|---|---|---|
| Prior work | None | Written proposal submitted before exam |
| Orality role | Purely oral | Oral secondary (defends written work) |
| Follow-up references | Based on candidate’s spoken answer | References specific sections of the written proposal |
| Evidence targets | Assessed from speech alone | Assessed from speech + written work alignment |
| Structure | Closed / moderately closed | Moderately open (examiner probes reasoning behind choices) |
| Content type | Knowledge + applied | Applied problem solving (research design justification) |

The evidence ledger for a viva voce includes an additional priorWorkReference field linking evidence signals to specific sections of the written submission:

{
  "evidenceTargetId": "ev-methodology-rationale",
  "nodeId": "methodology_defence",
  "signal": "covered",
  "confidence": 0.88,
  "transcriptSpanIds": ["sp-015"],
  "priorWorkReference": {
    "section": "3.2 Research Design",
    "excerpt": "A mixed-methods approach is adopted to triangulate findings...",
    "alignment": "candidate's oral explanation consistent with written rationale"
  },
  "rationale": "Candidate articulated rationale that aligns with §3.2 of their proposal."
}
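
As a rough illustration of how a consumer of the evidence ledger might check this field, the sketch below reports missing parts of `priorWorkReference`. It assumes that all three keys shown in the example (`section`, `excerpt`, `alignment`) are required for viva voce entries; the specification text here does not state which are mandatory, and the helper name is hypothetical.

```python
from typing import Any

# Keys copied from the priorWorkReference object in the example above.
# Treating all three as required is an assumption of this sketch.
PRIOR_WORK_REFERENCE_KEYS = ("section", "excerpt", "alignment")


def prior_work_reference_problems(entry: dict[str, Any]) -> list[str]:
    """List problems with the priorWorkReference of one evidence ledger entry."""
    reference = entry.get("priorWorkReference")
    if reference is None:
        return ["priorWorkReference is missing"]
    return [
        f"priorWorkReference.{key} is missing or empty"
        for key in PRIOR_WORK_REFERENCE_KEYS
        if not reference.get(key)
    ]


# The ledger entry from the example above.
entry = {
    "evidenceTargetId": "ev-methodology-rationale",
    "nodeId": "methodology_defence",
    "signal": "covered",
    "confidence": 0.88,
    "transcriptSpanIds": ["sp-015"],
    "priorWorkReference": {
        "section": "3.2 Research Design",
        "excerpt": "A mixed-methods approach is adopted to triangulate findings...",
        "alignment": "candidate's oral explanation consistent with written rationale",
    },
    "rationale": "Candidate articulated rationale that aligns with §3.2 of their proposal.",
}

print(prior_work_reference_problems(entry))  # []
```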

10.14 OSCE Station Example — Clinical Assessment

Modality: Objective Structured Clinical Examination (OSCE) station.

Joughin dimension coverage: “Authenticity” at the authentic pole, the “interpersonal competence” content type, and “orality as secondary” (oral + physical demonstration). The versioning document (§09) references OSCE packages; this example provides the full worked instantiation.

| Dimension | Value |
|---|---|
| Course | MED302 — Clinical Skills (Year 3 Medicine) |
| Duration | 8 minutes per station |
| Language | English |
| Assessment type | OSCE station — patient history-taking + clinical reasoning |
| Examiner persona | Standardised Patient (SP) playing a 45-year-old with chest pain |
| Max follow-ups | 2 (time-constrained station) |
| Orality role | Secondary — oral interaction + physical examination demonstration |
| Professional body | Mapped to AMC (Australian Medical Council) clinical competencies |

┌───────────┐   ┌───────────────────┐   ┌───────────────────┐   ┌──────────┐   ┌─────┐
│ STATION   │──▶│ HISTORY-TAKING    │──▶│ CLINICAL          │──▶│ CLOSING  │──▶│ END │
│ BRIEFING  │   │ (5 min)           │   │ REASONING (3 min) │   │          │   │     │
└───────────┘   └───────────────────┘   └───────────────────┘   └──────────┘   └─────┘
                   │ ≤2 FU                │ ≤2 FU
{
  "irVersion": "exam-runtime-ir/0.1",
  "examId": "med302-osce-station2-2026s1-001",
  "metadata": {
    "courseCode": "MED302",
    "assessmentType": "osce_station",
    "durationMinutes": 8,
    "stationNumber": 2,
    "clinicalDomain": "cardiology",
    "accreditationMapping": ["AMC-12.1", "AMC-12.3", "AMC-14.2"],
    "assessmentProfile": {
      "interactionMode": "structured_dialogue",
      "contentTypes": ["interpersonal_competence", "applied_problem_solving"],
      "structureLevel": "closed",
      "authenticityLevel": "authentic",
      "assessmentPurpose": "summative"
    }
  },
  "standardisedPatient": {
    "persona": "You are a 45-year-old office worker presenting to the emergency department with chest pain that started 2 hours ago. Describe your pain as dull, central, radiating to your left arm. You are anxious but cooperative. Answer questions accurately but do not volunteer information unless asked.",
    "trainingLevel": "certified_SP",
    "consistencyScript": true
  },
  "nodes": [
    {
      "nodeId": "history_taking",
      "type": "scenario_segment",
      "persona": "[Standardised Patient persona from above]",
      "conversationPrompt": "Good morning. Can you tell me what's brought you in today?",
      "maxFollowUps": 2,
      "timeBudgetSeconds": 300,
      "evidenceTargets": [
        {
          "id": "ev-history-chief-complaint",
          "description": "Elicits the chief complaint and characterises the pain (OPQRST)",
          "rubric": "Asks about onset, provocation, quality, radiation, severity, timing",
          "level": "required",
          "competencyMapping": "AMC-12.1"
        },
        {
          "id": "ev-history-risk-factors",
          "description": "Assesses cardiovascular risk factors",
          "rubric": "Asks about smoking, family history, hypertension, diabetes, cholesterol",
          "level": "required",
          "competencyMapping": "AMC-12.1"
        },
        {
          "id": "ev-history-differential",
          "description": "Considers differential diagnoses during history-taking",
          "rubric": "Asks questions that help distinguish cardiac from non-cardiac causes",
          "level": "expected",
          "competencyMapping": "AMC-12.3"
        },
        {
          "id": "ev-communication-compassion",
          "description": "Demonstrates compassionate, patient-centred communication",
          "rubric": "Uses open questions, active listening, acknowledges patient concerns, explains next steps",
          "level": "required",
          "competencyMapping": "AMC-14.2",
          "evidenceDimension": "interpersonal_competence"
        }
      ],
      "followUpBank": [
        {
          "followUpId": "hist-fu1",
          "type": "probe",
          "prompt": "You haven't asked about my family history yet. Is there anything else you'd like to know?",
          "triggerCondition": "missing_risk_factors"
        }
      ]
    },
    {
      "nodeId": "clinical_reasoning",
      "type": "question",
      "questionStem": "Based on the history you've taken, what are your top three differential diagnoses and how would you prioritise your investigations?",
      "maxFollowUps": 2,
      "timeBudgetSeconds": 180,
      "evidenceTargets": [
        {
          "id": "ev-differential-diagnosis",
          "description": "Generates appropriate differential diagnoses",
          "rubric": "Includes acute coronary syndrome, considers PE, aortic dissection, musculoskeletal",
          "level": "required",
          "competencyMapping": "AMC-12.3"
        },
        {
          "id": "ev-investigation-plan",
          "description": "Proposes a rational investigation plan",
          "rubric": "ECG, troponin, CXR as first-line; considers risk stratification",
          "level": "required",
          "competencyMapping": "AMC-12.3"
        }
      ]
    }
  ]
}
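
Two properties of the station IR above can be checked mechanically: the node time budgets (300 s + 180 s) add up to the 8-minute station, and every `competencyMapping` code also appears in `metadata.accreditationMapping`. The sketch below shows one way to test both. Treating these as validation rules is an assumption of this example, and `station_problems` is a hypothetical helper, not part of the specification.

```python
from typing import Any


def station_problems(ir: dict[str, Any]) -> list[str]:
    """Check a station IR for time-budget overruns and undeclared competency codes."""
    problems: list[str] = []
    metadata = ir["metadata"]

    # Node time budgets should fit inside the station duration.
    budget = sum(node.get("timeBudgetSeconds", 0) for node in ir["nodes"])
    limit = metadata["durationMinutes"] * 60
    if budget > limit:
        problems.append(f"time budgets total {budget}s, station allows {limit}s")

    # Every competencyMapping should be declared in accreditationMapping.
    declared = set(metadata.get("accreditationMapping", []))
    for node in ir["nodes"]:
        for target in node.get("evidenceTargets", []):
            code = target.get("competencyMapping")
            if code is not None and code not in declared:
                problems.append(f"{target['id']}: {code} not in accreditationMapping")
    return problems


# Trimmed copy of the MED302 station IR above (only the fields being checked).
station = {
    "metadata": {
        "durationMinutes": 8,
        "accreditationMapping": ["AMC-12.1", "AMC-12.3", "AMC-14.2"],
    },
    "nodes": [
        {
            "nodeId": "history_taking",
            "timeBudgetSeconds": 300,
            "evidenceTargets": [
                {"id": "ev-history-chief-complaint", "competencyMapping": "AMC-12.1"},
                {"id": "ev-history-risk-factors", "competencyMapping": "AMC-12.1"},
                {"id": "ev-history-differential", "competencyMapping": "AMC-12.3"},
                {"id": "ev-communication-compassion", "competencyMapping": "AMC-14.2"},
            ],
        },
        {
            "nodeId": "clinical_reasoning",
            "timeBudgetSeconds": 180,
            "evidenceTargets": [
                {"id": "ev-differential-diagnosis", "competencyMapping": "AMC-12.3"},
                {"id": "ev-investigation-plan", "competencyMapping": "AMC-12.3"},
            ],
        },
    ],
}

print(station_problems(station))  # []
```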

10.14.4 Key Differences from Other Examples

| Feature | CS301 / INFOSYS110 | OSCE Station |
|---|---|---|
| Authenticity | Decontextualised / simulated | Authentic (clinical setting) |
| Content type | Knowledge / applied | Interpersonal competence + applied |
| Persona | Examiner / manager | Standardised Patient (trained actor) |
| Time constraint | Flexible (10–20 min) | Strict station time (8 min) |
| Competency mapping | Learning outcomes | Professional body accreditation standards |
| Communication as evidence | Optional transversal skill | Required evidence target (AMC-14.2) |
| Orality | Purely oral | Oral + physical demonstration |

10.15 ConVOE Example — Concurrent Video-Based Oral Exam

Modality: Concurrent Video-Based Oral Exam (ConVOE), as described by Bayley et al. (2024). All students simultaneously record video responses to questions via an LMS. This is the “presentation” pole of Joughin’s interaction dimension — one-way delivery with no real-time dialogue.

Key difference: The specification’s real-time dialogue model is adapted to support a “recorded response” mode where the candidate records answers without live examiner interaction.

| Dimension | Value |
|---|---|
| Course | BUS201 — Business Analytics (Year 2, 600+ students) |
| Duration | 20 minutes (4 questions × 5 min max each, no backtracking) |
| Language | English |
| Assessment type | ConVOE — recorded video responses, no live dialogue |
| Interaction mode | Presentation (one-way, no follow-ups) |
| Platform | LMS with video recording integration |
| Cohort size | 620 students, concurrent administration |
| Grading | Parallel evaluation (all students graded on Q1 before Q2) |

┌────────────┐   ┌──────┐   ┌──────┐   ┌──────┐   ┌──────┐   ┌──────────┐   ┌─────┐
│ BRIEFING   │──▶│  Q1  │──▶│  Q2  │──▶│  Q3  │──▶│  Q4  │──▶│ CLOSING  │──▶│ END │
│ + PRACTICE │   │ (5m) │   │ (5m) │   │ (5m) │   │ (5m) │   │          │   │     │
└────────────┘   └──────┘   └──────┘   └──────┘   └──────┘   └──────────┘   └─────┘
                  no FU      no FU      no FU      no FU
                 (no backtracking on any question)
{
  "irVersion": "exam-runtime-ir/0.1",
  "examId": "bus201-convoe-2026s1-001",
  "metadata": {
    "courseCode": "BUS201",
    "assessmentType": "convoe",
    "durationMinutes": 20,
    "language": "en-CA",
    "expectedCandidateCount": 620,
    "concurrentAdministration": true,
    "assessmentProfile": {
      "interactionMode": "presentation",
      "contentTypes": ["knowledge_understanding", "applied_problem_solving"],
      "structureLevel": "closed",
      "authenticityLevel": "decontextualised",
      "assessmentPurpose": "summative"
    }
  },
  "administration": {
    "mode": "recorded_response",
    "backtrackingAllowed": false,
    "recordingFormat": "video",
    "maxResponseTimeSec": 300,
    "thinkingTimeSec": 30,
    "practiceQuestionEnabled": true,
    "questionRotation": {
      "enabled": true,
      "poolSizePerSlot": 5,
      "antiCollusionWindow": "same_day"
    }
  },
  "cohort": {
    "cohortId": "bus201-2026s1-cohort",
    "administrationWindow": {
      "startAt": "2026-06-01T09:00:00Z",
      "endAt": "2026-06-01T11:00:00Z"
    },
    "concurrent": true,
    "gradingStrategy": "parallel_evaluation"
  },
  "nodes": [
    {
      "nodeId": "briefing",
      "type": "opening",
      "prompt": "Welcome to your BUS201 oral assessment. You will answer 4 questions. For each question, you have up to 5 minutes to record your video response. You cannot go back to previous questions. A practice question is available before you begin."
    },
    {
      "nodeId": "q1",
      "type": "question",
      "questionStem": "Explain the difference between supervised and unsupervised machine learning. Give a business example of each.",
      "maxFollowUps": 0,
      "timeBudgetSeconds": 300,
      "recordingRequired": true,
      "evidenceTargets": [
        {
          "id": "ev-ml-types",
          "description": "Distinguishes supervised from unsupervised learning",
          "level": "required"
        },
        {
          "id": "ev-ml-examples",
          "description": "Provides valid business examples for both types",
          "level": "required"
        }
      ],
      "questionPool": {
        "poolId": "q1-pool",
        "variants": [
          { "variantId": "q1-v1", "prompt": "Explain the difference between supervised and unsupervised machine learning. Give a business example of each." },
          { "variantId": "q1-v2", "prompt": "Compare classification and clustering algorithms. When would a business use each approach?" },
          { "variantId": "q1-v3", "prompt": "What is the role of labelled data in machine learning? Provide business scenarios where labelled data is available versus unavailable." }
        ],
        "drawCount": 1
      }
    }
    // ... additional question nodes with pools ...
  ]
}
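
The IR above declares a `questionPool` with `variants` and a `drawCount`, but it does not say how a variant is chosen for a given candidate. The sketch below shows one possible approach: a draw seeded from the exam, pool, and candidate identifiers, so the same candidate always receives the same variant and the draw can be replayed for audit. The seeding scheme, the function name `draw_variants`, and the candidate id `cand-0001` are assumptions made for illustration. The `antiCollusionWindow` setting is not modelled.

```python
import hashlib
import random


def draw_variants(pool: dict, candidate_id: str, exam_id: str) -> list[str]:
    """Pick drawCount variant ids for one candidate, reproducibly."""
    # Derive a per-candidate, per-pool seed (hypothetical scheme).
    key = f"{exam_id}:{pool['poolId']}:{candidate_id}".encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    rng = random.Random(seed)
    variant_ids = [variant["variantId"] for variant in pool["variants"]]
    # Sample without replacement so a candidate never gets a variant twice.
    return rng.sample(variant_ids, k=pool["drawCount"])


# Trimmed copy of the q1 pool from the IR above (prompts omitted).
q1_pool = {
    "poolId": "q1-pool",
    "variants": [
        {"variantId": "q1-v1"},
        {"variantId": "q1-v2"},
        {"variantId": "q1-v3"},
    ],
    "drawCount": 1,
}

drawn = draw_variants(q1_pool, "cand-0001", "bus201-convoe-2026s1-001")
print(drawn)  # one of the three variant ids, stable across runs
```

Seeding from identifiers rather than from the clock keeps the assignment stable if a session has to be restarted. That is a design choice of this sketch, not a requirement stated in the specification.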

10.15.4 Key Differences from Dialogue-Based Examples

| Feature | CS301 / INFOSYS110 (Dialogue) | ConVOE (Presentation) |
|---|---|---|
| Interaction mode | Dialogue (multi-turn) | Presentation (one-way recording) |
| Follow-ups | Allowed (1–3 per node) | None (maxFollowUps: 0) |
| Candidate commands | repeat, clarify, pause | None (no live examiner) |
| Scalability | 1 candidate per session | 620 candidates concurrent |
| Question pools | Fixed questions | Randomised from pool (anti-collusion) |
| Grading | Sequential per session | Parallel evaluation (by question) |
| Backtracking | Not applicable | Explicitly forbidden |
| Practice session | Optional scaffolding | Built-in practice question |
| Reliability concern | Inter-case (follow-up variance) | Inter-case (question difficulty equivalence) |
| Academic integrity | Conversation fingerprint | Question rotation + time limit + video recording |

The ConVOE format exercises the specification’s scalability features:

  • Question pools (questionPool): Each question slot draws from a pool of equivalent variants, mitigating question-sharing (Bayley et al., 2024, p. 165: “students posted ConVOE questions to an online group chat”).
  • Cohort management: The cohort entity groups 620 concurrent sessions and enables batch grading.
  • Parallel evaluation: Setting gradingStrategy to "parallel_evaluation" has graders assess all candidates on Q1 before moving to Q2, which keeps marking consistent across the cohort (Bayley et al., 2024, p. 163).
  • No follow-ups: The presentation interaction mode removes follow-up variance as a source of inter-case unreliability. Equivalence of question difficulty across pool variants remains the reliability concern for this format.
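
The grading order implied by `parallel_evaluation` can be sketched as a simple queue of (question, candidate) tasks. The example below contrasts it with a per-session order. The strategy name `"sequential"`, the function `grading_queue`, and the candidate ids are hypothetical; only `"parallel_evaluation"` appears in the IR above.

```python
def grading_queue(
    question_ids: list[str],
    candidate_ids: list[str],
    strategy: str = "parallel_evaluation",
) -> list[tuple[str, str]]:
    """Return (question_id, candidate_id) grading tasks in the order they are marked."""
    if strategy == "parallel_evaluation":
        # Finish one question for the whole cohort before starting the next.
        return [(q, c) for q in question_ids for c in candidate_ids]
    if strategy == "sequential":  # hypothetical name for per-session grading
        # Finish one candidate's whole session before starting the next.
        return [(q, c) for c in candidate_ids for q in question_ids]
    raise ValueError(f"unknown grading strategy: {strategy}")


queue = grading_queue(["q1", "q2"], ["cand-001", "cand-002"])
print(queue)
# [('q1', 'cand-001'), ('q1', 'cand-002'), ('q2', 'cand-001'), ('q2', 'cand-002')]
```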

| Version | Date | Changes |
|---|---|---|
| v0.2.0 | 2026-06-30 | Updated examples to reflect IOA-ORM terminology and new schema fields. |
| v0.1.0 | 2026-05-06 | Initial release. |