Worked Examples
Status
Draft · v0.2.0 · 2026-06-30
This chapter presents a complete, minimal but realistic example that exercises every major feature of the IOA-ORM. It is designed to be self-contained: a reader who has only skimmed the preceding chapters should be able to follow it end-to-end.
10.1 Scenario
| Dimension | Value |
|---|---|
| Course | CS301 — Operating Systems (Year 3 undergraduate) |
| Duration | 10 minutes |
| Language | English |
| Assessment type | Interactive oral — 2 structured questions with follow-ups |
| Examiner persona | Supportive, encouraging; asks for clarification when answers are vague |
| Max follow-ups per question | 2 |
| Candidate commands available | repeat, clarification, raise_hand |
Flow Shape
┌─────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐ ┌─────┐
│ OPENING │──▶│ QUESTION_1 │──▶│ QUESTION_2 │──▶│ CLOSING │──▶│ END │
└─────────┘ └──────────────┘ └──────────────┘ └─────────┘ └─────┘
│ (≤2 follow-ups) │ (≤2 follow-ups)
└───────────────────┘
Learning Outcomes Assessed
- LO-1: Explain the role of process scheduling in an OS (Knowledge)
- LO-2: Compare scheduling algorithms and justify a choice (Analysis / Evaluation)
10.2 Authoring-Level Description
Lecturer’s view: what the examiner writes in the Assessment Studio.
Opening (60 s)
“Welcome to your oral assessment for CS301. I’ll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let’s begin.”
Question 1 — Process Scheduling (target 3 min, max 4 min)
Stem: “Can you explain what process scheduling means in the context of an operating system, and why it matters?”
Rubric signal: LO-1 — explanation of scheduling concept, mention of preemptive vs cooperative, context switch cost.
Follow-up 1 (if answer is vague or misses preemptive/cooperative): “You mentioned scheduling. Can you elaborate on the difference between preemptive and cooperative scheduling?”
Follow-up 2 (if still incomplete on context switch): “How does context switching fit into this picture?”
Transition condition: the candidate has addressed the scheduling concept and at least one of {preemptive/cooperative, context switch}, OR max follow-ups are exhausted, OR the time budget is exceeded.
Question 2 — Scheduling Algorithm Comparison (target 4 min, max 5 min)
Stem: “Suppose you’re designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?”
Rubric signal: LO-2 — comparison of algorithms, awareness of starvation risk, response-time argument.
Follow-up 1 (if candidate doesn’t address starvation): “What problem could arise with the algorithm you didn’t choose?”
Follow-up 2 (if candidate still hasn’t mentioned response time): “From the user’s perspective, how would response time be affected?”
Transition condition: the candidate has compared at least one trade-off, OR max follow-ups are exhausted, OR the time budget is exceeded.
Closing (60 s)
“Thank you. That concludes your oral assessment. Your responses will be reviewed and you’ll receive your results within 5 working days.”
10.3 Compiled Domain Specification (ExamRuntimeIR)
Note on schema conformance: The examples in this section use a simplified authoring-level representation for readability. The canonical TypeScript schema is defined in 02-schema.md. Key field name mappings:
| Example field | Canonical schema field | Section |
|---|---|---|
| type (on nodes) | kind | §3 ExamRuntimeNodeKind |
| questionStem | promptSeed | §4 ExamRuntimeNode |
| timeBudgetSeconds | timeBudgetMs (milliseconds) | §4 |
| followUps (inline array) | followUpPolicy + candidateCommands | §10, §13 |
| guardrails.forbidden | candidateCommands.forbidden | §13 |
| transitionPolicy.allowedTargets | transitions array | §11 |
| irVersion: "1.0.0" | "exam-runtime-ir/0.1" | §09 |

The event stream (§10.5) and evidence ledger (§10.6) use canonical field names.
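The renames in the mapping above can be sketched as a small conversion function. This is a minimal illustration, not the canonical schema: the type names `SimplifiedNode` and `CanonicalNodeFields` and the function `toCanonicalFields` are hypothetical, only three of the mapped fields are covered, and keeping `nodeId` unchanged is an assumption. The definitions in 02-schema.md remain authoritative.

```typescript
// Hypothetical shapes for illustration; only the field names come from the mapping.
interface SimplifiedNode {
  nodeId: string;
  type: string;               // simplified name for the canonical `kind`
  questionStem?: string;      // simplified name for the canonical `promptSeed`
  timeBudgetSeconds?: number; // canonical form is `timeBudgetMs` (milliseconds)
}

interface CanonicalNodeFields {
  nodeId: string; // assumed to be unchanged between the two representations
  kind: string;
  promptSeed?: string;
  timeBudgetMs?: number;
}

// Renames the simplified fields and converts seconds to milliseconds.
function toCanonicalFields(node: SimplifiedNode): CanonicalNodeFields {
  const out: CanonicalNodeFields = { nodeId: node.nodeId, kind: node.type };
  if (node.questionStem !== undefined) out.promptSeed = node.questionStem;
  if (node.timeBudgetSeconds !== undefined) {
    out.timeBudgetMs = node.timeBudgetSeconds * 1000;
  }
  return out;
}
```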
{
"irVersion": "exam-runtime-ir/0.1",
"examId": "cs301-oral-2026s1-001",
"version": "3",
"metadata": {
"courseCode": "CS301",
"courseName": "Operating Systems",
"assessmentType": "interactive_oral",
"durationMinutes": 10,
"language": "en-GB",
"examinerPersona": {
"tone": "supportive_encouraging",
"style": "asks_for_clarification_when_vague"
}
},
"timeBudget": {
"totalSeconds": 600,
"nodeBudgets": {
"opening": 60,
"q1": 180,
"q2": 240,
"closing": 60,
"end": 0
},
"overrunPolicy": "warn_at_80pct_hard_at_100pct"
},
"nodes": [
{
"nodeId": "opening",
"type": "opening",
"prompt": "Welcome to your oral assessment for CS301. I'll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let's begin.",
"transitions": [
{
"target": "q1",
"condition": "always",
"trigger": "examiner_action"
}
]
},
{
"nodeId": "q1",
"type": "question",
"questionStem": "Can you explain what process scheduling means in the context of an operating system, and why it matters?",
"maxFollowUps": 2,
"timeBudgetSeconds": 240,
"learningOutcomes": ["LO-1"],
"evidenceTargets": [
{
"id": "ev-q1-scheduling-concept",
"description": "Explains what process scheduling is",
"rubric": "Mentions CPU allocation, multiprogramming context",
"level": "required"
},
{
"id": "ev-q1-preemptive-cooperative",
"description": "Distinguishes preemptive vs cooperative scheduling",
"rubric": "Defines both, gives example or explains trade-off",
"level": "expected"
},
{
"id": "ev-q1-context-switch",
"description": "Mentions context switch cost or mechanism",
"rubric": "Explains save/restore state, overhead awareness",
"level": "expected"
}
],
"followUps": [
{
"followUpId": "q1-fu1",
"ordinal": 1,
"prompt": "You mentioned scheduling. Can you elaborate on the difference between preemptive and cooperative scheduling?",
"triggerCondition": "answer_is_vague OR missing_preemptive_cooperative",
"evidenceTargets": ["ev-q1-preemptive-cooperative"]
},
{
"followUpId": "q1-fu2",
"ordinal": 2,
"prompt": "How does context switching fit into this picture?",
"triggerCondition": "missing_context_switch",
"evidenceTargets": ["ev-q1-context-switch"]
}
],
"transitionPolicy": {
"allowedTargets": ["q2"],
"decisionMode": "runtime_approval",
"conditions": [
{
"id": "q1-sufficient",
"description": "Sufficient evidence collected OR follow-ups exhausted OR time budget exceeded",
"expression": "evidence_covered(['ev-q1-scheduling-concept']) AND (evidence_covered(['ev-q1-preemptive-cooperative','ev-q1-context-switch']) OR follow_up_count >= maxFollowUps OR time_budget_exceeded)"
}
]
},
"guardrails": {
"forbidden": ["reveal_rubric", "reveal_score", "suggest_answer"],
"forbidden_topics": ["exam_format_policy", "grading_threshold"],
"maxCandidateSilenceSeconds": 15,
"silenceAction": "gentle_prompt"
}
},
{
"nodeId": "q2",
"type": "question",
"questionStem": "Suppose you're designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?",
"maxFollowUps": 2,
"timeBudgetSeconds": 300,
"learningOutcomes": ["LO-2"],
"evidenceTargets": [
{
"id": "ev-q2-algorithm-choice",
"description": "Chooses an algorithm and provides a rationale",
"rubric": "Names algorithm, links to interactive desktop context",
"level": "required"
},
{
"id": "ev-q2-starvation",
"description": "Addresses starvation risk of the other algorithm",
"rubric": "Explains SJF starvation scenario or RR fairness trade-off",
"level": "expected"
},
{
"id": "ev-q2-response-time",
"description": "Considers response time from user perspective",
"rubric": "Mentions interactive responsiveness, jitter, or latency",
"level": "expected"
}
],
"followUps": [
{
"followUpId": "q2-fu1",
"ordinal": 1,
"prompt": "What problem could arise with the algorithm you didn't choose?",
"triggerCondition": "missing_starvation",
"evidenceTargets": ["ev-q2-starvation"]
},
{
"followUpId": "q2-fu2",
"ordinal": 2,
"prompt": "From the user's perspective, how would response time be affected?",
"triggerCondition": "missing_response_time",
"evidenceTargets": ["ev-q2-response-time"]
}
],
"transitionPolicy": {
"allowedTargets": ["closing"],
"decisionMode": "runtime_approval",
"conditions": [
{
"id": "q2-sufficient",
"description": "Sufficient evidence collected OR follow-ups exhausted OR time budget exceeded",
"expression": "evidence_covered(['ev-q2-algorithm-choice']) AND (evidence_covered(['ev-q2-starvation','ev-q2-response-time']) OR follow_up_count >= maxFollowUps OR time_budget_exceeded)"
}
]
},
"guardrails": {
"forbidden": ["reveal_rubric", "reveal_score", "suggest_answer"],
"forbidden_topics": ["exam_format_policy", "grading_threshold"],
"maxCandidateSilenceSeconds": 15,
"silenceAction": "gentle_prompt"
}
},
{
"nodeId": "closing",
"type": "closing",
"prompt": "Thank you. That concludes your oral assessment. Your responses will be reviewed and you'll receive your results within 5 working days.",
"transitions": [
{
"target": "end",
"condition": "always",
"trigger": "examiner_action"
}
]
},
{
"nodeId": "end",
"type": "end",
"postExamAction": "trigger_marking_runtime"
}
],
"candidateCommands": {
"repeat": {
"description": "Candidate asks examiner to repeat the current question or last statement",
"runtimeAction": "re_prompt_current",
"costsFollowUp": false,
"maxPerNode": 3
},
"clarification": {
"description": "Candidate asks for clarification of a term or concept",
"runtimeAction": "llm_provides_clarification_within_guardrails",
"costsFollowUp": false,
"maxPerNode": 2,
"guardrail": "must_not_reveal_answer_or_rubric"
},
"raise_hand": {
"description": "Candidate signals they need a moment (pause timer)",
"runtimeAction": "pause_time_budget",
"costsFollowUp": false,
"maxPerNode": 2,
"pauseDurationSeconds": 10
}
}
}
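To make the q1-sufficient condition concrete, the sketch below evaluates it against a snapshot of node state. It follows the condition's description field (sufficient evidence OR follow-ups exhausted OR time budget exceeded). The `NodeState` type and both function names are hypothetical. Whether evidence_covered over several targets means "any" or "all" is not stated in this chapter, so the helper takes it as a parameter; "any" is used for the expected targets to match the "at least one of" wording in §10.2.

```typescript
// Hypothetical snapshot of one question node's runtime state.
interface NodeState {
  evidenceCovered: string[];
  followUpCount: number;
  maxFollowUps: number;
  timeBudgetExceeded: boolean;
}

type CoverageMode = "any" | "all";

// Assumption: the any/all semantics are chosen by the caller, not defined by the IR.
function isEvidenceCovered(state: NodeState, targetIds: string[], mode: CoverageMode): boolean {
  const isCovered = (id: string): boolean => state.evidenceCovered.includes(id);
  return mode === "all" ? targetIds.every(isCovered) : targetIds.some(isCovered);
}

// Sufficient evidence OR follow-ups exhausted OR time budget exceeded.
function q1Sufficient(state: NodeState): boolean {
  const sufficientEvidence =
    isEvidenceCovered(state, ["ev-q1-scheduling-concept"], "all") &&
    isEvidenceCovered(state, ["ev-q1-preemptive-cooperative", "ev-q1-context-switch"], "any");
  return (
    sufficientEvidence ||
    state.followUpCount >= state.maxFollowUps ||
    state.timeBudgetExceeded
  );
}
```

A satisfied condition only makes the transition eligible. With decisionMode set to runtime_approval, the runtime still decides when to move on.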
10.4 Simplified Pipecat Adapter Output
The adapter compiles the domain specification into a Pipecat FlowManager-compatible configuration. The runtime controller layer sits between them, so this is an intermediate representation, not the source of truth.
{
"flow": {
"initial_node": "opening",
"nodes": {
"opening": {
"task_messages": [
{
"role": "system",
"content": "You are an oral exam examiner. Be supportive and encouraging. ..."
},
{
"role": "assistant",
"content": "Welcome to your oral assessment for CS301. I'll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let's begin."
}
],
"pre_actions": [
{
"type": "emit_event",
"event": { "type": "node_entered", "nodeId": "opening" }
}
],
"post_actions": [
{
"type": "emit_event",
"event": { "type": "node_exited", "nodeId": "opening" }
}
],
"edges": [
{
"target": "q1",
"transition_to": "q1",
"interruptible": false
}
]
},
"q1": {
"task_messages": [
{
"role": "system",
"content": "You are an oral exam examiner assessing LO-1: Process scheduling. Ask the stem question, then listen carefully. You may ask up to 2 follow-ups if the answer is vague or incomplete. Never reveal the rubric or suggest answers. If the candidate says 'repeat', re-ask the question. If they say 'clarification', explain a term without giving the answer. If they say 'raise_hand', wait 10 seconds. ..."
},
{
"role": "assistant",
"content": "Can you explain what process scheduling means in the context of an operating system, and why it matters?"
}
],
"pre_actions": [
{
"type": "emit_event",
"event": { "type": "node_entered", "nodeId": "q1" }
},
{
"type": "set_runtime_state",
"key": "currentNode.followUpCount",
"value": 0
}
],
"post_actions": [
{
"type": "emit_event",
"event": { "type": "node_exited", "nodeId": "q1" }
}
],
"edges": [
{
"target": "q2",
"transition_to": "q2",
"interruptible": false,
"condition": "runtime_approves_transition"
}
],
"runtime_config": {
"maxFollowUps": 2,
"timeBudgetSeconds": 240,
"evidenceTargets": [
"ev-q1-scheduling-concept",
"ev-q1-preemptive-cooperative",
"ev-q1-context-switch"
]
}
},
"q2": {
"task_messages": [
{
"role": "system",
"content": "You are an oral exam examiner assessing LO-2: Scheduling algorithm comparison. Ask the stem question. You may ask up to 2 follow-ups. ..."
},
{
"role": "assistant",
"content": "Suppose you're designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?"
}
],
"pre_actions": [
{
"type": "emit_event",
"event": { "type": "node_entered", "nodeId": "q2" }
},
{
"type": "set_runtime_state",
"key": "currentNode.followUpCount",
"value": 0
}
],
"post_actions": [
{
"type": "emit_event",
"event": { "type": "node_exited", "nodeId": "q2" }
}
],
"edges": [
{
"target": "closing",
"transition_to": "closing",
"interruptible": false,
"condition": "runtime_approves_transition"
}
],
"runtime_config": {
"maxFollowUps": 2,
"timeBudgetSeconds": 300,
"evidenceTargets": [
"ev-q2-algorithm-choice",
"ev-q2-starvation",
"ev-q2-response-time"
]
}
},
"closing": {
"task_messages": [
{
"role": "system",
"content": "You are concluding the oral exam. Deliver the closing statement. Do not discuss performance."
},
{
"role": "assistant",
"content": "Thank you. That concludes your oral assessment. Your responses will be reviewed and you'll receive your results within 5 working days."
}
],
"pre_actions": [
{
"type": "emit_event",
"event": { "type": "node_entered", "nodeId": "closing" }
}
],
"post_actions": [
{
"type": "emit_event",
"event": { "type": "node_exited", "nodeId": "closing" }
}
],
"edges": [
{
"target": "end",
"transition_to": "end",
"interruptible": false
}
]
},
"end": {
"task_messages": [],
"post_actions": [
{
"type": "emit_event",
"event": { "type": "exam_completed" }
},
{
"type": "trigger_marking_runtime",
"payload": { "source": "exam_runtime" }
}
]
}
}
}
}
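The repeated parts of the adapter output above (the entry and exit events, and the edges) follow a regular pattern that can be sketched in a few lines. The helpers `lifecycleActions` and `buildEdges` are hypothetical and are not Pipecat APIs; they only reproduce the JSON shape shown in this section, under the assumption that question nodes are the ones whose edges carry the runtime_approves_transition condition.

```typescript
// Shapes copied from the adapter output above; helper names are illustrative only.
interface EmitEventAction {
  type: "emit_event";
  event: { type: string; nodeId: string };
}

interface FlowEdge {
  target: string;
  transition_to: string;
  interruptible: boolean;
  condition?: string;
}

// Every node in the example emits node_entered on entry and node_exited on exit.
function lifecycleActions(nodeId: string): {
  pre_actions: EmitEventAction[];
  post_actions: EmitEventAction[];
} {
  return {
    pre_actions: [{ type: "emit_event", event: { type: "node_entered", nodeId } }],
    post_actions: [{ type: "emit_event", event: { type: "node_exited", nodeId } }],
  };
}

// Edges out of question nodes are gated on runtime approval; others are not.
function buildEdges(allowedTargets: string[], needsRuntimeApproval: boolean): FlowEdge[] {
  return allowedTargets.map((target) => {
    const edge: FlowEdge = { target, transition_to: target, interruptible: false };
    if (needsRuntimeApproval) edge.condition = "runtime_approves_transition";
    return edge;
  });
}
```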
10.5 Runtime Event Stream
The following is the chronological event stream for a realistic exam session.
Timestamps are relative to exam start (T=0). Some events are omitted for brevity;
a … indicates a gap.
T+0.000s bot_ready
{ "examId": "cs301-oral-2026s1-001", "sessionId": "sess-7a3f" }
T+0.120s node_entered
{ "nodeId": "opening", "nodeType": "opening" }
T+0.200s transcript_delta
{ "nodeId": "opening", "speaker": "examiner",
"text": "Welcome to your oral assessment for CS301.", "isFinal": false }
T+2.400s transcript_final
{ "nodeId": "opening", "speaker": "examiner",
"text": "Welcome to your oral assessment for CS301. I'll ask you two questions about operating systems. Feel free to ask me to repeat or clarify at any time. You may raise your hand if you need a moment. Let's begin.",
"spanId": "sp-001" }
T+2.401s node_exited
{ "nodeId": "opening" }
T+2.402s node_entered
{ "nodeId": "q1", "nodeType": "question" }
T+2.500s node_progress
{ "nodeId": "q1", "followUpCount": 0, "maxFollowUps": 2,
"evidenceCovered": [], "timeBudgetRemainingSeconds": 240 }
T+2.600s transcript_delta
{ "nodeId": "q1", "speaker": "examiner",
"text": "Can you explain what process scheduling means", "isFinal": false }
T+5.800s transcript_final
{ "nodeId": "q1", "speaker": "examiner",
"text": "Can you explain what process scheduling means in the context of an operating system, and why it matters?",
"spanId": "sp-002" }
T+8.200s transcript_delta
{ "nodeId": "q1", "speaker": "candidate",
"text": "Process scheduling is when the OS decides", "isFinal": false }
T+14.600s transcript_final
{ "nodeId": "q1", "speaker": "candidate",
"text": "Process scheduling is when the operating system decides which process gets to use the CPU at any given time. It's important because there are usually more processes than CPUs, so the OS has to manage sharing. It also helps with making sure important tasks get done first.",
"spanId": "sp-003" }
T+14.700s evidence_signal
{ "nodeId": "q1", "evidenceTargetId": "ev-q1-scheduling-concept",
"transcriptSpanId": "sp-003",
"signal": "covered",
"confidence": 0.92,
"rationale": "Candidate described CPU allocation and multiprogramming context." }
T+14.701s node_progress
{ "nodeId": "q1", "followUpCount": 0, "maxFollowUps": 2,
"evidenceCovered": ["ev-q1-scheduling-concept"],
"timeBudgetRemainingSeconds": 225 }
T+14.800s transition_decision
{ "nodeId": "q1", "decision": "follow_up",
"reason": "Evidence for ev-q1-preemptive-cooperative and ev-q1-context-switch not yet covered. Follow-up 1 trigger condition 'missing_preemptive_cooperative' met.",
"followUpOrdinal": 1 }
T+15.000s transcript_delta
{ "nodeId": "q1", "speaker": "examiner",
"text": "You mentioned scheduling. Can you elaborate", "isFinal": false }
T+17.200s transcript_final
{ "nodeId": "q1", "speaker": "examiner",
"text": "You mentioned scheduling. Can you elaborate on the difference between preemptive and cooperative scheduling?",
"spanId": "sp-004" }
T+17.300s node_progress
{ "nodeId": "q1", "followUpCount": 1, "maxFollowUps": 2,
"evidenceCovered": ["ev-q1-scheduling-concept"],
"timeBudgetRemainingSeconds": 223 }
T+19.800s transcript_delta
{ "nodeId": "q1", "speaker": "candidate",
"text": "Sorry, can you repeat the question?", "isFinal": false }
T+21.400s transcript_final
{ "nodeId": "q1", "speaker": "candidate",
"text": "Sorry, can you repeat the question?",
"spanId": "sp-005" }
T+21.500s candidate_command
{ "nodeId": "q1", "command": "repeat",
"triggeredBy": "candidate_utterance",
"rawText": "Sorry, can you repeat the question?",
"costsFollowUp": false,
"followUpCountAfter": 1 }
T+21.600s transcript_delta
{ "nodeId": "q1", "speaker": "examiner",
"text": "Of course. I'm asking about the difference", "isFinal": false }
T+23.800s transcript_final
{ "nodeId": "q1", "speaker": "examiner",
"text": "Of course. I'm asking about the difference between preemptive and cooperative scheduling. In preemptive scheduling, the OS can interrupt a running process. In cooperative scheduling, the process must voluntarily yield. Can you tell me more about that?",
"spanId": "sp-006" }
T+23.900s node_progress
{ "nodeId": "q1", "followUpCount": 1, "maxFollowUps": 2,
"evidenceCovered": ["ev-q1-scheduling-concept"],
"timeBudgetRemainingSeconds": 216 }
T+27.100s transcript_delta
{ "nodeId": "q1", "speaker": "candidate",
"text": "Right, so in preemptive the OS can stop", "isFinal": false }
T+36.500s transcript_final
{ "nodeId": "q1", "speaker": "candidate",
"text": "Right, so in preemptive scheduling the OS can stop a process at any time and switch to another one. This is what most modern OSes like Linux and Windows use. Cooperative scheduling is where the process has to give up control itself, like in older versions of Windows where if a program froze, the whole system could hang.",
"spanId": "sp-007" }
T+36.600s evidence_signal
{ "nodeId": "q1", "evidenceTargetId": "ev-q1-preemptive-cooperative",
"transcriptSpanId": "sp-007",
"signal": "covered",
"confidence": 0.95,
"rationale": "Defined both types with real-world OS examples." }
T+36.601s node_progress
{ "nodeId": "q1", "followUpCount": 1, "maxFollowUps": 2,
"evidenceCovered": ["ev-q1-scheduling-concept", "ev-q1-preemptive-cooperative"],
"timeBudgetRemainingSeconds": 203 }
T+36.700s transition_decision
{ "nodeId": "q1", "decision": "follow_up",
"reason": "ev-q1-context-switch still not covered. Follow-up 2 trigger condition 'missing_context_switch' met.",
"followUpOrdinal": 2 }
T+37.000s transcript_final
{ "nodeId": "q1", "speaker": "examiner",
"text": "Great examples. How does context switching fit into this picture?",
"spanId": "sp-008" }
T+37.100s node_progress
{ "nodeId": "q1", "followUpCount": 2, "maxFollowUps": 2,
"evidenceCovered": ["ev-q1-scheduling-concept", "ev-q1-preemptive-cooperative"],
"timeBudgetRemainingSeconds": 203 }
T+42.300s transcript_final
{ "nodeId": "q1", "speaker": "candidate",
"text": "Context switching is when the OS saves the state of the current process and loads the state of the next one. It's the mechanism that makes preemptive scheduling possible. But it has overhead — saving registers, updating memory maps — so you don't want to do it too frequently.",
"spanId": "sp-009" }
T+42.400s evidence_signal
{ "nodeId": "q1", "evidenceTargetId": "ev-q1-context-switch",
"transcriptSpanId": "sp-009",
"signal": "covered",
"confidence": 0.90,
"rationale": "Described save/restore mechanism and overhead awareness." }
T+42.401s node_progress
{ "nodeId": "q1", "followUpCount": 2, "maxFollowUps": 2,
"evidenceCovered": ["ev-q1-scheduling-concept", "ev-q1-preemptive-cooperative", "ev-q1-context-switch"],
"timeBudgetRemainingSeconds": 197 }
T+42.500s transition_decision
{ "nodeId": "q1", "decision": "move_to_next_node",
"reason": "All expected evidence covered. Transition condition 'q1-sufficient' satisfied.",
"targetNodeId": "q2" }
T+42.600s node_exited
{ "nodeId": "q1" }
T+42.700s node_entered
{ "nodeId": "q2", "nodeType": "question" }
T+42.800s node_progress
{ "nodeId": "q2", "followUpCount": 0, "maxFollowUps": 2,
"evidenceCovered": [], "timeBudgetRemainingSeconds": 240 }
T+43.000s transcript_final
{ "nodeId": "q2", "speaker": "examiner",
"text": "Suppose you're designing a scheduler for an interactive desktop OS. Would you choose Round Robin or Shortest Job First? Why?",
"spanId": "sp-010" }
…
T+120.400s candidate_command
{ "nodeId": "q2", "command": "raise_hand",
"triggeredBy": "candidate_utterance",
"rawText": "Can I have a moment to think?",
"costsFollowUp": false,
"followUpCountAfter": 0,
"pauseDurationSeconds": 10 }
T+120.500s time_budget_paused
{ "nodeId": "q2", "pauseUntil": "T+130.400" }
T+130.400s time_budget_resumed
{ "nodeId": "q2", "timeBudgetRemainingSeconds": 170 }
…
T+240.000s time_budget_exceeded
{ "nodeId": "q2" }
T+240.100s transition_decision
{ "nodeId": "q2", "decision": "move_to_next_node",
"reason": "Time budget exceeded. Evidence collected: ev-q2-algorithm-choice (covered), ev-q2-starvation (covered), ev-q2-response-time (not covered). Hard move enforced by overrun policy.",
"targetNodeId": "closing" }
T+240.200s node_exited
{ "nodeId": "q2" }
T+240.300s node_entered
{ "nodeId": "closing", "nodeType": "closing" }
T+241.000s transcript_final
{ "nodeId": "closing", "speaker": "examiner",
"text": "Thank you. That concludes your oral assessment. Your responses will be reviewed and you'll receive your results within 5 working days.",
"spanId": "sp-020" }
T+243.000s node_exited
{ "nodeId": "closing" }
T+243.100s node_entered
{ "nodeId": "end", "nodeType": "end" }
T+243.200s exam_completed
{ "examId": "cs301-oral-2026s1-001", "sessionId": "sess-7a3f",
"totalDurationSeconds": 243.2,
"nodesVisited": ["opening", "q1", "q2", "closing", "end"],
"totalFollowUpsUsed": 3 }
10.6 Evidence Ledger
The evidence ledger is populated from evidence_signal events and persisted by
the event store. It is the primary input to the marking runtime.
{
"examId": "cs301-oral-2026s1-001",
"sessionId": "sess-7a3f",
"candidateId": "stu-202400042",
"generatedAt": "2026-05-06T02:07:43.200Z",
"entries": [
{
"evidenceTargetId": "ev-q1-scheduling-concept",
"nodeId": "q1",
"learningOutcome": "LO-1",
"signal": "covered",
"confidence": 0.92,
"transcriptSpanIds": ["sp-003"],
"transcriptExcerpt": "Process scheduling is when the operating system decides which process gets to use the CPU at any given time. It's important because there are usually more processes than CPUs, so the OS has to manage sharing.",
"rationale": "Candidate described CPU allocation and multiprogramming context.",
"timestamp": "T+14.700s"
},
{
"evidenceTargetId": "ev-q1-preemptive-cooperative",
"nodeId": "q1",
"learningOutcome": "LO-1",
"signal": "covered",
"confidence": 0.95,
"transcriptSpanIds": ["sp-007"],
"transcriptExcerpt": "In preemptive scheduling the OS can stop a process at any time and switch to another one. This is what most modern OSes like Linux and Windows use. Cooperative scheduling is where the process has to give up control itself.",
"rationale": "Defined both types with real-world OS examples.",
"timestamp": "T+36.600s"
},
{
"evidenceTargetId": "ev-q1-context-switch",
"nodeId": "q1",
"learningOutcome": "LO-1",
"signal": "covered",
"confidence": 0.90,
"transcriptSpanIds": ["sp-009"],
"transcriptExcerpt": "Context switching is when the OS saves the state of the current process and loads the state of the next one. It's the mechanism that makes preemptive scheduling possible. But it has overhead.",
"rationale": "Described save/restore mechanism and overhead awareness.",
"timestamp": "T+42.400s"
},
{
"evidenceTargetId": "ev-q2-algorithm-choice",
"nodeId": "q2",
"learningOutcome": "LO-2",
"signal": "covered",
"confidence": 0.88,
"transcriptSpanIds": ["sp-012"],
"transcriptExcerpt": "I would choose Round Robin because it gives each process a fair time slice, which means the system stays responsive to user input even under load.",
"rationale": "Named algorithm, linked to interactive context.",
"timestamp": "T+62.300s"
},
{
"evidenceTargetId": "ev-q2-starvation",
"nodeId": "q2",
"learningOutcome": "LO-2",
"signal": "covered",
"confidence": 0.85,
"transcriptSpanIds": ["sp-015"],
"transcriptExcerpt": "SJF can lead to starvation if short jobs keep arriving — a long job might never get scheduled. That's why it's not great for interactive systems where any task could need attention.",
"rationale": "Explained starvation scenario linked to interactive context.",
"timestamp": "T+98.700s"
},
{
"evidenceTargetId": "ev-q2-response-time",
"nodeId": "q2",
"learningOutcome": "LO-2",
"signal": "not_covered",
"confidence": null,
"transcriptSpanIds": [],
"transcriptExcerpt": null,
"rationale": "Time budget exhausted before evidence could be collected.",
"timestamp": "T+240.100s"
}
],
"summary": {
"totalTargets": 6,
"covered": 5,
"notCovered": 1,
"coverageRate": 0.833
}
}
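The summary block can be recomputed from the entries, which is a useful consistency check before the ledger is handed to marking. A minimal sketch: `LedgerEntry` here is a hypothetical subset of the entry shape above, `summarizeLedger` is an illustrative name, and rounding the rate to three decimals is inferred from the 0.833 in the example rather than from a stated rule.

```typescript
// Hypothetical subset of a ledger entry; only the fields needed for the summary.
interface LedgerEntry {
  evidenceTargetId: string;
  signal: "covered" | "not_covered";
}

interface LedgerSummary {
  totalTargets: number;
  covered: number;
  notCovered: number;
  coverageRate: number;
}

function summarizeLedger(entries: LedgerEntry[]): LedgerSummary {
  const totalTargets = entries.length;
  const covered = entries.filter((e) => e.signal === "covered").length;
  // Assumption: the rate is rounded to three decimal places, as in the example.
  const coverageRate =
    totalTargets === 0 ? 0 : Math.round((covered / totalTargets) * 1000) / 1000;
  return { totalTargets, covered, notCovered: totalTargets - covered, coverageRate };
}
```

For the six entries above (five covered, one not covered) this yields the same totals and a coverageRate of 0.833.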
10.7 Candidate Command Example
Scenario: Candidate asks to repeat the question
The candidate says: “Sorry, can you repeat the question?”
Runtime behaviour:
- STT produces transcript_final with the candidate’s utterance (span sp-005).
- The runtime controller’s command classifier detects the repeat intent.
- A candidate_command event is emitted (see §10.5, T+21.500s).
- The runtime does NOT increment followUpCount: this is a repeat request, not a substantive answer attempt.
- The LLM re-asks the current follow-up prompt with slight rephrasing.
- A new transcript_final is emitted for the examiner’s re-prompt (span sp-006).
Key invariant: followUpCount remains at 1 after the repeat. The candidate’s
ability to answer is not penalised by a repeat request.
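The invariant can be expressed as a small state update driven by the costsFollowUp and maxPerNode fields from candidateCommands. This is a sketch under stated assumptions: the types and `applyCandidateCommand` are hypothetical, and ignoring a command once maxPerNode is reached is an assumption, since this chapter does not say what the runtime does in that case.

```typescript
// Field names costsFollowUp and maxPerNode come from candidateCommands in §10.3.
interface CommandPolicy {
  costsFollowUp: boolean;
  maxPerNode: number;
}

// Hypothetical per-node state tracked by the runtime controller.
interface NodeCommandState {
  followUpCount: number;
  commandUses: Record<string, number>;
}

interface CommandResult {
  accepted: boolean;
  state: NodeCommandState;
}

function applyCandidateCommand(
  state: NodeCommandState,
  command: string,
  policy: CommandPolicy,
): CommandResult {
  const used = state.commandUses[command] ?? 0;
  // Assumption: a command beyond maxPerNode is ignored and the state is unchanged.
  if (used >= policy.maxPerNode) {
    return { accepted: false, state };
  }
  return {
    accepted: true,
    state: {
      // Only commands that cost a follow-up change the count.
      followUpCount: state.followUpCount + (policy.costsFollowUp ? 1 : 0),
      commandUses: { ...state.commandUses, [command]: used + 1 },
    },
  };
}
```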
Scenario: Candidate asks for clarification of a term
The candidate says: “What do you mean by starvation?”
Runtime behaviour:
- Command classifier detects clarification intent.
- candidate_command event emitted with command: "clarification".
- LLM provides a brief, rubric-safe explanation, e.g., “Starvation means a process waits indefinitely because shorter jobs keep getting priority.”
- The LLM must NOT say “That’s exactly what I’m looking for” or hint at the rubric (guardrail enforced).
- followUpCount is NOT incremented.
10.8 Guardrail Example
Scenario: LLM attempts to reveal the rubric
During Q1 follow-up, the LLM’s generated response includes:
“That’s a good point! You’ve actually covered the preemptive vs cooperative distinction, which is one of the key rubric items for this question.”
Runtime controller response:
- The LLM output is intercepted by the guardrail layer before TTS delivery.
- The guardrail checks against forbidden: ["reveal_rubric"].
- The response is blocked: TTS does not play it to the candidate.
- A guardrail_violation event is emitted:
{
"type": "guardrail_violation",
"nodeId": "q1",
"rule": "reveal_rubric",
"severity": "blocked",
"originalText": "That's a good point! You've actually covered the preemptive vs cooperative distinction, which is one of the key rubric items for this question.",
"replacementAction": "regenerate_response",
"timestamp": "T+38.200s"
}
- The LLM is prompted to regenerate without rubric references.
- The regenerated response plays to the candidate.
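The interception step in this scenario can be sketched as a check that runs on the LLM text before it reaches TTS. How the real guardrail layer detects a rubric reveal is not described here, so the keyword pattern below is a deliberately naive stand-in, and `RULE_PATTERNS` and `checkBeforeTts` are hypothetical names. Only the shape of the returned event follows the guardrail_violation example above (the timestamp is left to the caller).

```typescript
// Event shape taken from the guardrail_violation example, minus the timestamp.
interface GuardrailViolation {
  type: "guardrail_violation";
  nodeId: string;
  rule: string;
  severity: "blocked";
  originalText: string;
  replacementAction: "regenerate_response";
}

// Assumption: a keyword match stands in for whatever classifier is actually used.
const RULE_PATTERNS: Record<string, RegExp> = {
  reveal_rubric: /\brubric\b/i,
};

// Returns a violation event if the text breaks a forbidden rule, else null.
function checkBeforeTts(
  nodeId: string,
  text: string,
  forbidden: string[],
): GuardrailViolation | null {
  for (const rule of forbidden) {
    const pattern = RULE_PATTERNS[rule];
    if (pattern !== undefined && pattern.test(text)) {
      return {
        type: "guardrail_violation",
        nodeId,
        rule,
        severity: "blocked",
        originalText: text,
        replacementAction: "regenerate_response",
      };
    }
  }
  return null; // Safe to hand to TTS.
}
```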
Scenario: LLM attempts to transition to an unauthorised node
The LLM, after Q1, attempts to generate text that would skip Q2 and go to closing:
“Excellent work! Let’s wrap up the assessment.”
Runtime controller response:
- The runtime detects the LLM is attempting to exit Q1.
- Q1’s transitionPolicy.allowedTargets is ["q2"].
- The LLM does not have authority to decide the transition; only the runtime can approve move_to_next_node.
- The runtime blocks the premature closing and re-injects the Q2 stem:
{
"type": "guardrail_violation",
"nodeId": "q1",
"rule": "unauthorized_transition",
"severity": "blocked",
"originalIntent": "move_to_closing",
"allowedTargets": ["q2"],
"replacementAction": "inject_next_question",
"timestamp": "T+42.800s"
}
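The target check in this scenario reduces to a membership test against allowedTargets. The sketch below covers only that part: it does not model the runtime's evidence-based approval, `checkTransition` and both types are hypothetical, and deriving originalIntent as "move_to_" plus the requested node is an assumption based on the single example above.

```typescript
// Event shape taken from the unauthorized_transition example, minus the timestamp.
interface TransitionViolation {
  type: "guardrail_violation";
  nodeId: string;
  rule: "unauthorized_transition";
  severity: "blocked";
  originalIntent: string;
  allowedTargets: string[];
  replacementAction: "inject_next_question";
}

type TransitionCheck =
  | { approved: true; targetNodeId: string }
  | { approved: false; violation: TransitionViolation };

// Approves a requested target only if the current node's policy lists it.
function checkTransition(
  nodeId: string,
  requestedTarget: string,
  allowedTargets: string[],
): TransitionCheck {
  if (allowedTargets.includes(requestedTarget)) {
    return { approved: true, targetNodeId: requestedTarget };
  }
  return {
    approved: false,
    violation: {
      type: "guardrail_violation",
      nodeId,
      rule: "unauthorized_transition",
      severity: "blocked",
      // Assumption: the intent label is built from the requested node id.
      originalIntent: `move_to_${requestedTarget}`,
      allowedTargets,
      replacementAction: "inject_next_question",
    },
  };
}
```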
10.9 markRuntime Input Excerpt
After exam_completed fires, the marking runtime receives a structured input
package. Below is a simplified excerpt.
{
"inputVersion": "1.0.0",
"examId": "cs301-oral-2026s1-001",
"sessionId": "sess-7a3f",
"candidateId": "stu-202400042",
"examRuntimeVersion": "1.0.0",
"evidenceLedger": {
// ... full ledger as shown in §10.6 ...
},
"transcript": {
"totalSpans": 20,
"fullText": "…", // concatenated transcript with span IDs
"spans": [
// Each span: { spanId, nodeId, speaker, text, startTime, endTime }
]
},
"runtimeAudit": {
"nodesVisited": ["opening", "q1", "q2", "closing", "end"],
"totalDurationSeconds": 243.2,
"followUpsUsed": {
"q1": 2,
"q2": 1
},
"transitionDecisions": [
{
"nodeId": "q1",
"decision": "move_to_next_node",
"reason": "All expected evidence covered.",
"timestamp": "T+42.500s"
},
{
"nodeId": "q2",
"decision": "move_to_next_node",
"reason": "Time budget exceeded.",
"timestamp": "T+240.100s"
}
],
"candidateCommandsUsed": [
{ "nodeId": "q1", "command": "repeat", "timestamp": "T+21.500s" },
{ "nodeId": "q2", "command": "raise_hand", "timestamp": "T+120.400s" }
],
"guardrailViolations": []
},
"irSnapshot": {
// Frozen copy of the ExamRuntimeIR used for this session
// Enables marking to reference the exact rubric/evidence targets
// that were active during the exam
}
}
Key properties of the marking input:
- Evidence ledger is pre-populated with LLM confidence scores and transcript excerpts. The marking runtime may confirm, override, or supplement these.
- Transcript spans are linked to evidence targets, enabling human markers to verify the AI’s evidence assessment.
- Runtime audit provides context: why transitions happened, which commands the candidate used, and whether any guardrail violations occurred.
- IR snapshot freezes the assessment definition so that marking always references the exact rubric that was in effect during the exam.
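The followUpsUsed counts in runtimeAudit can be cross-checked against the event stream by counting transition_decision events whose decision is follow_up. This is a minimal sketch that assumes events are flat objects with type, nodeId, and decision fields, as in the guardrail_violation examples of §10.8; the `RuntimeEvent` type and `countFollowUps` are hypothetical.

```typescript
// Hypothetical flat event shape; only the fields this check needs.
interface RuntimeEvent {
  type: string;
  nodeId?: string;
  decision?: string;
  command?: string;
}

// Counts follow-ups per node. Candidate commands are ignored by construction,
// which matches costsFollowUp: false for repeat, clarification, and raise_hand.
function countFollowUps(events: RuntimeEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    if (
      e.type === "transition_decision" &&
      e.decision === "follow_up" &&
      e.nodeId !== undefined
    ) {
      counts[e.nodeId] = (counts[e.nodeId] ?? 0) + 1;
    }
  }
  return counts;
}
```

Applied to the Q1 portion of §10.5 (two follow_up decisions, one repeat command, one move_to_next_node decision), the count for q1 is 2, in line with followUpsUsed above.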
10.10 Summary of Demonstrated Features
| Feature | Where Demonstrated |
|---|---|
| Node progress updates | node_progress events throughout §10.5 |
| Follow-up count runtime control | Q1 follow-ups 1→2, never exceeding maxFollowUps: 2 |
| Candidate repeat ≠ follow-up | T+21.500s candidate_command with costsFollowUp: false |
| Evidence signal → transcript span | evidence_signal with transcriptSpanId references |
| Transition decision with reason | transition_decision events at T+14.800s, T+36.700s, T+42.500s, T+240.100s |
| exam_completed → marking runtime | §10.9 structured input package |
| Guardrail blocking rubric reveal | §10.8 first scenario |
| Guardrail blocking unauthorised transition | §10.8 second scenario |
| Time budget enforcement | Q2 time budget exceeded at T+240.000s |
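Two rows of the table, the follow-up cap and "repeat ≠ follow-up", can be illustrated together. The sketch below assumes a simple per-node counter; the class and method names are hypothetical, and only `maxFollowUps` and `costsFollowUp` come from the specification.

```python
from dataclasses import dataclass


@dataclass
class FollowUpBudget:
    max_follow_ups: int
    used: int = 0

    def try_follow_up(self) -> bool:
        """Consume one follow-up if the budget allows it."""
        if self.used >= self.max_follow_ups:
            return False
        self.used += 1
        return True

    def on_candidate_command(self, command: str, costs_follow_up: bool) -> bool:
        """A candidate command consumes budget only when costsFollowUp is true."""
        return self.try_follow_up() if costs_follow_up else True


budget = FollowUpBudget(max_follow_ups=2)
budget.on_candidate_command("repeat", costs_follow_up=False)  # leaves the count at 0
first = budget.try_follow_up()    # count 0 -> 1
second = budget.try_follow_up()   # count 1 -> 2
third = budget.try_follow_up()    # refused: cap reached
```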
10.11 INFOSYS110 — Full IOA Example (University of Auckland)
Based on the INFOSYS110 Interactive Oral Assessment Handbook. This example demonstrates a multi-segment, scenario-based IOA for a first-year Business Information Systems course. It exercises persona consistency, rubric-level nudging, transversal skills, scaffolding, and the report_observation protocol.
10.11.1 Scenario
| Dimension | Value |
|---|---|
| Course | INFOSYS 110 — Business Information Systems (Stage I) |
| Duration | 20 minutes (4 segments × ~5 min each) |
| Language | English |
| Assessment type | Interactive oral — scenario-based conversation across 4 segments |
| Examiner persona | Hotel General Manager (professionally-focused scenario) |
| Core case | A hotel chain considering digital transformation of its operations |
| Max follow-ups per segment | 3 |
| Candidate commands | repeat, clarify, slow_down, pause, help |
| Scaffolding | Enabled — 2-minute practice conversation before exam starts |
| Transversal skills | critical_thinking, professional_communication, problem_solving |
10.11.2 Flow Shape
┌────────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌──────────┐   ┌─────┐
│ SCAFFOLDING│──▶│ SEGMENT_1 │──▶│ SEGMENT_2 │──▶│ SEGMENT_3 │──▶│ SEGMENT_4 │──▶│ CLOSING  │──▶│ END │
│ (practice) │   │ Digital   │   │ IS Roles  │   │ Data      │   │ Change &  │   │          │   │     │
│            │   │ Foundat.  │   │ & BI      │   │ Govern.   │   │ Loyalty   │   │          │   │     │
└────────────┘   └───────────┘   └───────────┘   └───────────┘   └───────────┘   └──────────┘   └─────┘
                   │ ≤3 FU         │ ≤3 FU         │ ≤3 FU         │ ≤3 FU
10.11.3 Compiled Specification — Key Nodes
{
"irVersion": "exam-runtime-ir/0.1",
"examId": "infosys110-ioa-2026s1-001",
"examVersion": 1,
"metadata": {
"courseCode": "INFOSYS110",
"courseName": "Business Information Systems",
"institution": "University of Auckland",
"assessmentType": "interactive_oral",
"durationMinutes": 20,
"language": "en-NZ",
"communicationStyleIsLearningOutcome": false,
"equivalentWrittenWordCount": 3000
},
"transversalSkills": [
{
"skillId": "critical_thinking",
"description": "Analyses information, evaluates alternatives, forms reasoned judgements"
},
{
"skillId": "professional_communication",
"description": "Communicates ideas clearly in a professional context"
},
{
"skillId": "problem_solving",
"description": "Identifies problems and proposes practical solutions"
}
],
"scaffolding": {
"enabled": true,
"scenario": "A brief practice conversation to familiarise you with the assessment format. This does NOT count toward your score.",
"conversationPrompt": "Let's do a quick practice. Imagine you're telling a friend about a new app you've been using. What does it do and why do you like it?",
"maxDurationSeconds": 120,
"feedbackEnabled": true
},
"nodes": [
{
"nodeId": "segment_1_digital_foundations",
"type": "scenario_segment",
"persona": "You are the General Manager of a mid-range hotel chain in New Zealand. You are meeting with a junior team member to discuss the hotel's digital transformation strategy.",
"scenario": "The hotel chain is considering upgrading its booking system, implementing a mobile check-in app, and introducing IoT sensors for room management. You want to understand the candidate's grasp of digital foundations and operational trade-offs.",
"conversationPrompt": "Thanks for coming in. We're looking at some big technology changes for the chain. Can you walk me through what you think are the key digital foundations we need to get right before we invest in new systems?",
"evidenceSignals": [
{
"signalId": "ev-digital-transformation-understanding",
"description": "Demonstrates understanding of what digital transformation means in a business context",
"levels": ["basic_awareness", "applied_understanding", "strategic_insight"],
"weight": 0.25
},
{
"signalId": "ev-operational-trade-offs",
"description": "Identifies trade-offs in technology adoption (cost vs benefit, disruption vs efficiency)",
"levels": ["lists_factors", "analyses_trade_offs", "evaluates_with_evidence"],
"weight": 0.25
},
{
"signalId": "ev-infrastructure-awareness",
"description": "Recognises infrastructure prerequisites (network, integration, training)",
"levels": ["mentions_awareness", "explains_dependencies", "proposes_implementation_plan"],
"weight": 0.25
},
{
"signalId": "ev-customer-impact",
"description": "Considers how technology changes affect the customer experience",
"levels": ["acknowledges_impact", "analyses_customer_journey", "proposes_customer_centric_approach"],
"weight": 0.25
}
],
"maxFollowUps": 3,
"timeBudgetSeconds": 300,
"followUpBank": [
{
"followUpId": "s1-fu1",
"type": "nudge",
"prompt": "That's a good overview. Can you tell me more about why you think those specific foundations matter — what could go wrong if we skip them?",
"triggerCondition": "basic_awareness_level AND missing_trade_offs"
},
{
"followUpId": "s1-fu2",
"type": "probe",
"prompt": "How would you prioritise these? If we could only do one thing first, what would it be and why?",
"triggerCondition": "lists_factors BUT no prioritisation"
},
{
"followUpId": "s1-fu3",
"type": "challenge",
"prompt": "Some staff might resist these changes. How does that factor into your thinking?",
"triggerCondition": "missing_change_management_awareness"
}
],
"transitionConditions": [
{
"id": "s1-sufficient",
"expression": "signal_count >= 3 AND any_signal_level >= 'applied_understanding'"
},
{
"id": "s1-time-exhausted",
"expression": "time_budget_exceeded"
},
{
"id": "s1-followups-exhausted",
"expression": "follow_up_count >= 3"
}
],
"guardrails": {
"forbidden": ["reveal_rubric", "reveal_score", "suggest_answer", "mention_other_segments"],
"personaBreakPatterns": ["As your examiner", "In this assessment", "Let me ask you another question"]
}
},
{
"nodeId": "segment_2_is_roles_bi",
"type": "scenario_segment",
"persona": "You are the General Manager continuing the meeting. Now focusing on information systems roles and how business intelligence can support decision-making.",
"scenario": "After discussing digital foundations, you want to explore how different IS roles (database admin, systems analyst, CIO) contribute to the hotel's success, and how business intelligence dashboards could help manage operations.",
"conversationPrompt": "Good. Now, we're also thinking about building a business intelligence dashboard for our regional managers. Can you explain what roles in an IS team would be involved in making that happen, and what kind of insights the dashboard should provide?",
"evidenceSignals": [
{
"signalId": "ev-is-roles-knowledge",
"description": "Identifies and explains key IS roles (DBA, systems analyst, CIO, etc.)",
"levels": ["names_roles", "explains_responsibilities", "maps_roles_to_outcomes"],
"weight": 0.3
},
{
"signalId": "ev-bi-understanding",
"description": "Demonstrates understanding of business intelligence concepts",
"levels": ["defines_bi", "explains_bi_value", "proposes_bi_use_case"],
"weight": 0.35
},
{
"signalId": "ev-data-driven-decision",
"description": "Connects data/information to business decision-making",
"levels": ["mentions_data", "explains_decision_process", "proposes_metrics_framework"],
"weight": 0.35
}
],
"maxFollowUps": 3,
"timeBudgetSeconds": 300,
"followUpBank": [
{
"followUpId": "s2-fu1",
"type": "nudge",
"prompt": "You've mentioned the roles. How would these people work together day-to-day on the dashboard project?",
"triggerCondition": "names_roles BUT no collaboration_explanation"
},
{
"followUpId": "s2-fu2",
"type": "probe",
"prompt": "What specific metrics would a regional manager find most useful on that dashboard?",
"triggerCondition": "defines_bi BUT no specific_metrics"
},
{
"followUpId": "s2-fu3",
"type": "challenge",
"prompt": "What if the data in the dashboard is wrong or outdated? How does that affect decision-making?",
"triggerCondition": "missing_data_quality_awareness"
}
],
"transitionConditions": [
{
"id": "s2-sufficient",
"expression": "signal_count >= 2 AND has_signal('ev-bi-understanding') AND any_signal_level >= 'explains_bi_value'"
}
]
},
{
"nodeId": "segment_3_data_governance",
"type": "scenario_segment",
"persona": "You are the General Manager. The conversation has turned to the responsibilities that come with collecting and analysing guest data.",
"scenario": "The hotel collects guest preferences, booking patterns, and feedback data. You need to understand the candidate's awareness of data governance, privacy, and responsible analytics practices.",
"conversationPrompt": "One more thing — we collect a lot of guest data for the loyalty programme and the BI dashboard. What should we be thinking about in terms of data governance and responsible use of that data?",
"evidenceSignals": [
{
"signalId": "ev-data-governance-awareness",
"description": "Demonstrates understanding of data governance principles",
"levels": ["mentions_governance", "explains_frameworks", "proposes_governance_policy"],
"weight": 0.3
},
{
"signalId": "ev-privacy-ethics",
"description": "Considers privacy, consent, and ethical use of data",
"levels": ["mentions_privacy", "explains_consent_model", "proposes_ethical_framework"],
"weight": 0.35
},
{
"signalId": "ev-responsible-analytics",
"description": "Understands responsible analytics practices (bias, transparency, accountability)",
"levels": ["mentions_awareness", "explains_risks", "proposes_mitigation_strategy"],
"weight": 0.35
}
],
"maxFollowUps": 3,
"timeBudgetSeconds": 300,
"followUpBank": [
{
"followUpId": "s3-fu1",
"type": "nudge",
"prompt": "That's an important point. Can you tell me more about how we'd actually implement that in practice?",
"triggerCondition": "mentions_privacy BUT no implementation_detail"
},
{
"followUpId": "s3-fu2",
"type": "probe",
"prompt": "What about the ethical side — are there things we could do with the data that we probably shouldn't?",
"triggerCondition": "missing_ethical_consideration"
},
{
"followUpId": "s3-fu3",
"type": "challenge",
"prompt": "If a guest asked to see all the data we hold about them, could we do it? What would that involve?",
"triggerCondition": "missing_data_subject_rights"
}
]
},
{
"nodeId": "segment_4_change_loyalty",
"type": "scenario_segment",
"persona": "You are the General Manager. Final segment covering project change management and the loyalty programme logic.",
"scenario": "The hotel is rolling out a new loyalty programme alongside the technology changes. You want to assess the candidate's understanding of change management, human factors, and how loyalty programme logic works.",
"conversationPrompt": "Last topic. We're launching a new loyalty programme alongside all these tech changes. How would you approach the change management side of this, and can you walk me through how you think the loyalty programme logic should work?",
"evidenceSignals": [
{
"signalId": "ev-change-management",
"description": "Demonstrates understanding of change management principles in IT projects",
"levels": ["mentions_change", "explains_approach", "proposes_change_plan"],
"weight": 0.25
},
{
"signalId": "ev-human-factors",
"description": "Considers human factors (training, resistance, user adoption)",
"levels": ["mentions_people", "analyses_barriers", "proposes_adoption_strategy"],
"weight": 0.25
},
{
"signalId": "ev-loyalty-programme-logic",
"description": "Explains loyalty programme mechanics (points, tiers, rewards, data capture)",
"levels": ["describes_basics", "explains_logic", "proposes_optimisation"],
"weight": 0.25
},
{
"signalId": "ev-integration-thinking",
"description": "Connects loyalty programme to broader digital strategy",
"levels": ["mentions_connection", "explains_integration", "proposes_synergies"],
"weight": 0.25
}
],
"maxFollowUps": 3,
"timeBudgetSeconds": 300,
"followUpBank": [
{
"followUpId": "s4-fu1",
"type": "nudge",
"prompt": "Good thinking. How would you handle staff who are resistant to the new system?",
"triggerCondition": "mentions_change BUT no resistance_handling"
},
{
"followUpId": "s4-fu2",
"type": "probe",
"prompt": "Walk me through how a guest would earn and redeem points in this programme.",
"triggerCondition": "describes_basics BUT no detailed_logic"
},
{
"followUpId": "s4-fu3",
"type": "challenge",
"prompt": "How does the loyalty programme data feed back into the BI dashboard we discussed earlier?",
"triggerCondition": "missing_integration_with_segment_2"
}
]
},
{
"nodeId": "closing",
"type": "closing",
"persona": "You are the General Manager wrapping up the meeting.",
"conversationPrompt": "That's great, thank you for your time today. You've given me a lot to think about. We'll be in touch with feedback soon.",
"timeBudgetSeconds": 30,
"transitions": [{ "target": "end", "condition": "always" }]
},
{
"nodeId": "end",
"type": "end"
}
],
"transitions": [
{ "from": "scaffolding", "to": "segment_1_digital_foundations", "condition": "always" },
{ "from": "segment_1_digital_foundations", "to": "segment_2_is_roles_bi", "condition": "node_complete" },
{ "from": "segment_2_is_roles_bi", "to": "segment_3_data_governance", "condition": "node_complete" },
{ "from": "segment_3_data_governance", "to": "segment_4_change_loyalty", "condition": "node_complete" },
{ "from": "segment_4_change_loyalty", "to": "closing", "condition": "node_complete" },
{ "from": "closing", "to": "end", "condition": "always" }
]
}
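In the specification above, the evidence-signal weights within each segment add up to 1.0. Assuming that is an intended invariant (the specification does not state it explicitly), a compiler could check it before emitting the IR. The function name below is hypothetical.

```python
import math


def signal_weights_valid(node: dict, tol: float = 1e-9) -> bool:
    """Return True when a node's evidence-signal weights sum to 1.0.

    Nodes without evidenceSignals (closing, end) pass trivially.
    """
    signals = node.get("evidenceSignals", [])
    if not signals:
        return True
    total = sum(s["weight"] for s in signals)
    return math.isclose(total, 1.0, abs_tol=tol)


# Weights copied from segment_2_is_roles_bi above.
segment_2 = {
    "nodeId": "segment_2_is_roles_bi",
    "evidenceSignals": [
        {"signalId": "ev-is-roles-knowledge", "weight": 0.3},
        {"signalId": "ev-bi-understanding", "weight": 0.35},
        {"signalId": "ev-data-driven-decision", "weight": 0.35},
    ],
}
closing = {"nodeId": "closing", "type": "closing"}
segment_2_ok = signal_weights_valid(segment_2)
```

The comparison uses a tolerance because sums of decimal fractions such as 0.3 + 0.35 + 0.35 are not guaranteed to equal 1.0 exactly in floating point.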
10.11.4 Compiled Pipecat NodeConfig (Segment 1)
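# Compiled from specification node: segment_1_digital_foundations
from pipecat_flows import ContextStrategy, ContextStrategyConfig

# report_observation, emit_node_entered, and finalize_node are assumed to be
# defined elsewhere in the runtime.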
segment_1_config = {
"name": "segment_1_digital_foundations",
"role_message": (
"You are the General Manager of a mid-range hotel chain in New Zealand. "
"You are meeting with a junior team member to discuss the hotel's digital "
"transformation strategy. Be professional but approachable. Ask follow-up "
"questions naturally, as a manager would in a real meeting."
),
"task_messages": [
{
"role": "developer",
"content": (
"SCENARIO: The hotel chain is considering upgrading its booking system, "
"implementing a mobile check-in app, and introducing IoT sensors for room "
"management. You want to understand the candidate's grasp of digital "
"foundations and operational trade-offs.\n\n"
"OPENING: Thanks for coming in. We're looking at some big technology "
"changes for the chain. Can you walk me through what you think are the "
"key digital foundations we need to get right before we invest in new systems?\n\n"
"EVIDENCE TO LISTEN FOR:\n"
"- digital_transformation_understanding: Demonstrates understanding of what "
"digital transformation means in a business context (levels: basic_awareness, "
"applied_understanding, strategic_insight)\n"
"- operational_trade_offs: Identifies trade-offs in technology adoption "
"(levels: lists_factors, analyses_trade_offs, evaluates_with_evidence)\n"
"- infrastructure_awareness: Recognises infrastructure prerequisites "
"(levels: mentions_awareness, explains_dependencies, proposes_implementation_plan)\n"
"- customer_impact: Considers how technology changes affect the customer "
"experience (levels: acknowledges_impact, analyses_customer_journey, "
"proposes_customer_centric_approach)\n\n"
"TRANSVERSAL SKILLS TO OBSERVE:\n"
"- critical_thinking: Does the candidate analyse information and form reasoned judgements?\n"
"- professional_communication: Does the candidate communicate clearly in a professional context?\n"
"- problem_solving: Does the candidate identify problems and propose practical solutions?\n\n"
"CONSTRAINTS:\n"
"- Maximum 3 follow-up questions\n"
"- Time budget: 300 seconds\n"
"- NEVER: reveal rubric, reveal score, suggest answer, mention other segments\n"
"- If the candidate demonstrates basic_awareness, nudge toward applied_understanding\n\n"
"After every candidate response, call report_observation with your "
"assessment of the response, any evidence signals you detected, "
"and what you want to say next."
)
}
],
"functions": [report_observation], # The ONE function
"pre_actions": [
{"type": "function", "handler": emit_node_entered},
{"type": "tts_say", "text": None} # Opening is in task_messages
],
"post_actions": [
{"type": "function", "handler": finalize_node}
],
"context_strategy": ContextStrategyConfig(strategy=ContextStrategy.RESET),
}
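The "EVIDENCE TO LISTEN FOR" portion of the task message can be derived mechanically from the node's `evidenceSignals`. The sketch below is one possible way to do that, not the actual compiler: both function names are hypothetical, and the ID transformation (drop the `ev-` prefix, hyphens to underscores) is inferred from the difference between the specification IDs and the names used in the compiled prompt.

```python
def prompt_signal_name(signal_id: str) -> str:
    """Map a specification signal ID to the name used in the prompt (inferred rule)."""
    name = signal_id[3:] if signal_id.startswith("ev-") else signal_id
    return name.replace("-", "_")


def evidence_section(signals: list) -> str:
    """Render the evidence block of a task message from evidenceSignals."""
    lines = ["EVIDENCE TO LISTEN FOR:"]
    for s in signals:
        levels = ", ".join(s["levels"])
        lines.append(
            f"- {prompt_signal_name(s['signalId'])}: {s['description']} (levels: {levels})"
        )
    return "\n".join(lines)


signals = [
    {
        "signalId": "ev-customer-impact",
        "description": "Considers how technology changes affect the customer experience",
        "levels": ["acknowledges_impact", "analyses_customer_journey",
                   "proposes_customer_centric_approach"],
    }
]
section = evidence_section(signals)
```

The compiled prompt above shortens some descriptions (for example, the trade-offs signal drops its parenthetical), so a real compiler presumably applies additional editing that this sketch does not attempt.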
10.11.5 Runtime Execution Trace (Segment 1, excerpt)
T+00.000s exam_state → in_progress
T+00.100s node_entered: segment_1_digital_foundations
T+00.200s LLM generates opening via report_observation(spokenText="Thanks for coming in...")
T+00.300s TTS speaks opening
T+15.000s Candidate responds: "I think the main thing is getting the network infrastructure
right, and making sure the booking system can talk to the other systems..."
T+15.100s STT transcript segment → Runtime Controller
T+16.000s LLM calls report_observation({
signals: [
{signalType: "digital_transformation_understanding", rubricLevel: "basic_awareness",
excerpt: "getting the network infrastructure right", confidence: 0.7},
{signalType: "infrastructure_awareness", rubricLevel: "mentions_awareness",
excerpt: "booking system can talk to the other systems", confidence: 0.8}
],
answerQuality: "substantive",
needsFollowUp: true, followUpType: "nudge",
evidenceSufficient: false,
anxietyDetected: false,
spokenText: "That's a good starting point. Can you tell me more about why
those specific foundations matter — what could go wrong if we skip them?"
})
T+16.100s Runtime Controller:
- Validates spokenText → PASS (no rubric leak, in persona, on topic)
- Writes 2 evidence signals to ledger
- Checks guardrails: followUpCount=0 < maxFollowUps=3 → ALLOW follow-up
- followUpCount → 1
- Returns {status: "followup_injected"}
T+16.200s TTS speaks follow-up
T+35.000s Candidate responds: "If we skip the infrastructure, the systems won't integrate
properly. And from the customer side, if the app is slow or crashes during
check-in, that's a terrible first impression..."
T+36.000s LLM calls report_observation({
signals: [
{signalType: "operational_trade_offs", rubricLevel: "analyses_trade_offs",
excerpt: "systems won't integrate properly", confidence: 0.85},
{signalType: "customer_impact", rubricLevel: "analyses_customer_journey",
excerpt: "app is slow or crashes during check-in", confidence: 0.9}
],
answerQuality: "substantive",
needsFollowUp: false,
evidenceSufficient: true,
anxietyDetected: false,
spokenText: "Excellent point about the customer experience. That's exactly
the kind of thinking we need. Let's move on to the next topic."
})
T+36.100s Runtime Controller:
- Validates spokenText → PASS
- Writes 2 more evidence signals (total: 4)
- Checks: evidenceSufficient=true AND signalTypes cover 4/4 targets → TRANSITION
- Builds segment_2 config → flow_manager.set_node_from_config(segment_2_config)
T+36.200s node_exit: segment_1_digital_foundations (reason: evidence_sufficient)
T+36.300s node_entered: segment_2_is_roles_bi
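The Runtime Controller steps in the trace (validate the spoken text, write signals, then decide between follow-up and transition) can be sketched as a single handler. This is a simplified illustration under stated assumptions: `NodeState`, `handle_observation`, and every status other than `followup_injected` are hypothetical, and the validation covers only the `personaBreakPatterns` list, not rubric-leak or on-topic checks.

```python
from dataclasses import dataclass, field

PERSONA_BREAK_PATTERNS = [
    "As your examiner",
    "In this assessment",
    "Let me ask you another question",
]


@dataclass
class NodeState:
    max_follow_ups: int
    follow_up_count: int = 0
    ledger: list = field(default_factory=list)


def handle_observation(obs: dict, state: NodeState) -> dict:
    """Apply one report_observation call to the node state."""
    text = obs["spokenText"].lower()
    if any(p.lower() in text for p in PERSONA_BREAK_PATTERNS):
        return {"status": "rejected", "reason": "persona_break"}
    state.ledger.extend(obs.get("signals", []))
    if obs.get("evidenceSufficient"):
        return {"status": "transition", "reason": "evidence_sufficient"}
    if obs.get("needsFollowUp"):
        if state.follow_up_count < state.max_follow_ups:
            state.follow_up_count += 1
            return {"status": "followup_injected"}
        return {"status": "transition", "reason": "followups_exhausted"}
    return {"status": "continue"}


state = NodeState(max_follow_ups=3)
first = handle_observation({
    "signals": [{"signalType": "digital_transformation_understanding"},
                {"signalType": "infrastructure_awareness"}],
    "needsFollowUp": True, "evidenceSufficient": False,
    "spokenText": "That's a good starting point. Can you tell me more?",
}, state)
second = handle_observation({
    "signals": [{"signalType": "operational_trade_offs"},
                {"signalType": "customer_impact"}],
    "needsFollowUp": False, "evidenceSufficient": True,
    "spokenText": "Excellent point about the customer experience.",
}, state)
```

Run against the two observations from the trace, the first call injects a follow-up and the second triggers the transition with four signals in the ledger.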
10.11.6 Evidence Ledger Output (Segment 1)
{
"nodeId": "segment_1_digital_foundations",
"completionStatus": "completed",
"durationSeconds": 36,
"followUpsUsed": 1,
"signals": [
{
"signalId": "sig-001",
"signalType": "digital_transformation_understanding",
"rubricLevel": "basic_awareness",
"excerpt": "getting the network infrastructure right",
"confidence": 0.7,
"turnId": "turn-002",
"timestamp": "2026-05-06T02:10:15Z",
"transversalSkills": ["critical_thinking"]
},
{
"signalId": "sig-002",
"signalType": "infrastructure_awareness",
"rubricLevel": "mentions_awareness",
"excerpt": "booking system can talk to the other systems",
"confidence": 0.8,
"turnId": "turn-002",
"timestamp": "2026-05-06T02:10:15Z",
"transversalSkills": ["problem_solving"]
},
{
"signalId": "sig-003",
"signalType": "operational_trade_offs",
"rubricLevel": "analyses_trade_offs",
"excerpt": "systems won't integrate properly",
"confidence": 0.85,
"turnId": "turn-004",
"timestamp": "2026-05-06T02:10:35Z",
"transversalSkills": ["critical_thinking"]
},
{
"signalId": "sig-004",
"signalType": "customer_impact",
"rubricLevel": "analyses_customer_journey",
"excerpt": "app is slow or crashes during check-in",
"confidence": 0.9,
"turnId": "turn-004",
"timestamp": "2026-05-06T02:10:35Z",
"transversalSkills": ["critical_thinking", "problem_solving"]
}
]
}
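Because each signal carries a `transversalSkills` list, the ledger can be aggregated per skill. A minimal sketch follows; the function name is hypothetical and the specification does not prescribe how transversal skills are tallied.

```python
from collections import Counter


def tally_transversal_skills(signals: list) -> Counter:
    """Count how many evidence signals were tagged with each transversal skill."""
    counts = Counter()
    for signal in signals:
        counts.update(signal.get("transversalSkills", []))
    return counts


# transversalSkills values copied from the Segment 1 ledger above.
ledger_signals = [
    {"signalId": "sig-001", "transversalSkills": ["critical_thinking"]},
    {"signalId": "sig-002", "transversalSkills": ["problem_solving"]},
    {"signalId": "sig-003", "transversalSkills": ["critical_thinking"]},
    {"signalId": "sig-004", "transversalSkills": ["critical_thinking", "problem_solving"]},
]
skill_counts = tally_transversal_skills(ledger_signals)
```

For Segment 1 this gives three signals tagged critical_thinking and two tagged problem_solving; professional_communication does not appear in this segment's ledger.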
10.11.7 Key Differences from CS301 Example
| Dimension | CS301 (§10.1–10.10) | INFOSYS110 (§10.11) |
|---|---|---|
| Format | 2 questions with follow-ups | 4 scenario segments in one conversation |
| Persona | Supportive, encouraging examiner (no role-play) | Hotel General Manager (role-play) |
| Scenario | No scenario | Full professional scenario with context |
| Rubric levels | Binary (present/absent) | Multi-level (basic → applied → strategic) |
| Transversal skills | Not tracked | Tracked across all segments |
| Scaffolding | Not included | 2-minute practice conversation |
| Follow-up strategy | Probe/redirect | Probe/redirect/scaffold/challenge/nudge |
| Conversation style | Q&A | Free-flowing professional dialogue |
| Equity | Not addressed | communicationStyleIsLearningOutcome: false |
| Context strategy | Not specified | RESET between segments |
| Time budget | Per question | Per segment (5 min each) |
10.11.8 Scaffolding Trace
T-120.000s exam_state → scaffolding
T-120.100s node_entered: scaffolding
T-120.200s LLM: "Let's do a quick practice. Imagine you're telling a friend about
a new app you've been using. What does it do and why do you like it?"
T-100.000s Candidate: "Well, I use this app called Notion for organising my notes..."
T-080.000s LLM: "Great, that's perfect. You're speaking clearly and giving good
detail. In the real assessment, just keep doing what you're doing.
We'll start now."
T-080.100s exam_state → in_progress
T-080.200s node_entered: segment_1_digital_foundations
Note: Scaffolding transcript is NOT included in the MarkingPackage. It exists only for candidate familiarisation and QA purposes.
| raise_hand pauses timer | T+120.400s pause, T+130.400s resume |
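Excluding scaffolding turns from the marking package could be as simple as filtering on the node ID. The sketch assumes each transcript turn records the `nodeId` it was spoken in; the turn structure and function name are hypothetical.

```python
def marking_turns(turns: list, excluded_nodes=frozenset({"scaffolding"})) -> list:
    """Keep only the turns that belong in the marking package."""
    return [t for t in turns if t["nodeId"] not in excluded_nodes]


turns = [
    {"nodeId": "scaffolding", "speaker": "candidate",
     "text": "Well, I use this app called Notion for organising my notes..."},
    {"nodeId": "segment_1_digital_foundations", "speaker": "candidate",
     "text": "I think the main thing is getting the network infrastructure right..."},
]
kept = marking_turns(turns)
```

The unfiltered list would still be retained separately if scaffolding turns are needed for QA.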
10.12 Joughin Dimension Coverage Map
Joughin (1998) identifies six dimensions of oral assessment. The following table maps the worked examples against these dimensions to demonstrate that the specification can represent the full oral assessment design space.
| Dimension | Range | CS301 (§10.1) | INFOSYS110 (§10.11) | Viva Voce (§10.13) | OSCE (§10.14) | ConVOE (§10.15) |
|---|---|---|---|---|---|---|
| 1. Content Type | Knowledge; Applied Problem Solving; Interpersonal; Intrapersonal | Knowledge | Knowledge + Applied | Applied + Knowledge | Interpersonal + Applied | Knowledge |
| 2. Interaction | Presentation ↔ Dialogue | Structured Q&A | Scenario dialogue | Defence dialogue | Station-based dialogue | Presentation (recorded) |
| 3. Authenticity | Contextualised ↔ Decontextualised | Decontextualised | Semi-contextualised | Semi-contextualised | Authentic (clinical) | Decontextualised |
| 4. Structure | Closed ↔ Open | Closed | Moderately closed | Moderately open | Closed (timed stations) | Fully closed |
| 5. Examiners | Self; Peer; Authority | Single authority | Single authority (role-play) | Single authority | Authority + SP | AI examiner (automated) |
| 6. Orality | Purely oral ↔ Secondary | Purely oral | Purely oral | Oral secondary (defends written work) | Oral + physical demo | Purely oral (recorded) |
This coverage map demonstrates that the specification’s node-graph architecture, policy system, and evidence model can represent assessments spanning all six of Joughin’s dimensions.
10.13 Viva Voce Example — Oral Defence of Written Work
Modality: Oral defence of a prior written submission. Joughin dimension coverage: “Orality as secondary”, where the oral component supplements a written artifact. This is the modality Akimov & Malin (2020) implemented: students defended a written bond analysis project orally. The example also exercises the “applied problem solving” content type and the “moderately open” structure.
10.13.1 Scenario
| Dimension | Value |
|---|---|
| Course | RES501 — Research Methods (Postgraduate) |
| Duration | 20 minutes |
| Language | English |
| Assessment type | Viva voce — oral defence of a written research proposal |
| Prior work | Candidate submits a 3,000-word research proposal 1 week before the exam |
| Examiner persona | Academic supervisor — supportive but rigorous |
| Max follow-ups per section | 3 |
| Orality role | Secondary — oral component supplements the written proposal |
10.13.2 Flow Shape
┌──────────┐   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐   ┌──────────┐   ┌─────┐
│ OPENING  │──▶│ METHODOLOGY   │──▶│ LIT REVIEW    │──▶│ FEASIBILITY   │──▶│ CLOSING  │──▶│ END │
│          │   │ DEFENCE       │   │ DEFENCE       │   │ & ETHICS      │   │          │   │     │
└──────────┘   └───────────────┘   └───────────────┘   └───────────────┘   └──────────┘   └─────┘
                 │ ≤3 FU             │ ≤3 FU             │ ≤3 FU
                 │ references        │ references        │ references
                 │ prior_work        │ prior_work        │ prior_work
10.13.3 Key Specification Features
{
"irVersion": "exam-runtime-ir/0.1",
"examId": "res501-viva-2026s1-001",
"metadata": {
"courseCode": "RES501",
"assessmentType": "viva_voce",
"durationMinutes": 20,
"oralityRole": "secondary",
"priorWorkRequired": true,
"assessmentProfile": {
"interactionMode": "structured_dialogue",
"contentTypes": ["applied_problem_solving", "knowledge_understanding"],
"structureLevel": "semi-structured",
"authenticityLevel": "simulated",
"assessmentPurpose": "summative"
}
},
"priorWork": {
"artifactId": "research-proposal-2026",
"type": "written_paper",
"title": "Research Proposal: Impact of AI on Assessment Design",
"submissionDeadline": "2026-05-20T23:59:00Z",
"maxWordCount": 3000,
"availableToExaminer": true
},
"nodes": [
{
"nodeId": "methodology_defence",
"type": "question",
"questionStem": "Your proposal uses a mixed-methods design. Can you walk me through why you chose this approach over a purely quantitative or purely qualitative study?",
"maxFollowUps": 3,
"timeBudgetSeconds": 420,
"evidenceTargets": [
{
"id": "ev-methodology-rationale",
"description": "Articulates a clear rationale for mixed-methods design",
"rubric": "References research questions, discusses complementarity of methods",
"level": "required"
},
{
"id": "ev-methodology-alternatives",
"description": "Demonstrates awareness of alternative methodological approaches",
"rubric": "Names at least one alternative and explains why it was not chosen",
"level": "expected"
},
{
"id": "ev-methodology-limitations",
"description": "Acknowledges limitations of chosen approach",
"rubric": "Identifies at least one limitation and discusses mitigation",
"level": "expected"
}
],
"followUpBank": [
{
"followUpId": "meth-fu1",
"type": "probe",
"prompt": "In your proposal, you mention using thematic analysis for the qualitative data. Can you explain why thematic analysis rather than, say, grounded theory?",
"referencesPriorWork": true
},
{
"followUpId": "meth-fu2",
"type": "challenge",
"prompt": "A reviewer might argue that your sample size of 15 interviews is too small for meaningful qualitative analysis. How would you respond?",
"referencesPriorWork": true
}
],
"guardrails": {
"forbidden": ["reveal_rubric", "suggest_answer"],
"mustReferencePriorWork": true
}
}
// ... additional nodes for lit review defence, feasibility & ethics ...
]
}
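The `mustReferencePriorWork` guardrail could be checked at compile time. The sketch below assumes one plausible reading of the flag, namely that every entry in the node's `followUpBank` must set `referencesPriorWork: true`; the specification excerpt does not define the rule, and the function name is hypothetical.

```python
def prior_work_violations(node: dict) -> list:
    """Return follow-up IDs that break the assumed mustReferencePriorWork rule."""
    if not node.get("guardrails", {}).get("mustReferencePriorWork"):
        return []
    return [
        fu["followUpId"]
        for fu in node.get("followUpBank", [])
        if not fu.get("referencesPriorWork")
    ]


# Reduced copy of the methodology_defence node above.
methodology_defence = {
    "nodeId": "methodology_defence",
    "followUpBank": [
        {"followUpId": "meth-fu1", "referencesPriorWork": True},
        {"followUpId": "meth-fu2", "referencesPriorWork": True},
    ],
    "guardrails": {"mustReferencePriorWork": True},
}
violations = prior_work_violations(methodology_defence)
```

For the node as specified the list is empty, since both follow-ups reference the written proposal.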
10.13.4 Key Differences from CS301/INFOSYS110
| Feature | CS301 / INFOSYS110 | Viva Voce |
|---|---|---|
| Prior work | None | Written proposal submitted before exam |
| Orality role | Purely oral | Oral secondary (defends written work) |
| Follow-up references | Based on candidate’s spoken answer | References specific sections of the written proposal |
| Evidence targets | Assessed from speech alone | Assessed from speech + written work alignment |
| Structure | Closed / moderately closed | Moderately open (examiner probes reasoning behind choices) |
| Content type | Knowledge + applied | Applied problem solving (research design justification) |
10.13.5 Evidence Ledger Differences
The evidence ledger for a viva voce includes an additional priorWorkReference field linking evidence signals to specific sections of the written submission:
{
"evidenceTargetId": "ev-methodology-rationale",
"nodeId": "methodology_defence",
"signal": "covered",
"confidence": 0.88,
"transcriptSpanIds": ["sp-015"],
"priorWorkReference": {
"section": "3.2 Research Design",
"excerpt": "A mixed-methods approach is adopted to triangulate findings...",
"alignment": "candidate's oral explanation consistent with written rationale"
},
"rationale": "Candidate articulated rationale that aligns with §3.2 of their proposal."
}
10.14 OSCE Station Example — Clinical Assessment
Modality: Objective Structured Clinical Examination station. Joughin dimension coverage: “Authenticity” at the authentic pole, “interpersonal competence” content type, “orality as secondary” (oral + physical demonstration). The versioning document (§09) references OSCE packages; this example demonstrates the full worked instantiation.
10.14.1 Scenario
| Dimension | Value |
|---|---|
| Course | MED302 — Clinical Skills (Year 3 Medicine) |
| Duration | 8 minutes per station |
| Language | English |
| Assessment type | OSCE station — patient history-taking + clinical reasoning |
| Examiner persona | Standardised Patient (SP) playing a 45-year-old with chest pain |
| Max follow-ups | 2 (time-constrained station) |
| Orality role | Secondary — oral interaction + physical examination demonstration |
| Professional body | Mapped to AMC (Australian Medical Council) clinical competencies |
10.14.2 Flow Shape
┌───────────┐   ┌───────────────────┐   ┌───────────────────┐   ┌──────────┐   ┌─────┐
│ STATION   │──▶│ HISTORY-TAKING    │──▶│ CLINICAL          │──▶│ CLOSING  │──▶│ END │
│ BRIEFING  │   │ (5 min)           │   │ REASONING (3 min) │   │          │   │     │
└───────────┘   └───────────────────┘   └───────────────────┘   └──────────┘   └─────┘
                  │ ≤2 FU                 │ ≤2 FU
10.14.3 Key Specification Features
{
"irVersion": "exam-runtime-ir/0.1",
"examId": "med302-osce-station2-2026s1-001",
"metadata": {
"courseCode": "MED302",
"assessmentType": "osce_station",
"durationMinutes": 8,
"stationNumber": 2,
"clinicalDomain": "cardiology",
"accreditationMapping": ["AMC-12.1", "AMC-12.3", "AMC-14.2"],
"assessmentProfile": {
"interactionMode": "structured_dialogue",
"contentTypes": ["interpersonal_competence", "applied_problem_solving"],
"structureLevel": "closed",
"authenticityLevel": "authentic",
"assessmentPurpose": "summative"
}
},
"standardisedPatient": {
"persona": "You are a 45-year-old office worker presenting to the emergency department with chest pain that started 2 hours ago. Describe your pain as dull, central, radiating to your left arm. You are anxious but cooperative. Answer questions accurately but do not volunteer information unless asked.",
"trainingLevel": "certified_SP",
"consistencyScript": true
},
"nodes": [
{
"nodeId": "history_taking",
"type": "scenario_segment",
"persona": "[Standardised Patient persona from above]",
"conversationPrompt": "Good morning. Can you tell me what's brought you in today?",
"maxFollowUps": 2,
"timeBudgetSeconds": 300,
"evidenceTargets": [
{
"id": "ev-history-chief-complaint",
"description": "Elicits the chief complaint and characterises the pain (OPQRST)",
"rubric": "Asks about onset, provocation, quality, radiation, severity, timing",
"level": "required",
"competencyMapping": "AMC-12.1"
},
{
"id": "ev-history-risk-factors",
"description": "Assesses cardiovascular risk factors",
"rubric": "Asks about smoking, family history, hypertension, diabetes, cholesterol",
"level": "required",
"competencyMapping": "AMC-12.1"
},
{
"id": "ev-history-differential",
"description": "Considers differential diagnoses during history-taking",
"rubric": "Asks questions that help distinguish cardiac from non-cardiac causes",
"level": "expected",
"competencyMapping": "AMC-12.3"
},
{
"id": "ev-communication-compassion",
"description": "Demonstrates compassionate, patient-centred communication",
"rubric": "Uses open questions, active listening, acknowledges patient concerns, explains next steps",
"level": "required",
"competencyMapping": "AMC-14.2",
"evidenceDimension": "interpersonal_competence"
}
],
"followUpBank": [
{
"followUpId": "hist-fu1",
"type": "probe",
"prompt": "You haven't asked about my family history yet. Is there anything else you'd like to know?",
"triggerCondition": "missing_risk_factors"
}
]
},
{
"nodeId": "clinical_reasoning",
"type": "question",
"questionStem": "Based on the history you've taken, what are your top three differential diagnoses and how would you prioritise your investigations?",
"maxFollowUps": 2,
"timeBudgetSeconds": 180,
"evidenceTargets": [
{
"id": "ev-differential-diagnosis",
"description": "Generates appropriate differential diagnoses",
"rubric": "Includes acute coronary syndrome, considers PE, aortic dissection, musculoskeletal",
"level": "required",
"competencyMapping": "AMC-12.3"
},
{
"id": "ev-investigation-plan",
"description": "Proposes a rational investigation plan",
"rubric": "ECG, troponin, CXR as first-line; considers risk stratification",
"level": "required",
"competencyMapping": "AMC-12.3"
}
]
}
]
}
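The `followUpBank` entry above is gated by a `triggerCondition` and the node caps follow-ups with `maxFollowUps`. The following is a minimal sketch of how a runtime might combine the two. It assumes that trigger conditions are evaluated elsewhere and arrive as a set of active condition names; `select_follow_ups`, `active_conditions`, and `already_asked` are hypothetical names used for illustration, not part of the specification.

```python
from typing import Any


def select_follow_ups(node: dict[str, Any],
                      active_conditions: set[str],
                      already_asked: int = 0) -> list[dict[str, Any]]:
    """Return follow-ups whose trigger is active, capped by the node's remaining budget."""
    remaining = max(node.get("maxFollowUps", 0) - already_asked, 0)
    eligible = [
        fu for fu in node.get("followUpBank", [])
        if fu.get("triggerCondition") in active_conditions
    ]
    # Bank order is kept, so authors control priority by ordering the bank.
    return eligible[:remaining]


# Trimmed copy of the history_taking node from the specification above.
history_taking = {
    "nodeId": "history_taking",
    "maxFollowUps": 2,
    "followUpBank": [
        {
            "followUpId": "hist-fu1",
            "type": "probe",
            "triggerCondition": "missing_risk_factors",
        }
    ],
}

if __name__ == "__main__":
    chosen = select_follow_ups(history_taking, {"missing_risk_factors"})
    print([fu["followUpId"] for fu in chosen])  # ['hist-fu1']
```

Once the candidate has already received two follow-ups on this node, the same call with `already_asked=2` returns an empty list, which is how the cap shown in the flow shape would be enforced under these assumptions.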
10.14.4 Key Differences from Other Examples
| Feature | CS301 / INFOSYS110 | OSCE Station |
|---|---|---|
| Authenticity | Decontextualised / simulated | Authentic (clinical setting) |
| Content type | Knowledge / applied | Interpersonal competence + applied |
| Persona | Examiner / manager | Standardised Patient (trained actor) |
| Time constraint | Flexible (10–20 min) | Strict station time (8 min) |
| Competency mapping | Learning outcomes | Professional body accreditation standards |
| Communication as evidence | Optional transversal skill | Required evidence target (AMC-14.2) |
| Orality | Purely oral | Oral + physical demonstration |
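Two rows of this table, the strict station time and the accreditation mapping, lend themselves to an automated consistency check. The sketch below is one possible check, not a validator defined by the specification: `check_station` is a hypothetical helper, and the rule that every declared code must be covered by at least one evidence target is an assumption made for illustration.

```python
from typing import Any


def check_station(spec: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    problems: list[str] = []
    meta = spec["metadata"]

    # Strict station time: node budgets must not exceed the station duration.
    budget = sum(n.get("timeBudgetSeconds", 0) for n in spec["nodes"])
    limit = meta["durationMinutes"] * 60
    if budget > limit:
        problems.append(f"node budgets total {budget}s, station allows {limit}s")

    # Accreditation mapping (assumed rule): declared codes and the codes cited
    # by evidence targets should match exactly.
    declared = set(meta.get("accreditationMapping", []))
    used = {
        t["competencyMapping"]
        for n in spec["nodes"]
        for t in n.get("evidenceTargets", [])
        if "competencyMapping" in t
    }
    for code in sorted(declared - used):
        problems.append(f"{code} is declared but no evidence target maps to it")
    for code in sorted(used - declared):
        problems.append(f"{code} is used by an evidence target but not declared")
    return problems


# Trimmed copy of the OSCE specification (ids and descriptions omitted).
osce = {
    "metadata": {
        "durationMinutes": 8,
        "accreditationMapping": ["AMC-12.1", "AMC-12.3", "AMC-14.2"],
    },
    "nodes": [
        {"nodeId": "history_taking", "timeBudgetSeconds": 300,
         "evidenceTargets": [{"competencyMapping": "AMC-12.1"},
                             {"competencyMapping": "AMC-12.1"},
                             {"competencyMapping": "AMC-12.3"},
                             {"competencyMapping": "AMC-14.2"}]},
        {"nodeId": "clinical_reasoning", "timeBudgetSeconds": 180,
         "evidenceTargets": [{"competencyMapping": "AMC-12.3"},
                             {"competencyMapping": "AMC-12.3"}]},
    ],
}

print(check_station(osce))  # [] : 300 s + 180 s = 480 s = 8 min, all codes covered
```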
10.15 ConVOE Example — Concurrent Video-Based Oral Exam
Modality: Concurrent Video-Based Oral Exam (ConVOE), as described by Bayley et al. (2024). All students simultaneously record video responses to questions via an LMS. This is the “presentation” pole of Joughin’s interaction dimension: one-way delivery with no real-time dialogue.
Key difference: The specification’s real-time dialogue model is adapted to support a “recorded response” mode where the candidate records answers without live examiner interaction.
10.15.1 Scenario
| Dimension | Value |
|---|---|
| Course | BUS201 — Business Analytics (Year 2, 600+ students) |
| Duration | 20 minutes (4 questions × 5 min max each, no backtracking) |
| Language | English |
| Assessment type | ConVOE — recorded video responses, no live dialogue |
| Interaction mode | Presentation (one-way, no follow-ups) |
| Platform | LMS with video recording integration |
| Cohort size | 620 students, concurrent administration |
| Grading | Parallel evaluation (all students graded on Q1 before Q2) |
10.15.2 Flow Shape
┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌──────────┐   ┌─────┐
│ BRIEFING   │──▶│ Q1         │──▶│ Q2         │──▶│ Q3         │──▶│ Q4         │──▶│ CLOSING  │──▶│ END │
│ + PRACTICE │   │ (5m)       │   │ (5m)       │   │ (5m)       │   │ (5m)       │   │          │   │     │
└────────────┘   └────────────┘   └────────────┘   └────────────┘   └────────────┘   └──────────┘   └─────┘
                     no FU            no FU            no FU            no FU
                  no backtrack     no backtrack     no backtrack     no backtrack
10.15.3 Key Specification Features
{
"irVersion": "exam-runtime-ir/0.1",
"examId": "bus201-convoe-2026s1-001",
"metadata": {
"courseCode": "BUS201",
"assessmentType": "convoe",
"durationMinutes": 20,
"language": "en-CA",
"expectedCandidateCount": 620,
"concurrentAdministration": true,
"assessmentProfile": {
"interactionMode": "presentation",
"contentTypes": ["knowledge_understanding", "applied_problem_solving"],
"structureLevel": "closed",
"authenticityLevel": "decontextualised",
"assessmentPurpose": "summative"
}
},
"administration": {
"mode": "recorded_response",
"backtrackingAllowed": false,
"recordingFormat": "video",
"maxResponseTimeSec": 300,
"thinkingTimeSec": 30,
"practiceQuestionEnabled": true,
"questionRotation": {
"enabled": true,
"poolSizePerSlot": 5,
"antiCollusionWindow": "same_day"
}
},
"cohort": {
"cohortId": "bus201-2026s1-cohort",
"administrationWindow": {
"startAt": "2026-06-01T09:00:00Z",
"endAt": "2026-06-01T11:00:00Z"
},
"concurrent": true,
"gradingStrategy": "parallel_evaluation"
},
"nodes": [
{
"nodeId": "briefing",
"type": "opening",
"prompt": "Welcome to your BUS201 oral assessment. You will answer 4 questions. For each question, you have up to 5 minutes to record your video response. You cannot go back to previous questions. A practice question is available before you begin."
},
{
"nodeId": "q1",
"type": "question",
"questionStem": "Explain the difference between supervised and unsupervised machine learning. Give a business example of each.",
"maxFollowUps": 0,
"timeBudgetSeconds": 300,
"recordingRequired": true,
"evidenceTargets": [
{
"id": "ev-ml-types",
"description": "Distinguishes supervised from unsupervised learning",
"level": "required"
},
{
"id": "ev-ml-examples",
"description": "Provides valid business examples for both types",
"level": "required"
}
],
"questionPool": {
"poolId": "q1-pool",
"variants": [
{ "variantId": "q1-v1", "prompt": "Explain the difference between supervised and unsupervised machine learning. Give a business example of each." },
{ "variantId": "q1-v2", "prompt": "Compare classification and clustering algorithms. When would a business use each approach?" },
{ "variantId": "q1-v3", "prompt": "What is the role of labelled data in machine learning? Provide business scenarios where labelled data is available versus unavailable." }
],
"drawCount": 1
}
}
// ... additional question nodes with pools ...
]
}
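The specification above declares a `questionPool` with a `drawCount` but does not say how variants are drawn. One possible approach, sketched here under stated assumptions, is a reproducible per-candidate draw seeded from the exam, pool, and candidate identifiers. The function `draw_variants`, the seeding scheme, and the candidate ids are hypothetical and shown only for illustration.

```python
import hashlib
import random
from typing import Any


def draw_variants(pool: dict[str, Any], candidate_id: str, exam_id: str) -> list[dict[str, Any]]:
    """Pick `drawCount` variants for one candidate, reproducibly."""
    # Seed from exam, pool, and candidate so a re-run yields the same draw
    # (useful for audits) while different candidates get independent draws.
    seed_material = f"{exam_id}:{pool['poolId']}:{candidate_id}".encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(seed_material).digest()[:8], "big")
    rng = random.Random(seed)
    return rng.sample(pool["variants"], k=pool.get("drawCount", 1))


# Trimmed copy of the q1 pool from the specification above (prompts omitted).
q1_pool = {
    "poolId": "q1-pool",
    "variants": [
        {"variantId": "q1-v1"},
        {"variantId": "q1-v2"},
        {"variantId": "q1-v3"},
    ],
    "drawCount": 1,
}

if __name__ == "__main__":
    for candidate in ("cand-0001", "cand-0002", "cand-0003"):  # hypothetical ids
        picked = draw_variants(q1_pool, candidate, "bus201-convoe-2026s1-001")
        print(candidate, [v["variantId"] for v in picked])
```

A seeded draw is a design choice rather than a requirement: it lets an auditor reconstruct which variant a candidate saw without storing the draw, at the cost of making the assignment predictable to anyone who knows the seeding scheme.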
10.15.4 Key Differences from Dialogue-Based Examples
| Feature | CS301 / INFOSYS110 (Dialogue) | ConVOE (Presentation) |
|---|---|---|
| Interaction mode | Dialogue (multi-turn) | Presentation (one-way recording) |
| Follow-ups | Allowed (1–3 per node) | None (maxFollowUps: 0) |
| Candidate commands | repeat, clarification, raise_hand | None (no live examiner) |
| Scalability | 1 candidate per session | 620 candidates concurrent |
| Question pools | Fixed questions | Randomised from pool (anti-collusion) |
| Grading | Sequential per session | Parallel evaluation (by question) |
| Backtracking | Not applicable | Explicitly forbidden |
| Practice session | Optional scaffolding | Built-in practice question |
| Reliability concern | Inter-case (follow-up variance) | Inter-case (question difficulty equivalence) |
| Academic integrity | Conversation fingerprint | Question rotation + time limit + video recording |
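The follow-up and backtracking rows of this table can be read as constraints on any specification whose interaction mode is `presentation`. The sketch below shows how such constraints might be checked. `check_presentation_mode` is a hypothetical helper, and treating these two rows as hard validation rules is an assumption, not something the specification mandates.

```python
from typing import Any


def check_presentation_mode(spec: dict[str, Any]) -> list[str]:
    """List violations of the one-way 'presentation' constraints, if any."""
    profile = spec["metadata"]["assessmentProfile"]
    if profile.get("interactionMode") != "presentation":
        return []  # dialogue-based specs are out of scope for this check

    problems: list[str] = []
    # Backtracking is explicitly forbidden in the recorded-response mode.
    if spec.get("administration", {}).get("backtrackingAllowed", False):
        problems.append("backtrackingAllowed must be false")
    # With no live examiner, question nodes cannot carry follow-ups.
    for node in spec.get("nodes", []):
        if node.get("type") == "question" and node.get("maxFollowUps", 0) != 0:
            problems.append(f"{node['nodeId']}: maxFollowUps must be 0")
    return problems


# Trimmed copy of the ConVOE specification.
convoe = {
    "metadata": {"assessmentProfile": {"interactionMode": "presentation"}},
    "administration": {"mode": "recorded_response", "backtrackingAllowed": False},
    "nodes": [
        {"nodeId": "briefing", "type": "opening"},
        {"nodeId": "q1", "type": "question", "maxFollowUps": 0},
    ],
}

print(check_presentation_mode(convoe))  # []
```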
10.15.5 Scalability Considerations
The ConVOE format exercises the specification’s scalability features:
- Question pools (`questionPool`): Each question slot draws from a pool of equivalent variants, mitigating question-sharing (Bayley et al., 2024, p. 165: “students posted ConVOE questions to an online group chat”).
- Cohort management: The `cohort` entity groups 620 concurrent sessions and enables batch grading.
- Parallel evaluation: Setting `gradingStrategy: "parallel_evaluation"` ensures graders assess all candidates on Q1 before moving to Q2, maintaining consistency (Bayley et al., 2024, p. 163).
- No follow-ups: The presentation interaction mode removes dialogue variance as a source of inter-case unreliability; equivalence of difficulty across question variants remains the reliability concern for this format.
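The parallel evaluation strategy described above amounts to ordering the grading queue by question before candidate. A minimal sketch of that ordering follows. The `Response` type, `parallel_evaluation_order`, and the candidate ids are hypothetical, and the specification does not prescribe how a grading queue is built.

```python
from typing import Iterator, NamedTuple


class Response(NamedTuple):
    candidate_id: str
    node_id: str


def parallel_evaluation_order(responses: list[Response],
                              question_order: list[str]) -> Iterator[Response]:
    """Yield responses question by question: all of Q1, then all of Q2, and so on."""
    rank = {node_id: i for i, node_id in enumerate(question_order)}
    # Sort by question first, then candidate, so each grader pass covers one question.
    yield from sorted(responses, key=lambda r: (rank[r.node_id], r.candidate_id))


responses = [
    Response("cand-0002", "q2"),
    Response("cand-0001", "q1"),
    Response("cand-0002", "q1"),
    Response("cand-0001", "q2"),
]

for r in parallel_evaluation_order(responses, ["q1", "q2"]):
    print(r.node_id, r.candidate_id)
# q1 cand-0001
# q1 cand-0002
# q2 cand-0001
# q2 cand-0002
```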
Revision History
| Version | Date | Changes |
|---|---|---|
| v0.2.0 | 2026-06-30 | Updated examples to reflect IOA-ORM terminology and new schema fields. |
| v0.1.0 | 2026-05-06 | Initial release. |