Event Schema

Canonical contract for the Model Lens event bus. Every event emitted in the system conforms to one of these types.

Provider calls → EventBus.emit_sync() → Consumers
├── SSE bridge (dashboard)
├── Replay writer (disk)
├── Trace capture
└── Future: alerts, MCP, metrics engine

The event bus is thread-safe: a threading.Lock guards subscription and other mutations, and handler lists are snapshotted under the lock. Events are delivered synchronously via emit_sync().
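The locking pattern described above can be sketched as follows. This is a minimal illustration, not the Model Lens implementation: the class name `SketchEventBus` and its internals are hypothetical, and calling handlers outside the lock is an assumption about how the snapshot is used.

```python
import threading
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Type

Handler = Callable[[Any], None]


class SketchEventBus:
    """Illustrative stand-in for an event bus; not the real EventBus."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handlers: DefaultDict[Type, List[Handler]] = defaultdict(list)
        self._wildcard: List[Handler] = []

    def subscribe(self, event_type: Type, handler: Handler) -> None:
        with self._lock:  # every mutation happens under the lock
            self._handlers[event_type].append(handler)

    def subscribe_all(self, handler: Handler) -> None:
        with self._lock:
            self._wildcard.append(handler)

    def emit_sync(self, event: Any) -> None:
        # Snapshot the handler lists under the lock, then release it.
        # Assumption: handlers run outside the lock, so a handler that
        # subscribes during delivery cannot deadlock on a non-reentrant Lock.
        with self._lock:
            typed = list(self._handlers.get(type(event), ()))
            handlers = typed + list(self._wildcard)
        for handler in handlers:
            handler(event)  # synchronous: returns after all handlers ran


bus = SketchEventBus()
seen = []
bus.subscribe(str, seen.append)
bus.subscribe_all(lambda e: seen.append(("any", e)))
bus.emit_sync("hello")
print(seen)  # ['hello', ('any', 'hello')]
```

Because delivery is synchronous, a slow handler delays the emitting thread; consumers that do heavy work would likely want to hand events off to their own queue.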


TokenGeneratedEvent

Emitted for every streaming token from a provider response.

| Field | Type | Required | Description |
|---|---|---|---|
| model | str | | Model identifier (e.g. "qwen3.5-9b") |
| token | str | | Raw token text |
| index | int | | 0-based token index within this completion |
| timing_ms | float | | Milliseconds since start of completion |
| provider | str | | Provider name |
| run_id | str | | Correlated benchmark run ID |
| source | str | | Originating component |
| id | str | | Auto-generated event ID |

Source: OpenAICompatibleProvider.chat_completion() streaming path.
Consumers: TraceCapture, EventBusSSEServer, EventBusReplayWriter.
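To make the field list concrete, here is a rough dataclass sketch of a token event and one thing a consumer might compute from it. The class name `TokenGeneratedEventSketch`, the default values, the UUID-based `id`, and the helper `inter_token_gaps` are all assumptions for illustration; the real definitions live in the events package and may differ.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class TokenGeneratedEventSketch:
    model: str
    token: str
    index: int
    timing_ms: float
    provider: str = ""
    run_id: str = ""
    source: str = ""
    # Assumption: the auto-generated ID is a random UUID.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


def inter_token_gaps(events: List[TokenGeneratedEventSketch]) -> List[float]:
    """Milliseconds between consecutive tokens, ordered by token index."""
    ordered = sorted(events, key=lambda e: e.index)
    return [b.timing_ms - a.timing_ms for a, b in zip(ordered, ordered[1:])]


events = [
    TokenGeneratedEventSketch("qwen3.5-9b", "Hello", 0, 120.0),
    TokenGeneratedEventSketch("qwen3.5-9b", ",", 1, 150.0),
    TokenGeneratedEventSketch("qwen3.5-9b", " world", 2, 185.0),
]
gaps = inter_token_gaps(events)
print(gaps)  # [30.0, 35.0]
```

Since timing_ms is measured from the start of the completion, the value on the token with index 0 approximates time to first token, and differences between neighbours give per-token latency.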

CompletionEvent

Emitted after a full provider response (success or failure).

| Field | Type | Required | Description |
|---|---|---|---|
| model | str | | Model identifier |
| response | str | | Full response text (empty on failure) |
| tokens_used | int | | Total tokens (prompt + completion) |
| latency_ms | float | | Total generation time |
| ttft_ms | float | | Time to first token |
| tokens_per_second | float | | Generation throughput |
| success | bool | | Whether completion succeeded |
| error | str | | Error message on failure |
| run_id | str | | Correlated benchmark run ID |

Source: OpenAICompatibleProvider.chat_completion() after success/failure.
Consumers: ResultsCollector, MetricsEngine, EventBusReplayWriter.
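A consumer of completion events typically needs to separate successes from failures before aggregating. The sketch below shows one way to do that; `CompletionEventSketch` and `summarize` are hypothetical names, and the defaults on `error` and `run_id` are assumptions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class CompletionEventSketch:
    model: str
    response: str
    tokens_used: int
    latency_ms: float
    ttft_ms: float
    tokens_per_second: float
    success: bool
    error: str = ""
    run_id: str = ""


def summarize(events: List[CompletionEventSketch]) -> dict:
    """Aggregate latency over successful completions only."""
    ok = [e for e in events if e.success]
    return {
        "count": len(events),
        "success_rate": len(ok) / len(events) if events else 0.0,
        "mean_latency_ms": mean(e.latency_ms for e in ok) if ok else None,
        "errors": [e.error for e in events if not e.success],
    }


events = [
    CompletionEventSketch("qwen3.5-9b", "ok", 120, 800.0, 90.0, 45.0, True),
    CompletionEventSketch("qwen3.5-9b", "ok", 200, 1200.0, 110.0, 50.0, True),
    CompletionEventSketch("qwen3.5-9b", "", 0, 30.0, 0.0, 0.0, False, error="timeout"),
]
summary = summarize(events)
print(summary["mean_latency_ms"])  # 1000.0
```

Excluding failed completions from the latency mean keeps fast failures (such as an immediate timeout or connection error) from making a model look quicker than it is.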

MetricEvent

Emitted for any numeric measurement.

| Field | Type | Required | Description |
|---|---|---|---|
| name | str | | Metric name (e.g. "general.mmlu_pro.overall_accuracy") |
| value | float | | Numeric value |
| unit | str | | Unit ("ms", "tokens/s") |
| tags | Dict[str, str] | | Key-value metadata |
| model | str | | Model identifier |
| run_id | str | | Correlated benchmark run ID |

Source: BenchmarkSuite.run_benchmark(), AppleSiliconBenchmarkV2.
Consumers: Dashboard, EventBusReplayWriter, ResultsCollector.
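As an illustration of how a dashboard-style consumer might use these fields, the sketch below keeps the most recent value per (model, metric name) pair. `MetricEventSketch` and `latest_by_model_and_name` are hypothetical, and the "latest wins" policy is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple


@dataclass
class MetricEventSketch:
    name: str
    value: float
    unit: str = ""
    tags: Dict[str, str] = field(default_factory=dict)
    model: str = ""
    run_id: str = ""


def latest_by_model_and_name(
    events: Iterable[MetricEventSketch],
) -> Dict[Tuple[str, str], float]:
    """Return the last value seen for each (model, metric name) pair."""
    latest: Dict[Tuple[str, str], float] = {}
    for event in events:  # later events overwrite earlier ones
        latest[(event.model, event.name)] = event.value
    return latest


events = [
    MetricEventSketch("latency.p50", 410.0, unit="ms", model="qwen3.5-9b"),
    MetricEventSketch("general.mmlu_pro.overall_accuracy", 0.5, model="qwen3.5-9b"),
    MetricEventSketch("latency.p50", 395.0, unit="ms", model="qwen3.5-9b"),
]
latest = latest_by_model_and_name(events)
print(latest[("qwen3.5-9b", "latency.p50")])  # 395.0
```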

RunLifecycleEvent

Emitted at run start, completion, or failure.

| Field | Type | Required | Description |
|---|---|---|---|
| status | str | | "started", "completed", or "failed" |
| model | str | | Model identifier |
| provider | str | | Provider name |
| workload | str | | Workload identifier |
| run_id | str | | Run identifier (correlates all events) |
| duration_ms | float | | Execution duration (on "completed" or "failed") |
| error | str | | Error message (on "failed") |

Source: BenchmarkSuite.run_all(), AppleSiliconBenchmarkV2.
Consumers: EventBusReplayWriter, Dashboard, ResultsCollector.
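Because every run begins with a "started" event and ends with "completed" or "failed", a consumer can detect runs that never finished. The following is a small sketch of that idea; `RunLifecycleEventSketch` and `unfinished_runs` are hypothetical names, and the field defaults are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Set

VALID_STATUSES = {"started", "completed", "failed"}


@dataclass
class RunLifecycleEventSketch:
    status: str
    run_id: str
    model: str = ""
    provider: str = ""
    workload: str = ""
    duration_ms: float = 0.0
    error: str = ""


def unfinished_runs(events: Iterable[RunLifecycleEventSketch]) -> Set[str]:
    """Return run IDs that were started but never completed or failed."""
    open_runs: Set[str] = set()
    for event in events:
        if event.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {event.status!r}")
        if event.status == "started":
            open_runs.add(event.run_id)
        else:  # "completed" or "failed" closes the run
            open_runs.discard(event.run_id)
    return open_runs


events = [
    RunLifecycleEventSketch("started", "run-1"),
    RunLifecycleEventSketch("started", "run-2"),
    RunLifecycleEventSketch("completed", "run-1", duration_ms=5321.0),
]
print(unfinished_runs(events))  # {'run-2'}
```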

Error events

Emitted for unexpected errors anywhere in the system.

| Field | Type | Required | Description |
|---|---|---|---|
| message | str | | Human-readable error description |
| exception | str | | Exception type name |
| stack_trace | str | | Full stack trace |
| component | str | | Failing component name |
| severity | str | | "debug", "info", "warning", "error", "critical" |
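The fields above map naturally onto a caught Python exception. The sketch below shows one way to populate them with the standard `traceback` module; the class name `ErrorEventSketch`, the helper `error_event_from_exception`, and the default severity of "error" are assumptions, since this section does not name the event class.

```python
import traceback
from dataclasses import dataclass

SEVERITIES = ("debug", "info", "warning", "error", "critical")


@dataclass
class ErrorEventSketch:
    message: str
    exception: str
    stack_trace: str
    component: str
    severity: str = "error"


def error_event_from_exception(
    exc: BaseException, component: str, severity: str = "error"
) -> ErrorEventSketch:
    """Build an error event from a caught exception."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    trace = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return ErrorEventSketch(
        message=str(exc),
        exception=type(exc).__name__,  # type name only, e.g. "ValueError"
        stack_trace="".join(trace),
        component=component,
        severity=severity,
    )


try:
    int("not a number")
except ValueError as exc:
    event = error_event_from_exception(exc, component="provider")

print(event.exception)  # ValueError
```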

ToolCallEvent

Emitted when a skill/tool is invoked during agentic evaluation.

| Field | Type | Required | Description |
|---|---|---|---|
| tool_name | str | | Name of the invoked tool/skill |
| input_args | Dict[str, Any] | | Arguments passed to the tool |
| model | str | | Model identifier |
| run_id | str | | Correlated benchmark run ID |


Event order within a single run:

    RunLifecycleEvent(status="started")            ← benchmark begins
    ├── TokenGeneratedEvent × N                    ← streaming tokens
    ├── ToolCallEvent × M                          ← tool/skill invocations
    ├── CompletionEvent                            ← response complete
    ├── MetricEvent × K                            ← scores and measurements
    └── RunLifecycleEvent(status="completed")      ← benchmark ends
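Since run_id correlates all events of a run, a recorded stream can be regrouped per run and checked against the ordering shown above. This is only a sketch under assumptions: `Ev` is a hypothetical stand-in that carries the event class name as a string, and `is_well_formed` checks just the first and last event of each run.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Ev:
    kind: str         # stand-in for the event class name
    run_id: str
    status: str = ""  # only set for lifecycle events


def sequences_by_run(events: Iterable[Ev]) -> Dict[str, List[str]]:
    """Group event labels by run_id, preserving arrival order."""
    sequences: Dict[str, List[str]] = defaultdict(list)
    for event in events:
        label = f"{event.kind}({event.status})" if event.status else event.kind
        sequences[event.run_id].append(label)
    return dict(sequences)


def is_well_formed(sequence: List[str]) -> bool:
    """A run must open with 'started' and close with 'completed' or 'failed'."""
    closing = {"RunLifecycleEvent(completed)", "RunLifecycleEvent(failed)"}
    return (
        len(sequence) >= 2
        and sequence[0] == "RunLifecycleEvent(started)"
        and sequence[-1] in closing
    )


stream = [
    Ev("RunLifecycleEvent", "run-1", "started"),
    Ev("TokenGeneratedEvent", "run-1"),
    Ev("RunLifecycleEvent", "run-2", "started"),
    Ev("CompletionEvent", "run-1"),
    Ev("MetricEvent", "run-1"),
    Ev("RunLifecycleEvent", "run-1", "completed"),
]
runs = sequences_by_run(stream)
print({run_id: is_well_formed(seq) for run_id, seq in runs.items()})
# {'run-1': True, 'run-2': False}
```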

Subscribing to events:

    from events import EventBus, TokenGeneratedEvent

    bus = EventBus()

    # Typed subscription: the handler is registered for one event type.
    def on_token(event: TokenGeneratedEvent):
        print(f"{event.model} → '{event.token}' at {event.timing_ms:.1f}ms")

    bus.subscribe(TokenGeneratedEvent, on_token)

    # Wildcard subscription: the handler is registered for all events.
    def on_any(event):
        print(f"[{type(event).__name__}] from {event.source}")

    bus.subscribe_all(on_any)
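A wildcard handler is a natural place to serialize events. The sketch below writes each event as one JSON line to any text stream; it is not the format used by the replay writer, which this page does not specify. `make_jsonl_handler` and `DemoEvent` are hypothetical names, and the handler is called directly here rather than through a bus so the example stays self-contained.

```python
import io
import json
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class DemoEvent:
    model: str
    token: str
    source: str = "demo"


def make_jsonl_handler(stream):
    """Return a handler that appends one JSON object per event to `stream`."""

    def handler(event) -> None:
        # Assumption: events are dataclasses or plain objects with attributes.
        payload = asdict(event) if is_dataclass(event) else dict(vars(event))
        record = {"type": type(event).__name__, "data": payload}
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")

    return handler


buffer = io.StringIO()
write_event = make_jsonl_handler(buffer)
write_event(DemoEvent(model="qwen3.5-9b", token="Hi"))
print(buffer.getvalue(), end="")
```

With a real bus, the returned handler would be registered through subscribe_all so that every event type passes through it.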

| Implementation | File |
|---|---|
| Event type definitions | packages/events/__init__.py |
| SSE server | packages/events/sse.py |
| Replay writer | packages/events/replay.py |
| Provider event emission | packages/providers/openai_compatible.py |
| Suite lifecycle emission | packages/core/benchmark.py |