Model Lens

Observability-first platform for local AI

pip install -r requirements.txt
python apps/cli/modellens.py run --quick

Why Model Lens?

Most local AI tooling focuses on a single layer — benchmarking, chat, model serving, or agents. Very few tools help you understand what a model is actually doing on your machine.

Model Lens gives you the full picture: execution traces, latency profiles, memory footprints, and real-world workload evaluations.

🔬 Trace & Replay

Capture token-level execution timelines. Replay with play, pause, step, and speed controls.

📊 Compare Models

Side-by-side latency, memory, and output diffs. Understand why one model outperforms another.

🏗️ Real Workloads

Test models on actual codebases — React, NestJS, Python, Rust — not just synthetic benchmarks.

🔌 6 Providers

LM Studio, Ollama, Open WebUI, Jan, llama.cpp, vLLM — all through OpenAI-compatible /v1 endpoints.

📡 Live Streaming

SSE bridge streams real-time events to the dashboard. Every token, every metric, live.

🛡️ CI-Enforced Quality

Ruff lint, ruff format, mypy type checking, and strict pytest on every push.

Architecture at a glance

Provider calls → Event Bus (thread-safe)
                   ├── SSE bridge → Dashboard (real-time)
                   ├── Replay writer → Disk (historical)
                   └── Trace capture → Metrics engine

Every streaming token, every completion, every metric, and every benchmark lifecycle event flows through the central event bus. Consumers — the dashboard, replay engine, and metrics system — subscribe to what they need.

Model Lens

Why Model Lens?

🔬 Trace & Replay

📊 Compare Models

🏗️ Real Workloads

🔌 6 Providers

📡 Live Streaming

🛡️ CI-Enforced Quality

Architecture at a glance

Quick navigation

🚀 Quick Start

🏛️ Architecture

🔌 Providers

📏 Benchmarks

⌨️ CLI Reference

🤝 Contributing