Provider Contract
Canonical contract for provider adapters. Every provider in Model Lens implements this interface.
Interface: ProviderAdapter
Section titled “Interface: ProviderAdapter”Defined in packages/providers/base.py.
Class attributes
Section titled “Class attributes”| Attribute | Type | Required | Description |
|---|---|---|---|
name | str | ✓ | Canonical provider name (e.g. "ollama", "vllm") |
default_port | int | ✓ | Default port for the provider’s API |
Methods
Section titled “Methods”health_check() -> bool
Section titled “health_check() -> bool”Check if the provider is reachable and healthy.
- Must not raise exceptions — all failures return
False - Should time out quickly (≤5 seconds)
- Catch
requests.ConnectionErrorandrequests.Timeoutspecifically
list_models() -> List[Model]
Section titled “list_models() -> List[Model]”Return all models available on this provider.
- Returns empty list on failure (never raises)
- Model metadata:
id,name,provider,parameters,quantization,size_bytes
run_prompt(request: RunRequest) -> RunResult
Section titled “run_prompt(request: RunRequest) -> RunResult”Run a single prompt and return structured results.
chat_completion(messages, temperature, max_tokens, top_p, stream) -> Tuple[str, APICallMetrics]
Section titled “chat_completion(messages, temperature, max_tokens, top_p, stream) -> Tuple[str, APICallMetrics]”OpenAI-compatible chat completion. The core execution method.
collect_metrics() -> ProviderMetrics
Section titled “collect_metrics() -> ProviderMetrics”Collect hardware/performance metrics from the provider process.
- Has a default implementation in the ABC (psutil-based)
- Override for provider-specific metrics (GPU, etc.)
Data types
Section titled “Data types”| Field | Type | Description |
|---|---|---|
id | str | Unique model identifier |
name | str | Human-readable name |
provider | str | Provider serving this model |
parameters | str | Parameter count / tag (e.g. "7B", "latest") |
quantization | str | Quantization level (e.g. "Q4_K_M") |
size_bytes | int | Model file size in bytes |
RunRequest
Section titled “RunRequest”| Field | Type | Default | Description |
|---|---|---|---|
prompt | str | (required) | The user prompt |
model | str | (required) | Model to use |
temperature | float | 0.0 | Sampling temperature |
max_tokens | int | 4096 | Max completion tokens |
top_p | float | 1.0 | Nucleus sampling |
system_prompt | str | None | System prompt |
stream | bool | False | Enable streaming |
RunResult
Section titled “RunResult”| Field | Type | Description |
|---|---|---|
response | str | Full response text |
model | str | Model used |
provider | str | Provider used |
ttft_ms | float | Time to first token (ms) |
total_time_ms | float | Total execution time (ms) |
tokens_per_second | float | Generation throughput |
prompt_tokens | int | Prompt token count |
completion_tokens | int | Completion token count |
total_tokens | int | Total tokens used |
APICallMetrics
Section titled “APICallMetrics”| Field | Type | Description |
|---|---|---|
ttft | float | Time to first token (seconds) |
total_time | float | Total generation time (seconds) |
tokens_per_second | float | Throughput |
total_tokens | int | Total tokens |
prompt_tokens | int | Prompt tokens |
completion_tokens | int | Completion tokens |
ProviderMetrics
Section titled “ProviderMetrics”| Field | Type | Description |
|---|---|---|
cpu_percent | float | CPU usage percentage |
ram_used_mb | float | RAM used (MB) |
ram_total_mb | float | Total system RAM (MB) |
gpu_available | bool | Whether GPU is detected |
gpu_used_mb | float | GPU memory used (MB) |
swap_used_mb | float | Swap used (MB) |
Implementation checklist
Section titled “Implementation checklist”To add a new provider:
- Create
packages/providers/<your_provider>.py - Implement
ProviderAdapter:from .base import ProviderAdapter, Model, RunRequest, RunResult, APICallMetricsclass YourProvider(ProviderAdapter):name = "your-provider"default_port = 8080def health_check(self) -> bool: ...def list_models(self) -> List[Model]: ...def run_prompt(self, request: RunRequest) -> RunResult: ...def chat_completion(self, messages, ...) -> tuple[str, APICallMetrics]: ... - Register in
packages/providers/__init__.py - Add provider entry in
apps/cli/commands/utils.py(PROVIDER_CONFIG) - Add
--providerchoice inapps/cli/commands/run.py - Add auto-detection probe in
_resolve_provider() - Add tests in
tests/test_provider_clients.py
URL utility contract
Section titled “URL utility contract”All providers MUST use these helpers from packages/providers/base.py:
| Function | Purpose | Example |
|---|---|---|
normalize_base_url(url) | Strip trailing / for storage | "http://host:8000/v1/" → "http://host:8000/v1" |
get_root_url(url) | Extract scheme://netloc | "http://host:8000/v1" → "http://host:8000" |
url_join(base, path) | Safe URL path joining | url_join("http://host:8000/", "health") → "http://host:8000/health" |
Banned patterns: .rstrip("/"), .removesuffix("/v1"), f"{base}/{path}" string concatenation.
Event bus integration
Section titled “Event bus integration”Providers that extend OpenAICompatibleProvider automatically emit:
TokenGeneratedEventper streaming tokenCompletionEventafter success/failureErrorEventon failure
Providers must accept event_bus and event_source in their constructor (via OpenAICompatibleProvider.__init__).
Related
Section titled “Related”- Event Schema — event bus contract
- Architecture — Providers — how providers fit in
- Providers Guide — setup guides for each provider