Observability API Reference

The llm_rosetta.observability package provides reusable, framework-agnostic building blocks for metrics collection, request logging, SQLite persistence, and on-demand profiling. Any HTTP proxy built on top of llm-rosetta can import from this package directly, without depending on the gateway's config system or HTTP server.

from llm_rosetta.observability import (
    MetricsCollector,
    PersistenceManager,
    ProfilerState,
    RequestLog,
    RequestLogEntry,
)

MetricsCollector

Lightweight in-process metrics. All data structures are plain Python objects with no framework dependencies. The collector is designed for a single-threaded asyncio event loop and takes no locks.

metrics = MetricsCollector()
metrics.record_request(
    model="gpt-4o",
    source="openai_chat",
    target="anthropic",
    status_code=200,
    duration_ms=150.0,
    is_stream=False,
    provider_name="My Anthropic",
)
snapshot = metrics.snapshot(series_seconds=60)

Key methods

| Method | Description |
| --- | --- |
| record_request(...) | Record a completed proxy request |
| snapshot(series_seconds) | Return a JSON-serializable metrics snapshot |
| export_counters() | Return counters for persistence (no time-series) |
| load_counters(data) | Restore counters from exported dict |
| rebuild_counters(rows) | Rebuild all counters from request log rows |
| provider_health_snapshot() | Per-provider health status |
| any_critical_provider() | True if any provider is critically unhealthy |
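
The "no locks required" design note can be illustrated with a small, self-contained sketch. It does not use MetricsCollector itself: `request_counts` and `handle_request` are hypothetical stand-ins for a collector's counters and a proxy handler. The point is only that, on a single-threaded event loop, an update with no `await` in the middle cannot be interleaved with another coroutine.

```python
import asyncio
from collections import Counter

# Hypothetical stand-in for a collector's internal counters.
request_counts: Counter = Counter()


async def handle_request(model: str) -> None:
    # Yield to the event loop once, as real network I/O would.
    await asyncio.sleep(0)
    # There is no await between the read and the write below, so no other
    # coroutine can run in between on a single-threaded event loop.
    request_counts[model] += 1


async def main() -> None:
    # Run many handlers concurrently on the same loop.
    await asyncio.gather(*(handle_request("gpt-4o") for _ in range(1000)))


asyncio.run(main())
print(request_counts["gpt-4o"])  # 1000
```

The same reasoning does not carry over to multiple OS threads, where an unlocked read-modify-write can lose updates.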

RequestLogEntry

Frozen dataclass representing a single logged proxy request.

entry = RequestLogEntry.create(
    model="gpt-4o",
    source_provider="openai_chat",
    target_provider="anthropic",
    is_stream=False,
    status_code=200,
    duration_ms=123.4,
    target_provider_name="My Anthropic",
)

Fields

| Field | Type | Description |
| --- | --- | --- |
| id | str | Auto-generated UUID hex |
| timestamp | str | ISO 8601 timestamp |
| model | str | Model name |
| source_provider | str | Source API format |
| target_provider | str | Target API format |
| is_stream | bool | Whether streaming was used |
| status_code | int | HTTP status code |
| duration_ms | float | Request duration in milliseconds |
| error_detail | str \| None | Error message (if any) |
| api_key_label | str \| None | API key label |
| target_provider_name | str \| None | Provider display name |
| client_ip | str \| None | Client IP address |
| profile | dict \| None | Profiling data |
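
As a rough illustration of the pattern described here (a frozen dataclass whose `create` factory fills in `id` and `timestamp`), the sketch below defines a hypothetical `LogEntrySketch` with a subset of the documented fields. It is not the library's RequestLogEntry; in particular, the use of `uuid4` and a UTC timestamp is an assumption about how such fields might be generated.

```python
import uuid
from dataclasses import FrozenInstanceError, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogEntrySketch:
    """Illustrative stand-in; not the library's RequestLogEntry."""

    id: str
    timestamp: str
    model: str
    status_code: int
    duration_ms: float
    error_detail: Optional[str] = None

    @classmethod
    def create(cls, **fields) -> "LogEntrySketch":
        # Fill in the auto-generated fields, then pass the rest through.
        return cls(
            id=uuid.uuid4().hex,  # 32-character hex string
            timestamp=datetime.now(timezone.utc).isoformat(),  # ISO 8601
            **fields,
        )


entry_sketch = LogEntrySketch.create(model="gpt-4o", status_code=200, duration_ms=123.4)

# Frozen dataclasses reject attribute assignment after construction.
try:
    entry_sketch.status_code = 500
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Freezing the entry means a logged request cannot be altered by accident once it has been handed to the log.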

RequestLog

Proxy request log with optional SQLite persistence. Delegates to PersistenceManager when available, otherwise falls back to an in-memory ring buffer.

# In-memory only
log = RequestLog(max_entries=500)

# With persistence
persistence = PersistenceManager("/var/data/myproxy")
log = RequestLog(persistence=persistence)

log.add(entry)
entries, total = log.get_entries(limit=50, status="error")

Key methods

| Method | Description |
| --- | --- |
| add(entry) | Record a request log entry |
| get_entries(...) | Paginated, filtered query (newest-first) |
| get_entry(entry_id) | Single entry by ID |
| get_api_key_labels() | Distinct API key labels |
| update_profile(entry_id, data) | Merge profile data into existing entry |
| clear() | Delete all entries |
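
The in-memory fallback can be pictured as a bounded ring buffer with a newest-first, filtered, paginated read. The sketch below is one possible shape, using a hypothetical `RingLogSketch` class and plain dicts as entries. The `offset` parameter, the `"success"` filter value, and the rule that `"error"` means a status code of 400 or higher are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class RingLogSketch:
    """Illustrative in-memory log; not the library's RequestLog."""

    def __init__(self, max_entries: int = 500) -> None:
        # A bounded deque silently drops the oldest item once it is full.
        self._entries: Deque[dict] = deque(maxlen=max_entries)

    def add(self, entry: dict) -> None:
        self._entries.append(entry)

    def get_entries(
        self, limit: int = 50, offset: int = 0, status: Optional[str] = None
    ) -> Tuple[List[dict], int]:
        # Newest-first view of the buffer.
        rows = list(reversed(self._entries))
        if status == "error":
            rows = [r for r in rows if r["status_code"] >= 400]
        elif status == "success":
            rows = [r for r in rows if r["status_code"] < 400]
        # The total counts every match, not just the returned page.
        return rows[offset : offset + limit], len(rows)


ring = RingLogSketch(max_entries=3)
for i, code in enumerate([200, 500, 200, 502]):
    ring.add({"id": i, "status_code": code})

# Entry 0 has been evicted; the errors left are ids 3 and 1, newest first.
errors, total = ring.get_entries(limit=1, status="error")
```

Returning the total alongside the page lets a caller render pagination controls without issuing a second query.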

PersistenceManager

SQLite-backed persistence for request logs and metrics counters. Uses WAL journal mode and dual-threshold retention: successful and error entries are pruned independently, each against its own limit.

pm = PersistenceManager(
    data_dir="/var/data/myproxy",
    success_max=50000,
    error_max=10000,
)

Key methods

| Method | Description |
| --- | --- |
| insert_log_entries(entries) | Bulk insert request log entries |
| query_log_entries(...) | Filtered query with pagination |
| save_metrics(data) | Persist metrics counters |
| load_metrics() | Load persisted metrics |
| count_log_entries() | Total entry count |
| count_success_entries() | Successful entries (status < 400) |
| count_error_entries() | Error entries (status ≥ 400) |
| db_file_sizes() | On-disk byte sizes |
| close() | Commit and close the database |

Retention defaults

| Constant | Default | Description |
| --- | --- | --- |
| DEFAULT_SUCCESS_MAX | 50,000 | Max successful entries |
| DEFAULT_ERROR_MAX | 10,000 | Max error entries |
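
To make WAL mode and dual-threshold retention concrete, here is a minimal sketch using only the standard-library `sqlite3` module. The table name, schema, file name, and `prune` helper are all hypothetical and chosen for this example; PersistenceManager's actual schema and pruning queries may differ. Row `id` order stands in for "newest" here.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name and schema, used only for this sketch.
data_dir = tempfile.mkdtemp()
conn = sqlite3.connect(os.path.join(data_dir, "sketch.db"))

# Switching to write-ahead logging returns the resulting journal mode.
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute(
    "CREATE TABLE request_log (id INTEGER PRIMARY KEY, status_code INTEGER NOT NULL)"
)


def prune(db: sqlite3.Connection, success_max: int, error_max: int) -> None:
    """Keep only the newest rows of each class, pruned independently."""
    # The predicates are fixed strings, so formatting them into SQL is safe.
    for predicate, keep in (
        ("status_code < 400", success_max),
        ("status_code >= 400", error_max),
    ):
        db.execute(
            f"DELETE FROM request_log WHERE {predicate} AND id NOT IN ("
            f"SELECT id FROM request_log WHERE {predicate} "
            "ORDER BY id DESC LIMIT ?)",
            (keep,),
        )
    db.commit()


# Five successful rows and four error rows.
conn.executemany(
    "INSERT INTO request_log (status_code) VALUES (?)",
    [(200,)] * 5 + [(500,)] * 4,
)
prune(conn, success_max=3, error_max=2)

success_count = conn.execute(
    "SELECT COUNT(*) FROM request_log WHERE status_code < 400"
).fetchone()[0]
error_count = conn.execute(
    "SELECT COUNT(*) FROM request_log WHERE status_code >= 400"
).fetchone()[0]
conn.close()
```

With separate limits, a burst of successful traffic cannot push older error entries out of the log, which is usually the data one wants to keep longest when debugging.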

ProfilerState

Manages on-demand, per-request pyinstrument profiling sessions. It is a framework-agnostic data layer: the route handlers that wire ProfilerState into a web framework belong in the consuming application.

state = ProfilerState(max_results=20)
state.enable(requests=5)

if state.should_profile():
    profiler = state.create_profiler()
    profiler.start()
    # ... do work ...
    profiler.stop()
    state.store_result(profiler, model="gpt-4o", duration_ms=150.0)

Key methods

| Method | Description |
| --- | --- |
| enable(requests) | Enable profiling for next N requests |
| disable() | Manually disable |
| should_profile() | Check and consume one profiling slot |
| create_profiler() | Create a DeepProfiler instance |
| store_result(profiler, ...) | Store profiling result |
| status() | Current profiling status dict |
| clear_results() | Remove all stored results |
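
The "check and consume one profiling slot" behavior can be sketched as a simple countdown. `ProfilerSlotsSketch` below is a hypothetical stand-in that models only the slot accounting; it does not create profilers or store results, and ProfilerState's real internals may differ.

```python
class ProfilerSlotsSketch:
    """Illustrative slot counter; not the library's ProfilerState."""

    def __init__(self) -> None:
        self._remaining = 0

    def enable(self, requests: int) -> None:
        # Arm profiling for the next `requests` requests.
        self._remaining = max(0, requests)

    def disable(self) -> None:
        self._remaining = 0

    def should_profile(self) -> bool:
        # Checking also consumes a slot, so each True is used exactly once.
        if self._remaining <= 0:
            return False
        self._remaining -= 1
        return True


slots = ProfilerSlotsSketch()
slots.enable(requests=2)
decisions = [slots.should_profile() for _ in range(4)]
print(decisions)  # [True, True, False, False]
```

Because the check itself consumes the slot, a handler should call it once per request and reuse the result rather than calling it again later in the same request.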

Backward compatibility

All classes remain importable from their original gateway.admin locations:

# These still work (re-exports from observability)
from llm_rosetta.gateway.admin.metrics import MetricsCollector
from llm_rosetta.gateway.admin.request_log import RequestLog, RequestLogEntry
from llm_rosetta.gateway.admin.persistence import PersistenceManager
from llm_rosetta.gateway.admin.routes.profiling import ProfilerState

New code should import from llm_rosetta.observability directly.