Analytics

Every request through DVARA emits cost, latency, token, policy-decision, audit, and trace telemetry. That data fans out across three lenses — operator dashboards in the DVARA Flight Deck, self-service tenant views in the Portal, and raw machine-readable feeds (Prometheus, OpenTelemetry, structured logs, SIEM streams) you point your own observability stack at.

This page is a survey of what's available and where to find it. For deep dives, follow the links into the per-feature documentation.

The three lenses

| Lens | Audience | What it gives you |
|---|---|---|
| Console dashboards (/, /costs, /audit, /latency, /analytics, …) | Platform operators, security admins, finance | Cross-tenant operational view with HTMX-polled fragments (3–10s refresh), role-based access controls, and click-through detail. The fastest way to answer "what's happening right now across the fleet". |
| Portal pages (/portal, /portal/usage, /portal/audit, …) | Tenant users: application developers, team leads, auditors | The same dashboards narrowed to the caller's own tenant. Self-service for tenant admin / developer / viewer roles; the operator never has to grant Console access. |
| Raw feeds (Prometheus scrape, OTLP push, stdout JSON logs, Kafka/Splunk/CloudWatch SIEM) | Your own observability stack: Grafana, Datadog, Tempo, Jaeger, Splunk, ELK, … | Machine-readable telemetry meant to feed your existing dashboards, alerts, and SIEM. DVARA does not try to replace your observability platform; it emits standard formats so you can keep the tools you already run. |

Use the dashboards for day-to-day operations and the raw feeds for long-term retention, custom dashboards, and alerting. Reports (chargeback, compliance) are the one-shot PDF / CSV artifacts you hand to finance or auditors.

Quick orientation by audience

| You are a … | Start here |
|---|---|
| Platform operator running the gateway | Console Dashboard (/) for live health; Console Latency (/latency) for routing decisions; Prometheus scrape for alerting |
| Security / compliance lead | Console Audit (/audit) for the live event stream; Compliance Reports (/compliance) for SOC2 / HIPAA / GDPR rollups; SIEM stream for long-term retention |
| Finance / FinOps | Console Cost Dashboard (/costs) for live spend; Chargeback Reports (/chargeback) for monthly per-tenant PDFs; gateway_cost_dollars_total Prometheus counter for time-series budget burn |
| Tenant developer or team lead | Portal Dashboard (/portal) for your tenant's request counts, tokens, and cost; Portal Usage (/portal/usage) for per-model breakdowns; Portal Audit for your own audit trail |
| Agent platform team | Console Agent Analytics (/analytics) for session outcomes, approval funnels, tool-call heatmaps; Console MCP Sessions (/mcp/sessions) for per-session timelines |

The full surface

Every analytic the platform exposes, what it shows, where to view it, and the underlying endpoint.

Live dashboards

| Analytic | What it shows | Where to view | Endpoint |
|---|---|---|---|
| Operational dashboard | Req/s, P95 latency, error rate, tokens/min, provider health, cache stats, priority-admission stats | Console / | GET /actuator/prometheus + GET /actuator/gateway-status |
| Per-tenant dashboard | Request count, token usage, 30-day cost, active API keys, recent audit events (scoped to caller's tenant) | Portal /portal | PortalDashboardController (direct repo reads) |
| Cost dashboard | Total cost, Cost-by-Provider donut, Cost-by-Model bar, budget status, forecast cards, anomaly alerts, raw records table | Console /costs | GET /v1/admin/costs, /costs/summary, /costs/forecast, /costs/anomalies |
| Latency dashboard | EWMA latency per (provider, model): sample count, raw last sample, freshness badge | Console /latency (owner + policy-admin) | GET /v1/admin/latency |
| Priority admission stats | Concurrent / max concurrent / load %, per-tier admitting vs throttling | Dashboard priority card (when routing.priority.enabled=true) | GET /v1/admin/priority/stats |
| Semantic cache stats | Hit rate, miss count, cache size, average similarity on hits | Console /cache (owner only) | GET /v1/admin/cache/stats |
| License status | Licensee, type, expiry, days remaining, runtime status badge | Console /license, Portal license panel | GET /actuator/gateway-status |

Telemetry & event streams

| Analytic | What it shows | Where to view | Endpoint |
|---|---|---|---|
| Token usage telemetry | Per-request input/output/total tokens, summary KPI card, masked API key, estimated-flag tooltip | Console /token-usage, Portal /portal/usage | GET /v1/admin/token-usage, …/summary |
| Audit event stream | Append-only HMAC-chained events (policy decisions, auth, config imports, license transitions); click-to-expand detail; CSV / JSON export | Console /audit, Portal /portal/audit | GET /v1/admin/audit/events, …/export, …/export/json |
| Budget cap usage | Current period spend vs limit, color-coded progress bar, period boundaries, dollars remaining | Console /budgets, Portal /portal/budgets | GET /v1/admin/budgets/{id}/usage |
| MCP tool-call telemetry | Per-call records (tenant, session, server, tool, status, latency, response bytes, PII flag); aggregated counts | Console /mcp/tool-calls, Portal /portal/mcp/tool-calls | GET /v1/admin/mcp/tool-calls, …/summary |
| Agent session timeline | Session list + detail with summary stats (avg latency, error rate, duration, response bytes) + distinct-tools table + per-call timeline + JSON export | Console /mcp/sessions, Portal /portal/mcp/sessions | GET /v1/admin/sessions, …/{id}/timeline |
| Approval queue stats | Pending + decision history, action badges, nav badge count | Console /approvals, Portal /portal/approvals | GET /v1/admin/approvals/pending, …/history, …/count |

Comparison & experimentation analytics

| Analytic | What it shows | Where to view | Endpoint |
|---|---|---|---|
| Canary A/B report | Per-variant metrics on a canary-routed route; reset + live split-percentage update | Console /routes/{id}/canary | GET /v1/admin/routes/{id}/canary/report |
| Shadow policy report | Divergence stats (1h / 24h / 7d) + per-event diff between active and SHADOW policy | Console /policies/{id}/shadow | GET /v1/admin/policies/{id}/shadow/stats, …/events |
| Shadow routing report | Provider-vs-shadow comparison on a route (latency, status, response delta) | Console /routes/{id}/shadow | GET /v1/admin/routes/{id}/shadow/report |
| Prompt experiment metrics | Per-variant A/B metrics snapshot for a prompt experiment | Console + Portal experiment pages | GET /v1/admin/prompts/experiments/{id}/report |
| Agent behaviour analytics | Session-outcomes donut, approval-funnel bar, top-20 agent leaderboard, tool-call heatmap (server × tool), policy firing frequency | Console /analytics | AnalyticsController (composed from MCP + audit repos) |
| Model fingerprint drift | Drift report per golden prompt, showing when an upstream model's response semantics shifted | Console eval/fingerprint surface | GET /v1/admin/golden-prompts/{id}/drift |
| Eval pipeline reports | Per-model eval suite results vs golden corpus, drift detection, time-ranged report queries | Console /eval/reports | GET /v1/admin/eval/reports, …/model/{model} |

Downloadable reports

| Analytic | What it shows | Where to view | Endpoint |
|---|---|---|---|
| Chargeback reports | Per-tenant monthly rollup: tenant summary, model/provider breakdown, daily cost trend, forecasts, anomalies (PDF + CSV) | Console /chargeback, Portal /portal/chargeback | POST /v1/admin/chargeback, GET …/{id}/pdf, …/{id}/csv |
| Compliance reports | SOC2 / HIPAA / GDPR rollups aggregated from audit + token usage + tenant + policy data (PDF) | Console /compliance, Portal /portal/compliance | POST /v1/admin/reports, GET …/{id}/pdf |

Raw feeds — point your own stack at these

| Analytic | What it shows | Where to view | Endpoint |
|---|---|---|---|
| Prometheus metrics | ~35 named counters / histograms: gateway_requests_total, gateway_latency_seconds, gateway_tokens_total, gateway_cost_dollars_total, gateway_guardrail_blocked_total, gateway_budget_blocked_total, mcp_tool_calls_total, dvara_emails_sent_total, etc. | Your Grafana / Datadog / Mimir | GET /actuator/prometheus on gateway-server, mcp-proxy-server, flightdeck, with Bearer $DVARA_ACTUATOR_METRICS_API_KEY |
| OpenTelemetry traces | Distributed spans (gateway.provider.chat, …stream, …embed; mcp.filter.*, mcp.server.call) with traceparent propagation | Your OTel collector → Jaeger / Tempo / Datadog | OTLP push to OTEL_EXPORTER_OTLP_ENDPOINT (default http://localhost:4318/v1/traces) |
| Structured access log | One JSON line per request: trace_id, method, path, status, latency_ms, model, provider, cache_status, masked api_key, tenant_id, token counts, error_code | Your log aggregator | stdout (Logback LogstashEncoder) |
| SIEM stream | All audit events forwarded to Splunk HEC, CloudWatch Logs, or Kafka | Your SIEM | dvara.llm-gateway.siem.splunk.*, dvara.llm-gateway.siem.cloudwatch.*, dvara.llm-gateway.siem.kafka.* |
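
As a rough illustration of consuming the Prometheus feed without a full monitoring stack, the sketch below builds the authenticated scrape request and sums one counter across its label sets. It assumes a hypothetical gateway address (http://localhost:8080) and that the key in DVARA_ACTUATOR_METRICS_API_KEY is sent as a Bearer token, as the table states. The helper names are illustrative and not part of DVARA, and the parser only handles simple sample lines of the Prometheus text exposition format.

```python
import os
import urllib.request

# Hypothetical address; substitute your gateway-server host and port.
GATEWAY_BASE_URL = "http://localhost:8080"


def build_scrape_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for /actuator/prometheus with a Bearer token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/actuator/prometheus",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def sum_metric(exposition_text: str, metric_name: str) -> float:
    """Sum every sample of one metric across all of its label sets."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP / TYPE comments
        if "{" in line:
            # Sample shape: name{labels} value [timestamp]
            name, rest = line.split("{", 1)
            value_part = rest.rsplit("}", 1)[1]
        else:
            # Sample shape: name value [timestamp]
            name, value_part = line.split(None, 1)
        if name.strip() != metric_name:
            continue
        total += float(value_part.split()[0])
    return total


def scrape(base_url: str = GATEWAY_BASE_URL) -> str:
    """Fetch the raw exposition text; requires a reachable gateway."""
    api_key = os.environ["DVARA_ACTUATOR_METRICS_API_KEY"]
    request = build_scrape_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling sum_metric(scrape(), "gateway_cost_dollars_total") would give the cumulative cost counter at scrape time. For time-series budget burn you would still want a real Prometheus-compatible store computing rates over the counter.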

Per-tenant vs platform scoping

Almost every analytic in the tables above respects DVARA's multi-tenant scoping rules:

  • Platform-role callers (owner / policy-admin / billing-admin) see cross-tenant data by default. Most list endpoints accept an optional ?tenant_id= filter to narrow.
  • Tenant-role callers (admin / developer / viewer) only ever see their own tenant's rows. Passing another tenant's ID returns HTTP 403 and records a TENANT_SCOPE_VIOLATION audit event.

The Portal pages enforce the same rule at the URL level: a tenant user lands at /portal/* and cannot construct a path that crosses tenants. See Multi-Tenancy for the full isolation model.
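
The scoping rules above can be sketched from a client's point of view. This is a minimal example under stated assumptions: the base URL is hypothetical, Bearer authentication for the Admin API is assumed rather than confirmed by this page, and whether a given list endpoint accepts ?tenant_id= should be checked in the Admin API reference. The function names are illustrative.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Optional


def list_url(base_url: str, path: str, tenant_id: Optional[str] = None) -> str:
    """Build a list-endpoint URL, adding ?tenant_id= only when narrowing."""
    url = base_url.rstrip("/") + path
    if tenant_id is not None:
        url += "?" + urllib.parse.urlencode({"tenant_id": tenant_id})
    return url


def fetch_scoped(url: str, token: str) -> bytes:
    """GET the URL; surface a cross-tenant 403 as PermissionError."""
    # Assumption: the Admin API accepts a Bearer token in the Authorization header.
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 403:
            # Per the rules above, a tenant-role caller asking for another
            # tenant's rows gets HTTP 403.
            raise PermissionError(f"tenant scope violation for {url}") from err
        raise
```

A platform-role caller would typically omit tenant_id to get the cross-tenant view, while a tenant-role caller gains nothing by passing it, since the server restricts results to the caller's own tenant either way.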

What's deliberately not in the box

A few things the platform deliberately leaves to your existing tools:

  • Long-term metric retention beyond the Prometheus scrape window: feed the scrape endpoint into Mimir, Thanos, or Datadog.
  • Custom dashboards beyond the built-in Console / Portal views — Grafana on top of the Prometheus + OTLP feeds covers the long tail.
  • Alerting: use Prometheus Alertmanager or Grafana alerts on the same metrics; the platform emits the data, not the alert routing.
  • Cross-system correlation (joining DVARA spans against your service's app spans) — handled by your APM via the traceparent propagation DVARA already emits.
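
For ad hoc correlation outside an APM, the sketch below extracts the trace ID from a W3C traceparent header and picks out the matching lines from the structured access log. It assumes the log's trace_id field carries the same 32-character hex trace ID as the traceparent header, which this page does not confirm. The function names are illustrative.

```python
import json
import re
from typing import Iterable, Iterator, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def trace_id_from_traceparent(header: str) -> Optional[str]:
    """Return the trace ID from a version-00 traceparent header, or None."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id = match.group(1), match.group(2)
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None  # all-zero IDs are invalid in the W3C format
    return trace_id


def matching_log_lines(lines: Iterable[str], trace_id: str) -> Iterator[dict]:
    """Yield parsed access-log records whose trace_id equals the given ID."""
    for raw in lines:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON lines mixed into stdout
        # Assumption: the access log stores the W3C trace ID under "trace_id".
        if isinstance(record, dict) and record.get("trace_id") == trace_id:
            yield record
```

Given the traceparent your service sent on a request, matching_log_lines(open("gateway.log"), trace_id) would return the gateway's view of that same request, where gateway.log is a hypothetical capture of the gateway's stdout.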

Where to go next

  • DVARA Flight Deck — the full Console tour, including every dashboard listed above with screenshots.
  • Cost Management — cost dashboard, forecasts, anomaly alerts, budget caps, chargeback reports.
  • Observability — audit event taxonomy, Prometheus metrics list, OpenTelemetry span names, SIEM configuration.
  • Multi-Tenancy — the scoping rules that govern who sees what data.
  • Admin API reference — programmatic access to every analytic above for CI/CD, IaC, and custom integrations.