Observability

Dvara provides structured JSON logging, Prometheus metrics, token usage metering, and request tracing out of the box.

X-Trace-ID Propagation

Every response includes an X-Trace-ID header for request correlation. This applies to both the LLM gateway (port 8080) and the MCP proxy (port 8070).

Behavior:

  • If the incoming request includes an X-Trace-ID header, the same value is echoed back.
  • When OpenTelemetry tracing is active, the OTel 32-character hex trace ID is used if no client header is present.
  • Otherwise, the gateway generates a new random 32-character hex ID.
  • The trace ID is embedded in every error response body as error.trace_id.
  • The trace ID is added to SLF4J MDC as trace_id for inclusion in all structured log lines.

# With custom trace ID
curl -i -H "X-Trace-ID: my-custom-trace-001" http://localhost:8080/v1/models
# → X-Trace-ID: my-custom-trace-001

# Without trace ID (auto-generated)
curl -i http://localhost:8080/v1/models
# → X-Trace-ID: a6783439db1f46a6bfed511a0011e955

X-Session-Id Header

Both the LLM gateway and MCP proxy accept an optional X-Session-Id header for agent session correlation. When present, the session ID is:

  • Stored as a servlet request attribute (sessionId)
  • Added to SLF4J MDC as session_id for structured log correlation
  • Attached as a high-cardinality attribute on OpenTelemetry spans

This allows traces from multiple LLM turns and MCP tool calls within a single agent session to be correlated by session ID.

# LLM request with session ID
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Session-Id: agent-session-42" \
-d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

# MCP request with same session ID
curl http://localhost:8070/mcp/filesystem/tools/call \
-H "Content-Type: application/json" \
-H "Authorization: Bearer gw_mykey" \
-H "X-Session-Id: agent-session-42" \
-d '{"name":"read_file","arguments":{"path":"/data/file.txt"}}'

Structured JSON Logging

Dvara uses Logstash Logback Encoder for structured JSON logging. Every log line is valid JSON, compatible with ELK, Loki, Datadog, and Splunk log ingest.

Configuration

Logging is configured in logback-spring.xml:

  • JSON mode (default) — Every log line is a JSON object with @timestamp, level, message, logger_name, and MDC fields.
  • Plain-text mode — Human-readable output for local development. Activate with:
    spring.profiles.active=log-plain

MDC Fields

The following fields are automatically added to log context by servlet filters:

| Field | Source | Description |
|---|---|---|
| trace_id | TraceIdFilter / McpTraceIdFilter | Request correlation ID |
| session_id | TraceIdFilter / McpTraceIdFilter | Agent session ID (from X-Session-Id header, if present) |
| tenant_id | AccessLogFilter | Tenant identifier |
| model | AccessLogFilter | Requested model name |
| provider | AccessLogFilter | Selected provider |
| method | AccessLogFilter | HTTP method |
| path | AccessLogFilter | Request URI |
| status | AccessLogFilter | HTTP response status |
| latency_ms | AccessLogFilter | Request duration in milliseconds |
| api_key | AccessLogFilter | API key (masked to first 8 chars) |
| cache_status | AccessLogFilter | HIT or MISS |
| stream | AccessLogFilter | Whether request was streaming |
| tokens_prompt | AccessLogFilter | Prompt token count |
| tokens_completion | AccessLogFilter | Completion token count |
| tokens_total | AccessLogFilter | Total token count |
| error_code | AccessLogFilter | Gateway error code (if any) |

Access Log Example

Every request produces a single structured access log entry:

{
  "@timestamp": "2026-02-25T10:23:45.123Z",
  "level": "INFO",
  "message": "Request completed",
  "logger_name": "ai.dvara.server.web.AccessLogFilter",
  "trace_id": "a6783439db1f46a6bfed511a0011e955",
  "tenant_id": "acme-corp",
  "model": "gpt-4o",
  "provider": "openai",
  "method": "POST",
  "path": "/v1/chat/completions",
  "status": "200",
  "latency_ms": "342",
  "api_key": "sk-prod-1...",
  "cache_status": "MISS",
  "tokens_prompt": "150",
  "tokens_completion": "85",
  "tokens_total": "235",
  "service": "dvara-gateway"
}
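Because every line is valid JSON, access logs can be sliced with standard tooling such as jq. A quick sketch (the log line below is a hypothetical sample matching the shape above, not real gateway output):

```shell
# Hypothetical access-log line with the same fields as the example above
log='{"trace_id":"a6783439db1f46a6bfed511a0011e955","model":"gpt-4o","status":"200","latency_ms":"342"}'

# Keep only requests slower than 300 ms and print their trace IDs
slow=$(echo "$log" | jq -r 'select((.latency_ms | tonumber) > 300) | .trace_id')
echo "$slow"
# → a6783439db1f46a6bfed511a0011e955
```

The same filter works against a whole log file (`jq -r 'select(...)' access.log`) since each line is an independent JSON object.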

Prometheus Metrics

Dvara exposes Prometheus metrics via Spring Boot Actuator and Micrometer.

Scrape Endpoint

GET /actuator/prometheus
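A minimal Prometheus scrape configuration for this endpoint might look like the following; the job name and target address are placeholders for your deployment:

```yaml
# prometheus.yml (sketch): scrape Dvara's actuator endpoint every 15s
scrape_configs:
  - job_name: dvara-gateway          # placeholder job name
    metrics_path: /actuator/prometheus
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8080']  # gateway host:port in your environment
```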

Available Metrics

| Metric | Type | Labels | Description |
|---|---|---|---|
| gateway_requests_total | Counter | tenant, model, provider, status, region | Total gateway requests |
| gateway_latency_seconds | Histogram | tenant, model, provider, status, region | Request latency with P50/P95/P99 percentiles |
| gateway_tokens_total | Counter | tenant, model, direction | Token usage (direction=input or output) |
| gateway_provider_errors_total | Counter | provider, error_code | Provider errors |
| gateway_retries_total | Counter | provider | Retry attempts |
| gateway_fallbacks_total | Counter | from_provider, to_provider | Fallback activations |
| gateway_config_sync_failures_total | Counter | (none) | Config sync failures (enterprise) |
| gateway_config_sync_version | Gauge | (none) | Last successfully synced config version |
| gateway_fleet_config_lag_versions | Gauge | instance_id | Config version lag per fleet instance |

Configuration

Metrics are enabled by default in application.yml:

management:
  endpoints:
    web:
      exposure:
        include: health,prometheus,gateway-status
  prometheus:
    metrics:
      export:
        enabled: true

Grafana Dashboard

Example PromQL queries:

# Request rate by provider
rate(gateway_requests_total[5m])

# P95 latency by model
histogram_quantile(0.95, rate(gateway_latency_seconds_bucket[5m]))

# Token throughput by tenant
rate(gateway_tokens_total[5m])

# Error rate by provider
rate(gateway_provider_errors_total[5m])

Token Usage Metering

Every non-streaming chat request records token usage to a TokenUsageRepository. The default implementation stores records in memory (capped at 10,000 entries). Enterprise deployments replace this with a database-backed implementation.

Query Endpoints

# List all token usage records
curl http://localhost:8080/admin/v1/token-usage

# Filter by tenant
curl "http://localhost:8080/admin/v1/token-usage?tenantId=acme-corp"

# Filter by API key
curl "http://localhost:8080/admin/v1/token-usage?apiKey=sk-prod-123"

# Filter by model
curl "http://localhost:8080/admin/v1/token-usage?model=gpt-4o"

# Aggregated summary
curl "http://localhost:8080/admin/v1/token-usage/summary?tenantId=acme-corp&model=gpt-4o"

Record Fields

| Field | Type | Description |
|---|---|---|
| id | string | Unique record ID (UUID) |
| tenantId | string | Tenant identifier |
| apiKey | string | API key used |
| model | string | Model requested |
| provider | string | Provider that served the request |
| inputTokens | int | Prompt tokens |
| outputTokens | int | Completion tokens |
| totalTokens | int | Total tokens |
| estimated | boolean | Whether counts are estimated |
| timestamp | ISO 8601 | When the request was made |

Summary Response

{
  "tenantId": "acme-corp",
  "model": "gpt-4o",
  "totalInputTokens": 15000,
  "totalOutputTokens": 8500,
  "totalTokens": 23500,
  "requestCount": 42
}
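A summary like this feeds naturally into cost reporting. A sketch with jq, where the per-1K-token prices are placeholders for illustration, not real model pricing:

```shell
# Summary payload as returned by /admin/v1/token-usage/summary
summary='{"tenantId":"acme-corp","model":"gpt-4o","totalInputTokens":15000,"totalOutputTokens":8500,"totalTokens":23500,"requestCount":42}'

# Recompute the total and derive a rough cost estimate.
# The prices below ($0.0025/1K input, $0.01/1K output) are hypothetical.
total=$(echo "$summary" | jq '.totalInputTokens + .totalOutputTokens')
cost=$(echo "$summary" | jq '(.totalInputTokens / 1000 * 0.0025) + (.totalOutputTokens / 1000 * 0.01)')
echo "$total tokens, estimated cost \$$cost"
```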

Pre-Built Grafana Dashboards

Dvara ships with four production-ready Grafana dashboards in grafana/dashboards/:

| Dashboard | File | Description |
|---|---|---|
| Gateway Overview | dvara-overview.json | Request volume, latency (P50/P95/P99), error rates, provider health, token usage, cost/hour |
| FinOps & Budget | dvara-finops.json | Cost by tenant/model/provider, budget enforcement, model downgrades, anomalies |
| MCP Proxy & Agentic | dvara-mcp.json | Tool calls, agent sessions, loop detection, approval gates, injection detection |
| Policy, Routing & Fleet | dvara-policy-routing.json | Shadow policy divergence, canary testing, priority routing, config sync, fleet health |

One-Command Setup

docker compose -f docker-compose.yml -f grafana/docker-compose.monitoring.yml up

This starts Prometheus (port 9090) and Grafana (port 3000, admin/dvara) with dashboards auto-provisioned.

Manual Import

for f in grafana/dashboards/*.json; do
  curl -X POST http://admin:dvara@localhost:3000/api/dashboards/db \
    -H "Content-Type: application/json" \
    -d "{\"dashboard\": $(cat "$f"), \"overwrite\": true}"
done

Alerting Rules

grafana/alerts/dvara-alerts.yml defines 14 Prometheus alerting rules across three groups:

| Alert | Severity | Trigger |
|---|---|---|
| DvaraHighErrorRate | critical | Error rate > 5% for 5 min |
| DvaraHighP95Latency | warning | P95 > 5s for 5 min |
| DvaraProviderErrorSpike | warning | Provider errors > 1/sec for 3 min |
| DvaraCircuitBreakerOpen | critical | Errors with zero successes for 2 min |
| DvaraBudgetHardLimit | critical | Hard budget cap hit |
| DvaraBudgetSoftLimit | warning | Soft limit breached |
| DvaraCostAnomaly | warning | Cost exceeds baseline |
| DvaraGuardrailBlocks | warning | Guardrail blocks > 0.1/sec |
| DvaraAgentLoopDetected | warning | Agent loop detected |
| DvaraApprovalTimeouts | warning | Approval gate timeouts |
| DvaraMcpToolErrorRate | warning | MCP error rate > 10% |
| DvaraInjectionDetected | critical | Prompt injection detected |
| DvaraConfigSyncFailure | warning | Config sync failures |
| DvaraFleetConfigLag | warning | Instance lag > 5 versions |
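The authoritative rule definitions live in grafana/alerts/dvara-alerts.yml. To illustrate the shape, a Prometheus rule in the spirit of DvaraHighErrorRate might look like this; the status label matcher and label set are assumptions for the sketch, not the shipped rule:

```yaml
# Sketch of a Prometheus alerting rule; see grafana/alerts/dvara-alerts.yml
# for the real definitions. The status=~"5.." matcher is an assumption.
groups:
  - name: dvara-example
    rules:
      - alert: DvaraHighErrorRate
        expr: |
          sum(rate(gateway_requests_total{status=~"5.."}[5m]))
            / sum(rate(gateway_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Gateway error rate above 5% for 5 minutes"
```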

Datadog Integration

OpenMetrics Scraping

Copy the Datadog Agent config to scrape all Dvara Prometheus metrics:

cp datadog/conf.d/dvara.yaml /etc/datadog-agent/conf.d/openmetrics.d/dvara.yaml
sudo systemctl restart datadog-agent

OTLP Traces to Datadog

Configure the Datadog Agent as an OTLP collector, then point Dvara to it:

OTEL_EXPORTER_OTLP_ENDPOINT=http://datadog-agent:4318/v1/traces
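Enabling the OTLP HTTP receiver in the Agent's datadog.yaml typically looks like the following (check your Agent version's documentation for the exact keys):

```yaml
# datadog.yaml (Agent config): enable the OTLP HTTP receiver on port 4318
otlp_config:
  receiver:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
```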

Pre-built Datadog monitors are provided in datadog/monitors.yaml.

Health Endpoints

| Endpoint | Description |
|---|---|
| GET /status | Returns JSON with status, mode, providers, routes, config version, and warnings |
| GET /actuator/gateway-status | Full actuator status endpoint (same data as /status) |
| GET /actuator/health | Spring Boot health check |
| GET /try | Built-in browser-based chat test panel |
| GET /v1/models | Lists all registered providers and models with capabilities |
| GET /admin/v1/providers/{id}/capabilities | Provider-specific capability details |

OpenTelemetry Distributed Tracing

Dvara includes OpenTelemetry distributed tracing via Micrometer Tracing with the OTel bridge. Traces are automatically created for every request and provider call, with W3C traceparent headers propagated to upstream LLM providers.

How It Works

Spring Boot auto-configures the tracing infrastructure when the tracing dependencies are present. Both gateway-server (LLM) and mcp-proxy-server (MCP) have full OTLP tracing support:

  1. Server spans — Spring MVC auto-instruments every incoming HTTP request
  2. LLM provider spans — ProviderDispatcher creates child observations for chat, streamChat, and embed operations, enriched with token usage and session ID
  3. MCP filter spans — Each MCP filter creates a child observation: registry lookup, policy evaluation, PII scanning, and upstream server call
  4. Client spans — RestClient auto-instruments outbound HTTP calls, adding traceparent headers for W3C trace context propagation

LLM Span Hierarchy

HTTP POST /v1/chat/completions              (server span, auto by Spring MVC)
└── gateway.provider.chat                   (custom observation in ProviderDispatcher)
    ├── low-card: provider, model
    ├── high-card: input_tokens, output_tokens, session_id
    └── HTTP POST https://api.openai.com    (client span, auto by RestClient)

MCP Span Hierarchy

HTTP POST /mcp/{serverId}/tools/call        (server span, auto by Spring MVC)
└── gateway.mcp.request                     (parent observation in McpProxyController)
    ├── low-card: server_id, operation, tool_name
    ├── high-card: session_id, latency_ms, response_bytes, pii_in_response
    ├── mcp.filter.registry                 (low-card: server_id)
    ├── mcp.filter.policy                   (low-card: decision)
    ├── mcp.filter.pii_args                 (low-card: action)
    ├── mcp.server.call                     (upstream HTTP call)
    │   ├── low-card: server_id, operation, tool_name
    │   ├── high-card: http_status, latency_ms, response_bytes
    │   ├── event: mcp_request_sent
    │   └── event: mcp_response_received
    └── mcp.filter.pii_response             (low-card: action, conditional)

LLM Span Names and Attributes

| Observation Name | Operation | Low-Cardinality | High-Cardinality |
|---|---|---|---|
| gateway.provider.chat | Non-streaming chat | provider, model | input_tokens, output_tokens, session_id |
| gateway.provider.stream | Streaming chat | provider, model | session_id |
| gateway.provider.embed | Embeddings | provider, model | |

MCP Span Names and Attributes

| Observation Name | Operation | Low-Cardinality | High-Cardinality |
|---|---|---|---|
| gateway.mcp.request | Parent MCP request | server_id, operation, tool_name | session_id, latency_ms, response_bytes, pii_in_response |
| mcp.filter.registry | Server registry lookup | server_id | |
| mcp.filter.policy | Policy evaluation | decision | |
| mcp.filter.pii_args | PII scan on request args | action | |
| mcp.filter.pii_response | PII scan on response | action | |
| mcp.server.call | Upstream HTTP call | server_id, operation, tool_name | http_status, latency_ms, response_bytes |

Configuration

management:
  tracing:
    sampling:
      probability: ${TRACING_SAMPLING_PROBABILITY:1.0} # 0.0–1.0, default: sample all
  otlp:
    tracing:
      endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT:http://localhost:4318/v1/traces}

| Environment Variable | Default | Description |
|---|---|---|
| TRACING_SAMPLING_PROBABILITY | 1.0 | Fraction of traces to sample (0.0 = none, 1.0 = all) |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:4318/v1/traces | OTLP HTTP endpoint for trace export |

X-Trace-ID Integration

When OpenTelemetry tracing is active, the X-Trace-ID response header uses the OTel 32-character hex trace ID instead of a random UUID. Client-supplied X-Trace-ID headers still take precedence.

| Scenario | X-Trace-ID Value |
|---|---|
| Client sends X-Trace-ID header | Client's value (echoed back) |
| OTel tracing active, no client header | OTel trace ID (32-char hex) |
| No tracing, no client header | Random UUID hex (32-char) |

Viewing Traces

Start a local Jaeger instance for trace visualization:

# Start Jaeger with OTLP collector
docker run -d --name jaeger -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest

# Start gateway (traces auto-exported to localhost:4318)
MOCK_PROVIDER_ENABLED=true ./mvnw -pl gateway-server spring-boot:run

# Send a request
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"mock/test","messages":[{"role":"user","content":"Hello"}]}'

# View traces at http://localhost:16686

Log Correlation

When tracing is active, OTel trace and span IDs are automatically added to the SLF4J MDC and included in structured JSON logs:

{
  "trace_id": "abcdef1234567890abcdef1234567890",
  "traceId": "abcdef1234567890abcdef1234567890",
  "spanId": "1234567890abcdef",
  "span_id": "1234567890abcdef",
  "message": "Request completed",
  ...
}

Disabling Tracing

Set the sampling probability to 0.0 to disable trace collection while keeping the infrastructure in place:

TRACING_SAMPLING_PROBABILITY=0.0 ./mvnw -pl gateway-server spring-boot:run

Audit Event Stream

Every API request through /v1/* generates audit events that are persisted and queryable.

How It Works

  1. AuditFilter (gateway filter chain) writes GATEWAY_REQUEST events with the requested model
  2. AuditResponseFilter (servlet filter, order HIGHEST_PRECEDENCE + 4) writes GATEWAY_RESPONSE events after request completion with rich payload: model, provider, HTTP status, latency, tokens, masked API key, tenant ID, and error code
  3. PersistingAuditWriter logs events to stdout and saves them to AuditEventRepository

Query Endpoints

# List all audit events (newest first)
curl http://localhost:8080/admin/v1/audit/events

# Filter by tenant
curl "http://localhost:8080/admin/v1/audit/events?tenant_id=acme-corp"

# Filter by event type
curl "http://localhost:8080/admin/v1/audit/events?event_type=GATEWAY_RESPONSE"

# Filter by date range
curl "http://localhost:8080/admin/v1/audit/events?from=2026-01-01T00:00:00Z&to=2026-01-02T00:00:00Z"

# Export as CSV
curl -o audit.csv http://localhost:8080/admin/v1/audit/events/export

# Export as JSON
curl -o audit.json http://localhost:8080/admin/v1/audit/events/export/json

Event Fields

| Field | Type | Description |
|---|---|---|
| eventId | string | Unique event ID (UUID) |
| timestamp | ISO 8601 | When the event occurred |
| tenantId | string | Tenant identifier (may be null) |
| eventType | string | GATEWAY_REQUEST or GATEWAY_RESPONSE |
| payload | object | Event-specific data (model, provider, status, latency, etc.) |

Storage

The default implementation uses InMemoryAuditEventRepository (capped at 10,000 events). Enterprise deployments replace this with AppendOnlyAuditEventRepository (capped at 100,000 events, configurable) via the @ConditionalOnMissingBean pattern.

Enterprise Audit Trail (with License)

When running with a valid JWT enterprise license key, the audit subsystem is upgraded with:

  • HMAC-SHA256 Signing: Every audit event is wrapped in a SignedAuditEnvelope with a cryptographic signature. Events are hash-chained — each event's HMAC includes the previous event's hash, creating a tamper-evident chain.
  • Event Enrichment: Events are automatically enriched with trace_id from the request context, actor_user_id/actor_user_name/actor_roles from the authenticated principal.
  • Prompt Storage Opt-In: By default, prompt/message content is stripped from audit events. Tenants can opt in by setting audit.store-prompts: "true" in their tenant metadata. The global default is controlled by gateway.audit.store-prompts-by-default.
  • SIEM Export: Signed audit envelopes are fanned out to pluggable SiemExporter implementations. A LoggingSiemExporter (JSON to siem.export logger) is included by default. Splunk HEC and CloudWatch stubs are provided for future integration. Export failure never blocks audit persistence.
  • Chain Integrity Verification: AuditIntegrityService can verify the entire hash chain or individual events by recomputing HMACs.
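The hash-chaining idea can be illustrated with openssl. This is a standalone sketch of the concept only; Dvara's actual envelope format, serialization, and key handling are internal:

```shell
# Conceptual sketch of a tamper-evident HMAC chain (not Dvara's real format).
secret="demo-secret"
e1='{"eventId":"e1","payload":"a"}'
e2='{"eventId":"e2","payload":"b"}'

# HMAC of the first event
h1=$(printf '%s' "$e1" | openssl dgst -sha256 -hmac "$secret" -r | cut -d' ' -f1)
# The second event's HMAC also covers the previous hash, forming the chain:
# tampering with e1 would change h1 and invalidate h2
h2=$(printf '%s' "$h1$e2" | openssl dgst -sha256 -hmac "$secret" -r | cut -d' ' -f1)

# Verification recomputes each link and compares
v1=$(printf '%s' "$e1" | openssl dgst -sha256 -hmac "$secret" -r | cut -d' ' -f1)
v2=$(printf '%s' "$v1$e2" | openssl dgst -sha256 -hmac "$secret" -r | cut -d' ' -f1)
[ "$h2" = "$v2" ] && echo "chain intact"
# → chain intact
```

In Dvara, this recomputation is what AuditIntegrityService performs across the stored SignedAuditEnvelope records.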

Enterprise audit configuration:

GATEWAY_ENTERPRISE_LICENSE_KEY=<jwt-token> \
GATEWAY_AUDIT_HMAC_SECRET=your-hmac-secret \
GATEWAY_AUDIT_MAX_EVENTS=100000 \
GATEWAY_AUDIT_STORE_PROMPTS=false \
./mvnw -pl enterprise-server spring-boot:run

Admin UI

The audit log is viewable in the admin UI at /audit with live 3-second polling, filtering by tenant/type/date range, click-to-expand event details, pause/resume, and CSV export.

Extension Point

Both AuditWriter and AuditEventRepository are overridable via @ConditionalOnMissingBean:

public interface AuditWriter {
    void write(AuditEvent event);
}

public interface AuditEventRepository {
    void save(AuditEvent event);
    List<AuditEvent> findAll();
    // ... query methods
}

Enterprise deployments replace these with EnterpriseAuditWriter (enrichment + HMAC + SIEM) and AppendOnlyAuditEventRepository (hash-chained, signed append-only store).