Version: Latest (1.0.x)

DVARA headers — request and response reference

DVARA is an AI governance platform, so every /v1/* request is already audited with a server-generated trace ID. The headers below let you authenticate, correlate the audit trail with your own request identifiers, and read the signals DVARA emits back on the response.

Request headers

Authorization: Bearer gw_…

Required on every data-plane request. The API key is minted in the DVARA Flightdeck (/portal/keys for tenants, /tenants/{id}/keys for operators) and starts with gw_. The full secret is shown exactly once at creation time — only a SHA-256 hash is persisted server-side. A missing or invalid key in strict mode (dvara.llm-gateway.data-plane.require-api-key=true) returns 401 invalid_api_key. See Data plane authentication for rotation and revoke flows.
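
As a rough client-side illustration of the key handling above, the sketch below builds the Authorization header and fails fast when a key lacks the documented gw_ prefix. The helper names build_auth_headers and fingerprint are hypothetical and not part of DVARA. The fingerprint function only shows what a SHA-256 digest of a key looks like in general; how DVARA derives or salts the hash it stores is not specified on this page.

```python
import hashlib


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a DVARA data-plane request.

    Hypothetical helper: rejects keys without the documented gw_ prefix,
    which catches the common mistake of configuring a provider key instead.
    """
    if not api_key.startswith("gw_"):
        raise ValueError("DVARA API keys start with 'gw_'")
    return {"Authorization": f"Bearer {api_key}"}


def fingerprint(api_key: str) -> str:
    """Hex SHA-256 digest of a key.

    Illustration only: DVARA's exact server-side hashing scheme is an
    assumption here, not something this page documents.
    """
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()
```

Because the full secret is shown only once, a client would typically load the key from a secret store or environment variable and pass it through a helper like this rather than hard-coding it.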

X-Trace-ID

Pass X-Trace-ID for distributed tracing correlation. DVARA resolves the trace ID in this order: (1) client-supplied X-Trace-ID / X-Trace-Id header, (2) the OTel trace ID from the current span when OpenTelemetry tracing is active, (3) a random UUID. The chosen value lands on the audit event, the SLF4J MDC, and is echoed back as the X-Trace-ID response header.

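# Python (OpenAI SDK)
from openai import OpenAI

# Point the SDK at the DVARA gateway instead of the provider.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-dvara-api-key")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Trace-ID": "my-trace-123"},
)

// TypeScript (OpenAI SDK)
import OpenAI from "openai";

// Point the SDK at the DVARA gateway instead of the provider.
const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-dvara-api-key",
});

const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
  },
  { headers: { "X-Trace-ID": "my-trace-123" } },
);

// Java (LangChain4j)
import dev.langchain4j.model.openai.OpenAiChatModel;
import java.util.Map;

OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .customHeaders(Map.of("X-Trace-ID", "my-trace-123"))
    .build();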

The same trace ID will appear on:

  • The GATEWAY_RESPONSE audit event (one per request through the data plane) and any other event types fired downstream by policy, PII, guardrail, MCP, or admin paths. The trace_id is recorded on the access log line surrounding each event, so a single grep on the trace ID reconnects every layer
  • The X-Trace-ID response header DVARA sends back to you
  • The OTLP span emitted for the request (when tracing is enabled)
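
The three-step resolution order described for X-Trace-ID can be sketched as a small function. This models the documented precedence for illustration only and is not DVARA's implementation; the function name resolve_trace_id, the otel_trace_id parameter, and the choice to treat an empty or whitespace-only header as absent are assumptions.

```python
import uuid
from typing import Mapping, Optional


def resolve_trace_id(
    headers: Mapping[str, str],
    otel_trace_id: Optional[str] = None,
) -> str:
    """Pick a trace ID using the documented precedence.

    1. client-supplied X-Trace-ID header
    2. trace ID of the active OpenTelemetry span, if any
    3. a random UUID
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    client_supplied = lowered.get("x-trace-id", "").strip()
    if client_supplied:
        return client_supplied
    if otel_trace_id:
        return otel_trace_id
    return str(uuid.uuid4())
```

For example, resolve_trace_id({"X-Trace-Id": "my-trace-123"}, "abc") returns "my-trace-123", while the same call with an empty header mapping returns "abc".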

X-Session-Id

Pass X-Session-Id for agent session tracking. DVARA groups requests sharing a session ID into a single agentic session, which activates loop detection, approval gates, and session-level audit grouping.

# Python (OpenAI SDK)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Continue our conversation"}],
    extra_headers={"X-Session-Id": "agent-session-456"},
)

// TypeScript (OpenAI SDK)
const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Continue" }],
  },
  { headers: { "X-Session-Id": "agent-session-456" } },
);

// Java (LangChain4j)
OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .customHeaders(Map.of("X-Session-Id", "agent-session-456"))
    .build();

Sessions appear in the DVARA Flightdeck under Agents → Sessions, where you can see loop-detection status, kill a runaway session, or drill into every request the session produced.
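
Since session grouping depends on every request in an agent run carrying the same X-Session-Id, one way to keep that consistent is to create the ID once per run and derive the headers from it. The AgentRun class below is a hypothetical helper, not a DVARA API; the agent-session- prefix and the choice to mint a fresh X-Trace-ID per request are assumptions made for the sketch.

```python
import uuid
from typing import Optional


class AgentRun:
    """Hypothetical helper: one session ID per agent run, one trace ID per request."""

    def __init__(self, session_id: Optional[str] = None) -> None:
        # Reuse a caller-supplied ID (e.g. when resuming a run), else mint one.
        self.session_id = session_id or f"agent-session-{uuid.uuid4()}"

    def headers(self) -> dict[str, str]:
        """Headers for the next request in this run."""
        return {
            "X-Session-Id": self.session_id,    # stable across the whole run
            "X-Trace-ID": str(uuid.uuid4()),    # unique per request
        }
```

With the OpenAI Python SDK the result can be passed as extra_headers=run.headers() on each call, so all requests in the run share a session while each remains individually traceable.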

Response headers

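DVARA stamps response headers to signal conditions that a well-behaved client should react to. Most accompany a successful response and flag in-band conditions that don't justify a non-2xx status; the rest add detail to 429 and 503 errors. None of these break the OpenAI wire format; they layer on top.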

| Header | When emitted | What it means |
| --- | --- | --- |
| X-Trace-ID | Always | The trace ID DVARA resolved for the request. Echo it in your application logs to grep across the audit, tracing, and provider-log layers. |
| X-Gateway-Strict-Downgraded: true | response_format: json_schema with strict: true against Anthropic or AWS Bedrock | The underlying provider cannot enforce strict JSON Schema. DVARA still issues the tool-use rewrite, but a non-conforming response will not be rejected upstream; validate on the client if strictness is load-bearing. |
| X-Gateway-Failover-Blocked: capability_mismatch | 503 failover_capability_mismatch | The primary provider failed and no fallback on the route supports the requested capability (e.g. all fallbacks are vision-blind for an image request). Add a capable fallback to the route or change the request. |
| X-License-Warning | License is EXPIRING_SOON (≤30 days) or GRACE_PERIOD (expired, within 14-day grace) | The gateway is still serving traffic but the operator should renew. Surface this to SRE alerting so it doesn't fall to a degraded state. |
| X-Budget-Warning, X-Budget-Remaining-Pct, X-Budget-Remaining-Tokens | Soft budget threshold crossed | Spend has crossed the configured soft-limit-pct. The request still succeeds; the headers let dashboards and clients show "approaching budget" notices ahead of the eventual 402 budget_cap_hard. X-Budget-Warning: true flags the crossing; the two Remaining-* headers carry the live numbers a client UI can render. |
| X-Context-Window-Warning, X-Context-Window-Utilization | Approaching the model's context window | The estimated input tokens are over the per-tenant guardrail.context.warning-threshold-pct (default 70%). At the hard threshold a 400 context_window_exceeded is returned. |
| Retry-After, X-RateLimit-Retry-After-Seconds | 429 rate_limit_exceeded or 429 priority_throttled | Standard rate-limit backoff envelope on every 429. Retry-After is the canonical HTTP header; X-RateLimit-Retry-After-Seconds is the same value in seconds-only for SDKs that don't parse Retry-After. Honor either to avoid hammering the window. |
| X-RateLimit-Tokens-Limit, X-RateLimit-Tokens-Remaining, X-RateLimit-Reset | 429 rate_limit_exceeded when the token-budget branch tripped | Additional headers on the token-budget reject path (not the request-count reject path). The limit + remaining tell the client the per-key token budget and how close it was; X-RateLimit-Reset is the ISO-8601 instant the 60-second sliding window resets. Distinguish a token-budget 429 from a request-count 429 by checking these headers' presence, or by reading error.rate_limit.limited_resource in the body. |
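
A client might react to these headers along the lines of the sketch below. It works on a plain mapping of response headers, so it is independent of any particular HTTP library. The function names, the one-second default backoff, and the decision to skip non-numeric Retry-After values (the HTTP-date form) are assumptions for illustration; only the header names come from the table above.

```python
from typing import Mapping


def _lower(headers: Mapping[str, str]) -> dict[str, str]:
    # HTTP header names are case-insensitive; normalise once.
    return {name.lower(): value for name, value in headers.items()}


def backoff_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Seconds to wait after a 429, preferring the seconds-only header."""
    h = _lower(headers)
    for name in ("x-ratelimit-retry-after-seconds", "retry-after"):
        raw = h.get(name)
        if raw is None:
            continue
        try:
            return max(0.0, float(raw))
        except ValueError:
            continue  # e.g. an HTTP-date Retry-After; not handled in this sketch
    return default


def is_token_budget_429(headers: Mapping[str, str]) -> bool:
    """True when a 429 carries the token-budget headers, False for request-count."""
    h = _lower(headers)
    return "x-ratelimit-tokens-limit" in h or "x-ratelimit-tokens-remaining" in h


def soft_warnings(headers: Mapping[str, str]) -> dict[str, str]:
    """Collect the warning headers that may accompany a successful response."""
    names = (
        "x-gateway-strict-downgraded",
        "x-license-warning",
        "x-budget-warning",
        "x-budget-remaining-pct",
        "x-budget-remaining-tokens",
        "x-context-window-warning",
        "x-context-window-utilization",
    )
    h = _lower(headers)
    return {name: h[name] for name in names if name in h}
```

A retry loop would sleep for backoff_seconds(headers) on any 429, and could use is_token_budget_429 to decide whether shrinking the prompt is more useful than simply waiting.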

The HTTP error envelope ({"error": {"type": "...", "code": "...", "message": "...", ...}}) is documented separately in Error handling.

Why this matters for governance

Trace IDs and session IDs are the join key between:

  1. Your application's structured logs
  2. Your tracing backend (Jaeger, Tempo, Honeycomb)
  3. DVARA's immutable audit trail
  4. The upstream provider's request logs (via propagation)

Without them, when a policy fires a denial at 3 a.m. you know what was blocked but not which customer request triggered it. With them, one grep on the trace ID reconnects every layer of the stack.
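
On the application side, the join only works if the IDs actually reach your own logs. The sketch below writes one JSON log line per model call carrying the trace ID echoed in the X-Trace-ID response header. The function name log_llm_call, the logger name, and the field names (including customer_id) are hypothetical; DVARA does not prescribe a log schema for client applications.

```python
import json
import logging
from typing import Mapping, Optional

logger = logging.getLogger("app.llm")


def log_llm_call(
    response_headers: Mapping[str, str],
    session_id: Optional[str],
    customer_id: str,
) -> str:
    """Emit one structured log line that can be joined on trace_id."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    record = {
        "event": "llm_call",
        "trace_id": lowered.get("x-trace-id"),  # value echoed back by the gateway
        "session_id": session_id,
        "customer_id": customer_id,             # hypothetical application field
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line
```

Searching the application logs for a trace ID taken from a denial then leads straight to the customer request behind it.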

Next steps