DVARA headers — request and response reference

DVARA is an AI governance platform, so every /v1/* request is already audited with a server-generated trace ID. The headers below let you authenticate, correlate the audit trail with your own request identifiers, and read the signals DVARA emits back on the response.

Request headers

Authorization: Bearer gw_…

Required on every data-plane request. The API key is minted in the DVARA Flightdeck (/portal/keys for tenants, /tenants/{id}/keys for operators) and starts with gw_. The full secret is shown exactly once, at creation time; only a SHA-256 hash is persisted server-side. In strict mode (dvara.llm-gateway.data-plane.require-api-key=true), a missing or invalid key returns 401 invalid_api_key. See Data plane authentication for rotation and revocation flows.
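As a minimal client-side sketch, the helper below builds a data-plane request with the bearer header and fails fast when the key lacks the gw_ prefix. The base URL (taken from the Java example on this page), the /chat/completions path, and the helper name build_chat_request are assumptions for illustration, not part of DVARA's documented API.

```python
import json
import urllib.request

# Assumed gateway address for illustration; substitute your own deployment URL.
DVARA_BASE_URL = "http://localhost:8080/v1"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat-completions request."""
    # DVARA keys start with gw_; rejecting other keys locally avoids a 401 round trip.
    if not api_key.startswith("gw_"):
        raise ValueError("expected a DVARA API key starting with 'gw_'")
    return urllib.request.Request(
        url=f"{DVARA_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends it; a rejected key would then surface as urllib.error.HTTPError with code 401.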

X-Trace-ID

Pass X-Trace-ID for distributed tracing correlation. DVARA resolves the trace ID in this order: (1) the client-supplied X-Trace-ID / X-Trace-Id header, (2) the OTel trace ID from the current span when OpenTelemetry tracing is active, (3) a random UUID. The chosen value is recorded on the audit event, placed in the SLF4J MDC, and echoed back as the X-Trace-ID response header.

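# Python (OpenAI SDK)
from openai import OpenAI

# Point the SDK at the DVARA gateway rather than the provider.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-dvara-api-key")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Trace-ID": "my-trace-123"},
)

// TypeScript (OpenAI SDK)
import OpenAI from "openai";

// Point the SDK at the DVARA gateway rather than the provider.
const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-dvara-api-key",
});

const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
  },
  { headers: { "X-Trace-ID": "my-trace-123" } },
);

// Java (LangChain4j)
import dev.langchain4j.model.openai.OpenAiChatModel;
import java.util.Map;

// customHeaders attaches the trace ID to every request this model sends.
OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .customHeaders(Map.of("X-Trace-ID", "my-trace-123"))
    .build();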

The same trace ID will appear on:

  • The GATEWAY_REQUEST and GATEWAY_RESPONSE audit events
  • The X-Trace-ID response header DVARA sends back to you
  • The OTLP span emitted for the request (when tracing is enabled)
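
The resolution order described above can be sketched as a small pure function. This is an illustration of the documented precedence only, not DVARA's implementation; the function name resolve_trace_id and the otel_trace_id parameter (standing in for "the trace ID of the current span") are hypothetical.

```python
import uuid
from typing import Mapping, Optional


def resolve_trace_id(headers: Mapping[str, str], otel_trace_id: Optional[str] = None) -> str:
    """Pick a trace ID: client header first, then the active OTel trace, then a random UUID."""
    # HTTP header names are case-insensitive, so X-Trace-ID and X-Trace-Id both match.
    for name, value in headers.items():
        if name.lower() == "x-trace-id" and value.strip():
            return value.strip()
    # None (or an empty string) here models "tracing is not active".
    if otel_trace_id:
        return otel_trace_id
    # Last resort: a freshly generated random UUID.
    return str(uuid.uuid4())
```

The practical consequence for clients is that a supplied header always wins, so sending your own request identifier is enough to make it the audit trail's trace ID.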

X-Session-Id

Pass X-Session-Id for agent session tracking. DVARA groups requests sharing a session ID into a single agentic session, which activates loop detection, approval gates, and session-level audit grouping.

# Python (OpenAI SDK)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Continue our conversation"}],
    extra_headers={"X-Session-Id": "agent-session-456"},
)

// TypeScript (OpenAI SDK)
const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Continue" }],
  },
  { headers: { "X-Session-Id": "agent-session-456" } },
);

// Java (LangChain4j)
OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .customHeaders(Map.of("X-Session-Id", "agent-session-456"))
    .build();

Sessions appear in the DVARA Flightdeck under Agents → Sessions, where you can see loop-detection status, kill a runaway session, or drill into every request the session produced.
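
Because grouping depends on requests sharing a session ID, every call in one agent run has to carry the same value. One way to arrange that, sketched here under assumptions, is to create the ID once per run and derive the headers from it. The AgentRun class and the agent-session- prefix are illustrative choices, not DVARA requirements.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    """Holds one session ID for the whole run and issues a new trace ID per request."""

    session_id: str = field(default_factory=lambda: f"agent-session-{uuid.uuid4()}")

    def headers(self) -> dict:
        return {
            "X-Session-Id": self.session_id,  # constant across the run
            "X-Trace-ID": str(uuid.uuid4()),  # unique per request
        }
```

With the OpenAI Python SDK, the result can be passed as extra_headers=run.headers() on each call, in the same way as the examples above.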

Response headers

DVARA sets response headers to signal conditions that a well-behaved client should react to. Most accompany a successful response, flagging a condition that does not justify a non-2xx status; a few accompany 429 and 503 errors. None of these break the OpenAI wire format; they layer on top.

| Header | When emitted | What it means |
| --- | --- | --- |
| X-Trace-ID | Always | The trace ID DVARA resolved for the request. Echo it in your application logs to grep across the audit, tracing, and provider-log layers. |
| X-Gateway-Strict-Downgraded: true | response_format: json_schema with strict: true against Anthropic or AWS Bedrock | The underlying provider cannot enforce strict JSON Schema. DVARA still issues the tool-use rewrite, but a non-conforming response will not be rejected upstream, so validate on the client if strictness is load-bearing. |
| X-Gateway-Failover-Blocked: capability_mismatch | 503 failover_capability_mismatch | The primary provider failed and no fallback on the route supports the requested capability (e.g. all fallbacks are vision-blind for an image request). Add a capable fallback to the route or change the request. |
| X-License-Warning | License is EXPIRING_SOON (≤30 days) or GRACE_PERIOD (expired, within 14-day grace) | The gateway is still serving traffic but the operator should renew. Surface this to SRE alerting so it doesn't fall to a degraded state. |
| X-Budget-Warning, X-Budget-Utilization-Pct | Soft budget threshold crossed | Spend has crossed the configured soft-limit-pct. The request still succeeds; the headers let dashboards and clients show "approaching budget" notices ahead of the eventual 402 budget_cap_hard. |
| X-Context-Window-Warning, X-Context-Window-Utilization | Approaching the model's context window | The estimated input tokens are over the per-tenant guardrail.context.warning-threshold-pct (default 70%). At the hard threshold a 400 context_window_exceeded is returned. |
| Retry-After, X-RateLimit-Retry-After-Seconds, X-RateLimit-Reset, X-RateLimit-Remaining | 429 rate_limit_exceeded or 429 priority_throttled | Standard rate-limit backoff envelope. Honor Retry-After (seconds) to avoid hammering the rate-limit window. |
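
A client can check for the advisory headers on each response and forward whatever it finds to logs or alerting. The sketch below assumes nothing about the value formats, which this page does not specify, and passes them through as opaque strings; the function name collect_advisories is hypothetical.

```python
from typing import Mapping

# Advisory headers named in the table above that can accompany a successful response.
ADVISORY_HEADERS = (
    "X-Gateway-Strict-Downgraded",
    "X-License-Warning",
    "X-Budget-Warning",
    "X-Budget-Utilization-Pct",
    "X-Context-Window-Warning",
    "X-Context-Window-Utilization",
)


def collect_advisories(headers: Mapping[str, str]) -> dict:
    """Return the advisory headers present on a response, keyed by canonical name."""
    # Header names are case-insensitive, so compare on lowercased names.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in ADVISORY_HEADERS
        if name.lower() in lowered
    }
```

With the OpenAI Python SDK, response headers are available through client.chat.completions.with_raw_response.create(...), whose result exposes .headers alongside .parse() for the usual completion object.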

The HTTP error envelope ({"error": {"type": "...", "code": "...", "message": "...", ...}}) is documented separately in Error handling.
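
As a rough sketch of handling a non-2xx response, the helpers below read the envelope fields shown above and turn Retry-After into a wait time for 429 responses. The function names and the one-second fallback are assumptions. The sketch also treats Retry-After as a number of seconds, as the table states, and does not handle the HTTP-date form.

```python
import json
from typing import Mapping, Optional


def parse_error(body: str) -> tuple:
    """Extract (type, code, message) from the error envelope; missing fields become None."""
    error = json.loads(body).get("error", {})
    return error.get("type"), error.get("code"), error.get("message")


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Return how long to wait before retrying a 429, or None for any other status."""
    if status != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    try:
        return max(float(raw), 0.0)
    except (TypeError, ValueError):
        # Header absent or not a number of seconds: fall back to the caller's default.
        return default
```

A caller would sleep for the returned delay before retrying, and treat None as "do not retry on rate-limit grounds".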

Why this matters for governance

Trace IDs and session IDs are the join key between:

  1. Your application's structured logs
  2. Your tracing backend (Jaeger, Tempo, Honeycomb)
  3. DVARA's immutable audit trail
  4. The upstream provider's request logs (via propagation)

Without them, when a policy fires a denial at 3 a.m. you know what was blocked but not which customer request triggered it. With them, one grep on the trace ID reconnects every layer of the stack.
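
For that grep to work, the application's own logs have to carry the same identifiers. A minimal sketch, assuming JSON-lines logging and hypothetical field names (trace_id, session_id, outcome):

```python
import json
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")
log = logging.getLogger("app.llm")


def log_llm_call(trace_id: str, session_id: str, outcome: str) -> str:
    """Emit one JSON log line that carries the DVARA trace and session IDs."""
    # trace_id should be the value echoed back in the X-Trace-ID response header.
    line = json.dumps(
        {
            "event": "llm_call",
            "trace_id": trace_id,
            "session_id": session_id,
            "outcome": outcome,
        },
        sort_keys=True,
    )
    log.info(line)
    return line
```

Searching the log store for the trace ID then returns this line alongside the matching audit events and spans.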

Next steps