Version: 1.3.0

DVARA headers — request and response reference

DVARA is an AI governance platform, so every /v1/* request is already audited with a server-generated trace ID. The headers below let you authenticate, correlate the audit trail with your own request identifiers, and read the signals DVARA emits back on the response.

Request headers

`Authorization: Bearer gw_…`

Required on every data-plane request. The API key is minted in the DVARA Flightdeck (/portal/keys for tenants, /tenants/{id}/keys for operators) and starts with gw_. The full secret is shown exactly once, at creation time; only a SHA-256 hash is persisted server-side. In strict mode (dvara.llm-gateway.data-plane.require-api-key=true), a missing or invalid key returns 401 invalid_api_key. See Data plane authentication for rotation and revocation flows.
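A minimal client-side sketch of the rule above, assuming the key is supplied through an environment variable. The variable name DVARA_API_KEY and both helper names are hypothetical and not part of DVARA; the only facts taken from this page are the gw_ prefix and the Bearer scheme. Because the secret is displayed only once, it has to come from wherever you stored it at creation time.

```python
from typing import Mapping


def load_dvara_key(env: Mapping[str, str], var: str = "DVARA_API_KEY") -> str:
    """Read the API key from a mapping such as os.environ.

    DVARA_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    key = env.get(var, "").strip()
    if not key:
        raise ValueError(f"{var} is not set")
    if not key.startswith("gw_"):
        # Catch a pasted provider key locally, before the gateway answers 401.
        raise ValueError(f"{var} does not look like a DVARA key (expected a gw_ prefix)")
    return key


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header sent on every data-plane request."""
    return {"Authorization": f"Bearer {api_key}"}
```

Typical use would be auth_headers(load_dvara_key(os.environ)), merged into whatever headers your HTTP client already sends. The prefix check is only a local sanity check; the gateway remains the authority on whether a key is valid.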

`X-Trace-ID`

Pass X-Trace-ID for distributed tracing correlation. DVARA resolves the trace ID in this order: (1) the client-supplied X-Trace-ID / X-Trace-Id header, (2) the OTel trace ID from the current span when OpenTelemetry tracing is active, (3) a random UUID. The chosen value is recorded on the audit event, placed in the SLF4J MDC, and echoed back as the X-Trace-ID response header.
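The resolution order can be modelled in a few lines. This is an illustrative sketch of the documented precedence, not DVARA's server code: the function name is hypothetical, and treating a blank header value as absent is an assumption.

```python
import uuid
from typing import Mapping, Optional


def resolve_trace_id(
    headers: Mapping[str, str],
    otel_trace_id: Optional[str] = None,
) -> str:
    """Pick a trace ID using the precedence described above."""
    # (1) Client-supplied header. HTTP header names are case-insensitive,
    #     so X-Trace-ID and X-Trace-Id both match here.
    for name, value in headers.items():
        if name.lower() == "x-trace-id" and value.strip():
            return value.strip()
    # (2) The OTel trace ID of the current span, when tracing is active.
    if otel_trace_id:
        return otel_trace_id
    # (3) Fall back to a random UUID.
    return str(uuid.uuid4())
```

The practical consequence is that a client-supplied header always wins, so sending your own request identifier is enough to make it the key that shows up in the audit trail and on the response.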

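# Python (OpenAI SDK)
from openai import OpenAI

# Point the SDK at the DVARA data plane instead of the provider.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-dvara-api-key")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # extra_headers adds per-request HTTP headers to this call only.
    extra_headers={"X-Trace-ID": "my-trace-123"},
)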

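// TypeScript (OpenAI SDK)
import OpenAI from "openai";

// Point the SDK at the DVARA data plane instead of the provider.
const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-dvara-api-key",
});

const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
  },
  // Per-request options: these headers apply to this call only.
  { headers: { "X-Trace-ID": "my-trace-123" } },
);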

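// Java (LangChain4j)
import dev.langchain4j.model.openai.OpenAiChatModel;
import java.util.Map;

// customHeaders applies to every request made through this model instance.
OpenAiChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:8080/v1")
        .apiKey("your-dvara-api-key")
        .modelName("gpt-4o")
        .customHeaders(Map.of("X-Trace-ID", "my-trace-123"))
        .build();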

The same trace ID will appear on:

The GATEWAY_RESPONSE audit event (one per request through the data plane) and any other event types fired downstream by policy, PII, guardrail, MCP, or admin paths. The trace_id is also written to the access log line surrounding each event, so a single grep on the trace ID reconnects every layer
The X-Trace-ID response header DVARA sends back to you
The OTLP span emitted for the request (when tracing is enabled)

`X-Session-Id`

Pass X-Session-Id for agent session tracking. DVARA groups requests sharing a session ID into a single agentic session, which activates loop detection, approval gates, and session-level audit grouping.

# Python (OpenAI SDK)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Continue our conversation"}],
    extra_headers={"X-Session-Id": "agent-session-456"},
)

// TypeScript (OpenAI SDK)
const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Continue" }],
  },
  { headers: { "X-Session-Id": "agent-session-456" } },
);

// Java (LangChain4j)
OpenAiChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:8080/v1")
        .apiKey("your-dvara-api-key")
        .modelName("gpt-4o")
        .customHeaders(Map.of("X-Session-Id", "agent-session-456"))
        .build();

Sessions appear in the DVARA Flightdeck under Agents → Sessions, where you can see loop-detection status, kill a runaway session, or drill into every request the session produced.
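One way to keep every request of an agent run inside the same session is to generate the session ID once and reuse it, while still giving each request its own trace ID. The sketch below assumes that convention; the AgentRun class, the agent-session prefix, and the UUID-based format are illustrative choices, since this page does not prescribe a format for either ID.

```python
import uuid


class AgentRun:
    """Holds one X-Session-Id for the lifetime of a single agent run."""

    def __init__(self, prefix: str = "agent-session") -> None:
        # The prefix and UUID suffix are arbitrary choices for this sketch.
        self.session_id = f"{prefix}-{uuid.uuid4()}"

    def request_headers(self) -> dict[str, str]:
        """Headers for the next request: same session, fresh trace ID."""
        return {
            "X-Session-Id": self.session_id,
            "X-Trace-ID": str(uuid.uuid4()),
        }
```

With the OpenAI Python SDK, each call in the run would pass extra_headers=run.request_headers(). Starting a new AgentRun for each task keeps unrelated work out of the same session-level grouping.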

Response headers

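DVARA stamps response headers to signal conditions that a well-behaved client should react to. Most accompany a successful response, where the condition doesn't justify a non-2xx status; the failover and rate-limit headers accompany the 503 and 429 responses listed below. None of these break the OpenAI wire format; they layer on top.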

Header	When emitted	What it means
`X-Trace-ID`	Always	The trace ID DVARA resolved for the request — echo it in your application logs to grep across the audit, tracing, and provider-log layers.
`X-Gateway-Strict-Downgraded: true`	`response_format: json_schema` with `strict: true` against Anthropic or AWS Bedrock	The underlying provider cannot enforce strict JSON Schema. DVARA still issues the tool-use rewrite, but a non-conforming response will not be rejected upstream — validate on the client if strictness is load-bearing.
`X-Gateway-Failover-Blocked: capability_mismatch`	`503 failover_capability_mismatch`	The primary provider failed and no fallback on the route supports the requested capability (e.g. all fallbacks are vision-blind for an image request). Add a capable fallback to the route or change the request.
`X-License-Warning`	License is `EXPIRING_SOON` (≤30 days) or `GRACE_PERIOD` (expired, within 14-day grace)	The gateway is still serving traffic, but the operator should renew. Surface this to SRE alerting so the gateway doesn't fall into a degraded state.
`X-Budget-Warning`, `X-Budget-Remaining-Pct`, `X-Budget-Remaining-Tokens`	Soft budget threshold crossed	Spend has crossed the configured `soft-limit-pct`. The request still succeeds; the headers let dashboards and clients show "approaching budget" notices ahead of the eventual `402 budget_cap_hard`. `X-Budget-Warning: true` flags the crossing; the two `Remaining-*` headers carry the live numbers a client UI can render.
`X-Context-Window-Warning`, `X-Context-Window-Utilization`	Approaching the model's context window	The estimated input tokens are over the per-tenant `guardrail.context.warning-threshold-pct` (default 70%). At the hard threshold a `400 context_window_exceeded` is returned.
`Retry-After`, `X-RateLimit-Retry-After-Seconds`	`429 rate_limit_exceeded` or `429 priority_throttled`	Standard rate-limit backoff envelope on every 429. `Retry-After` is the canonical HTTP header; `X-RateLimit-Retry-After-Seconds` carries the same value as a plain number of seconds, for SDKs that don't parse `Retry-After`. Honor either to avoid hammering the window.
`X-RateLimit-Tokens-Limit`, `X-RateLimit-Tokens-Remaining`, `X-RateLimit-Reset`	`429 rate_limit_exceeded` when the token-budget branch tripped	Additional headers on the token-budget reject path (not the request-count reject path). The limit and remaining values tell the client the per-key token budget and how much of it was left; `X-RateLimit-Reset` is the ISO-8601 instant at which the 60-second sliding window resets. Distinguish a token-budget 429 from a request-count 429 by checking for the presence of these headers, or by reading `error.rate_limit.limited_resource` in the body.
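A client can act on the table above with a few small helpers. The sketch below works on any mapping of response headers; the function names and the 1-second default delay are assumptions made for illustration, and only the header names come from this page. An HTTP-date in Retry-After is not parsed here; the seconds-only header is tried first for that reason.

```python
from typing import Mapping, Optional

# Headers from the table that accompany a response the client can still use.
WARNING_HEADERS = (
    "X-Gateway-Strict-Downgraded",
    "X-License-Warning",
    "X-Budget-Warning",
    "X-Budget-Remaining-Pct",
    "X-Budget-Remaining-Tokens",
    "X-Context-Window-Warning",
    "X-Context-Window-Utilization",
)


def get_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive lookup, since HTTP header names ignore case."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


def soft_warnings(headers: Mapping[str, str]) -> dict[str, str]:
    """Return whichever warning headers are present on the response."""
    found = {}
    for name in WARNING_HEADERS:
        value = get_header(headers, name)
        if value is not None:
            found[name] = value
    return found


def retry_delay_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Seconds to wait after a 429; `default` is an arbitrary fallback."""
    for name in ("X-RateLimit-Retry-After-Seconds", "Retry-After"):
        raw = get_header(headers, name)
        if raw is None:
            continue
        try:
            return max(0.0, float(raw))
        except ValueError:
            continue  # not a plain number, e.g. an HTTP-date
    return default


def is_token_budget_429(headers: Mapping[str, str]) -> bool:
    """True when the 429 carries the token-budget headers."""
    return get_header(headers, "X-RateLimit-Tokens-Limit") is not None
```

A retry loop would sleep for retry_delay_seconds(response.headers) before the next attempt, and could use is_token_budget_429 to decide whether shrinking the prompt is more useful than simply waiting.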

The HTTP error envelope ({"error": {"type": "...", "code": "...", "message": "...", ...}}) is documented separately in Error handling.

Why this matters for governance

Trace IDs and session IDs are the join key between:

Your application's structured logs
Your tracing backend (Jaeger, Tempo, Honeycomb)
DVARA's immutable audit trail
The upstream provider's request logs (via propagation)

Without them, when a policy fires a denial at 3 a.m. you know what was blocked but not which customer request triggered it. With them, one grep on the trace ID reconnects every layer of the stack.

Next steps

Back to the language-agnostic integration overview.
Looking for examples by language? See Python, JavaScript / TypeScript, or Java.
Debugging a specific response code? See Troubleshooting.
