
Authentication

Every /v1/* call on the Data Plane requires a tenant API key in the Authorization header. The key is a Bearer token with the gw_ prefix:

Authorization: Bearer gw_<your-key>

The gateway resolves the key to a tenant server-side on every request — you never pass a tenant ID directly. All downstream pipeline stages (rate limits, policies, PII enforcement, budget caps, audit) apply under that tenant's configuration.
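As a minimal sketch of attaching the header from a client, the Python snippet below builds (but does not send) a data plane request with the standard library. The base URL, port 8080, the GATEWAY_API_KEY environment variable, and the request body are assumptions for illustration; only the Authorization: Bearer gw_... format and the /v1/chat/completions path come from this page.

```python
import json
import os
import urllib.request


def build_auth_headers(api_key: str) -> dict:
    """Return headers for a data plane call, rejecting keys without the gw_ prefix."""
    if not api_key.startswith("gw_"):
        raise ValueError("data plane keys must start with 'gw_'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def build_request(base_url: str, path: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request; no tenant ID is sent, the gateway derives it from the key."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers=build_auth_headers(api_key),
        method="POST",
    )


if __name__ == "__main__":
    # GATEWAY_API_KEY and the localhost URL are placeholders, not documented names.
    key = os.environ.get("GATEWAY_API_KEY", "gw_example")
    req = build_request("http://localhost:8080", "/v1/chat/completions", key, {"messages": []})
    print(req.get_method(), req.full_url)
```

Rejecting a key without the gw_ prefix on the client side catches the most common misconfiguration (an empty or wrong secret) before the request leaves the service.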

The only endpoint you can call without an API key is GET /actuator/health; point liveness probes and uptime monitors at it. The previous bare GET /status shim was removed because it duplicated /actuator/gateway-status. That endpoint is now authenticated with a separate shared-secret Bearer token (DVARA_ACTUATOR_API_KEY) and is intended for operator tooling, not for client SDKs.

Create an API key

From the DVARA Flightdeck, open Tenants → your tenant → API keys → New key, give the key a human-readable name and a scope, and copy the plaintext gw_... value immediately — the full secret is only shown once at creation time. Only a SHA-256 hash is stored in the database, so a lost key cannot be recovered, only replaced.
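To illustrate why a lost key can only be replaced, here is a small sketch of hash-only storage. It assumes a plain, unsalted SHA-256 hex digest over the UTF-8 key; the gateway's exact encoding is not stated on this page, so treat hash_key and matches as hypothetical helpers.

```python
import hashlib
import hmac


def hash_key(plaintext: str) -> str:
    """Hex SHA-256 digest of the key; an assumed encoding, not the gateway's confirmed one."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def matches(presented: str, stored_hash: str) -> bool:
    """Compare a presented key against the stored digest in constant time."""
    return hmac.compare_digest(hash_key(presented), stored_hash)


stored = hash_key("gw_example_secret")       # what the database would keep
print(matches("gw_example_secret", stored))  # the original key verifies
print(matches("gw_wrong_secret", stored))    # any other value does not
```

The digest is enough to verify a presented key, but it cannot be reversed to obtain the plaintext, which is why the secret has to be copied at creation time.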

To create a key through the Automation API instead, send a POST request to the DVARA Flightdeck on port 8090:

curl -s http://localhost:8090/v1/admin/tenants/acme/keys \
  -H "Authorization: Bearer dvara_pat_<your-pat>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-backend",
    "scopes": ["completions:write"]
  }'

The tenant id goes in the path (/v1/admin/tenants/{tenantId}/keys), not the body. The response returns the plaintext key field exactly once:

{
  "id": "7c4e6d2a-1b3f-4a8c-9d10-abcdef123456",
  "name": "production-backend",
  "key_prefix": "gw_live_abc1",
  "key": "gw_live_abc1...full-secret-hidden-in-docs...xyz",
  "scopes": ["completions:write"],
  "status": "ACTIVE",
  "created_at": "2026-04-14T10:00:00Z"
}

Write the plaintext into your secret store — a vault, a Kubernetes Secret, an SSM parameter, or an environment variable on the calling service. Every subsequent GET on the same key returns only key_prefix (the first ~12 characters) — the full value is gone.
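One way to handle the creation response, sketched in Python: pull the one-time key field out immediately and keep only the remaining metadata for logs. The split_created_key helper and the placeholder secret value are illustrative; the field names match the sample response above.

```python
import json


def split_created_key(response_body: str) -> tuple:
    """Separate the one-time plaintext from the metadata that is safe to log."""
    record = json.loads(response_body)
    secret = record.pop("key")  # raises KeyError if this is not a creation response
    return secret, record


sample = """{
  "id": "7c4e6d2a-1b3f-4a8c-9d10-abcdef123456",
  "name": "production-backend",
  "key_prefix": "gw_live_abc1",
  "key": "gw_live_abc1-placeholder-secret",
  "scopes": ["completions:write"],
  "status": "ACTIVE",
  "created_at": "2026-04-14T10:00:00Z"
}"""

secret, metadata = split_created_key(sample)
# Hand `secret` to the secret store here; only `metadata` should reach logs.
print(metadata["key_prefix"], metadata["status"])
```

Keeping key_prefix in the metadata lets you later match traffic or audit entries to the key without ever logging the secret itself.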

Scopes — labels, not enforcement (today)

API keys carry a scopes array (completions:write is the default applied at creation) that is recorded in audit events but is not enforced at the request path on the 1.0.0-GA data plane. Any active key on a tenant can call any /v1/* endpoint that the tenant is provisioned for; the scope value is a label your operators can use to filter the audit log, not a security boundary.

Treat the scopes field as a forward-compatible hook: setting it correctly today (e.g. ["completions:write"] on app-server keys, ["embeddings:write"] on indexing-pipeline keys) means scope-aware enforcement, when it ships, will not require a key rotation. Until then, scope down by issuing separate keys per workload so revocation gives you the same blast-radius control.
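A sketch of the one-key-per-workload approach: the snippet builds a separate create-key request body for each workload, using the name and scopes fields from the creation example. The workload names and the naming pattern are hypothetical.

```python
import json

# Hypothetical workload names; the scope strings are the two mentioned above.
WORKLOAD_SCOPES = {
    "app-server": ["completions:write"],
    "indexing-pipeline": ["embeddings:write"],
}


def key_payloads(environment: str) -> dict:
    """Build one create-key request body per workload so each can be revoked alone."""
    return {
        workload: {"name": f"{environment}-{workload}", "scopes": scopes}
        for workload, scopes in WORKLOAD_SCOPES.items()
    }


for workload, body in key_payloads("production").items():
    print(workload, json.dumps(body))
```

Each body would be sent in its own POST to /v1/admin/tenants/{tenantId}/keys, so revoking the indexing key leaves the app-server key untouched.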

Rotate a key

Rotation is always a create-then-revoke flow, never an in-place update. The new key's secret is different from the old one, so the calling service has to read the new plaintext once before you revoke the old key.

  1. Create a new key under the same tenant (the same {tenantId} in the path), with the same scopes and a new name (production-backend-2). Copy the plaintext.
  2. Roll the calling service forward onto the new key — update secrets, restart the service, verify traffic is landing under the new key_prefix in the access log or metrics.
  3. Revoke the old key from the DVARA Flightdeck or via DELETE /v1/admin/tenants/{tenantId}/keys/{keyId} (returns 204). The key's status flips to REVOKED and the cached api-keys-hash entry across every gateway node is evicted within seconds, so the next request on the old key returns 401.

A key that stays in REVOKED state for longer than the configured retention window is hard-deleted by the background cleanup job; the id remains in audit events but the hash is gone.
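The three rotation steps can be sketched as a single function. The transport callable is a hypothetical stand-in for an HTTP client that calls the Automation API with a dvara_pat_ token, and roll_forward is whatever your deployment uses to update secrets and confirm traffic; neither is part of the product. The paths and the 204 on revocation follow the steps above; the success status for creation is not documented here, so any status below 300 is accepted.

```python
from typing import Callable, Optional

# transport(method, path, body) -> (status_code, parsed_json_or_None)
Transport = Callable[[str, str, Optional[dict]], tuple]


def rotate_key(transport: Transport, tenant_id: str, old_key_id: str,
               new_name: str, scopes: list,
               roll_forward: Callable[[str], bool]) -> str:
    """Create a replacement key, roll the service onto it, then revoke the old key."""
    base = f"/v1/admin/tenants/{tenant_id}/keys"

    # Step 1: create the new key and capture the one-time plaintext.
    status, created = transport("POST", base, {"name": new_name, "scopes": scopes})
    if status >= 300 or not created or "key" not in created:
        raise RuntimeError(f"key creation failed with HTTP {status}")

    # Step 2: update secrets, restart, verify traffic; supplied by the caller.
    if not roll_forward(created["key"]):
        raise RuntimeError("service did not move to the new key; old key left active")

    # Step 3: revoke the old key only after the rollout is confirmed.
    status, _ = transport("DELETE", f"{base}/{old_key_id}", None)
    if status != 204:
        raise RuntimeError(f"revocation failed with HTTP {status}")
    return created["id"]
```

Revocation is deliberately the last step: if the rollout check fails, the function stops with the old key still active, so the calling service keeps working.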

Handling authentication errors

| Error code | HTTP | Meaning | How to fix |
| --- | --- | --- | --- |
| UNAUTHORIZED | 401 | Missing or malformed Authorization header, or a revoked key | Verify the calling service is reading the current secret and the gw_ prefix is intact |
| RATE_LIMIT_EXCEEDED | 429 | The per-key rate limit has been hit | Check the X-RateLimit-Reset response header; back off, then retry |

The error body follows the same {"error": {...}} envelope used by the rest of the data plane, so you can surface the code in your logs without parsing HTML error pages.
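A client-side sketch of acting on these two errors. It assumes the envelope carries a code field inside error, that is {"error": {"code": ...}}, and that X-RateLimit-Reset is a Unix timestamp in seconds; neither detail is confirmed on this page, so adjust if your gateway sends a different shape or a delta in seconds.

```python
import json
import time
from typing import Mapping, Optional


def classify_error(status: int, body: str, headers: Mapping[str, str],
                   now: Optional[float] = None) -> dict:
    """Turn a data plane error response into an action for the caller."""
    now = time.time() if now is None else now
    try:
        code = json.loads(body)["error"]["code"]  # assumed field name
    except (ValueError, KeyError, TypeError):
        code = None  # not the expected envelope; fall back on the HTTP status

    if status == 401 or code == "UNAUTHORIZED":
        # Retrying with the same key cannot succeed; the secret needs attention.
        return {"action": "fix_credentials", "code": code, "wait_seconds": None}
    if status == 429 or code == "RATE_LIMIT_EXCEEDED":
        reset = headers.get("X-RateLimit-Reset")
        try:
            wait = max(0.0, float(reset) - now)  # assumes an epoch timestamp
        except (TypeError, ValueError):
            wait = 1.0  # header missing or unparsable: conservative default
        return {"action": "retry_after", "code": code, "wait_seconds": wait}
    return {"action": "raise", "code": code, "wait_seconds": None}
```

The split matters because a 401 will not clear on its own, while a 429 is expected to clear once the reset time passes.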

Personal access tokens for automation

The Automation API on port 8090 uses a different authentication scheme: personal access tokens (PATs) issued per user, with the dvara_pat_ prefix. PATs are not valid on the data plane, and data plane gw_ keys are not valid on the Automation API. If you see 401 on /v1/admin/* while the same token works on /v1/chat/completions, you are probably pointing a gw_ key at the DVARA Flightdeck. The reverse symptom, a 401 on /v1/chat/completions with a token that works on /v1/admin/*, means a PAT is being sent to the data plane.
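A quick pre-flight check along these lines can catch the mix-up before a request is sent. The helper names are hypothetical, and the check uses the /v1/admin/ path prefix as a simplification; in practice the two APIs are also separated by host and port.

```python
from typing import Optional


def token_kind(token: str) -> str:
    """Classify a bearer token by its documented prefix."""
    if token.startswith("dvara_pat_"):
        return "pat"
    if token.startswith("gw_"):
        return "api_key"
    return "unknown"


def mismatch_reason(token: str, path: str) -> Optional[str]:
    """Explain a likely 401 cause, or return None when token and path agree."""
    kind = token_kind(token)
    is_admin = path.startswith("/v1/admin/")
    if kind == "unknown":
        return "token has neither the gw_ nor the dvara_pat_ prefix"
    if is_admin and kind == "api_key":
        return "gw_ key sent to the Automation API; use a dvara_pat_ token"
    if not is_admin and kind == "pat":
        return "dvara_pat_ token sent to the data plane; use a gw_ key"
    return None
```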