Structured Outputs

DVARA supports OpenAI-compatible response_format on POST /v1/chat/completions. Clients send a single format specification and the gateway transparently translates it to each provider's native mechanism.

Supported Formats

| type | Description |
|---|---|
| text | Default. No constraint on the response format. |
| json_object | Instructs the model to return valid JSON. |
| json_schema | Instructs the model to return JSON conforming to a specific JSON Schema. |

json_object Mode

Request the model to return valid JSON:

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "List 3 colors as a JSON object with a colors array."}
    ],
    "response_format": {"type": "json_object"}
  }'

json_schema Mode

Request the model to return JSON conforming to a specific schema:

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Extract the person name and age from: John is 30 years old."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"}
          },
          "required": ["name", "age"],
          "additionalProperties": false
        },
        "strict": true
      }
    }
  }'
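The two request shapes above can also be assembled programmatically. The following is a minimal sketch, not part of any DVARA client library: the helper names `json_object_format`, `json_schema_format`, and `chat_body` are hypothetical, and the code only builds the JSON body without sending it.

```python
import json


def json_object_format() -> dict:
    """response_format for plain JSON mode."""
    return {"type": "json_object"}


def json_schema_format(name: str, schema: dict, strict: bool = True) -> dict:
    """response_format for schema-constrained output."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }


def chat_body(model: str, prompt: str, response_format: dict) -> dict:
    """Assemble the JSON body for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": response_format,
    }


person_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
    # OpenAI's strict mode requires this on every object in the schema.
    "additionalProperties": False,
}

body = chat_body(
    "gpt-4o",
    "Extract the person name and age from: John is 30 years old.",
    json_schema_format("person", person_schema),
)
print(json.dumps(body, indent=2))
```

Keeping the schema in one Python object means the same dict can be reused later for client-side validation of the response.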

Inline schemas vs. server-side validation

DVARA has two independent mechanisms for working with structured outputs. They solve different problems and you can use them together — they are not alternatives:

| | Inline response_format | Server-side schema validation |
|---|---|---|
| What it does | Sent to the upstream provider as a generation constraint (constrained decoding on OpenAI / Azure / Mistral, tool-use rewrite on Anthropic / Bedrock, responseSchema on Gemini). | Validates the provider's response against a registered JSON schema after the call returns. Independent of how the request was sent. |
| Where the schema lives | In the request body. The caller decides per-call. | Registered with the gateway via the Admin API or the Schemas page in the DVARA Flightdeck. Auto-applied to every request whose model matches the schema's modelPattern glob. |
| What happens on mismatch | Provider's own behaviour: typically the model retries internally or returns a closest-match string. | Gateway returns HTTP 422 schema_validation_failed; the response body lists the JSON-schema validation errors. |
| When it runs | Before the upstream call (request rewrite). | After the upstream call (response gate). |
| Best for | Single call sites where the schema is part of the application code. | Centrally enforced contracts the gateway should police regardless of caller, e.g. ensuring every response from a customer-facing model conforms to a published schema. |

The inline response_format example earlier in this page covers the first mechanism. The rest of this section describes the second.

Server-side validation: register a schema once, gateway validates every matching response

# 1. Register the schema. Scope it with `modelPattern` (a glob against the
#    request's `model` field) OR `routeId` (bind to a specific configured
#    route), or both. At least one is required: the API rejects a schema
#    with neither scope with HTTP 400 `invalid_output_schema_scope`.
curl -s -X POST http://localhost:8090/v1/admin/schemas \
  -H "Authorization: Bearer <admin-pat>" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "person-v1",
    "modelPattern": "gpt-4o*",
    "schema": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
      },
      "required": ["name", "age"]
    },
    "maxRetries": 2,
    "enabled": true
  }'

# 2. Make a normal chat completion. No special syntax in the request body.
#    The gateway sees `model: gpt-4o`, finds the matching schema, and
#    validates the response after the upstream call returns.
curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "John is 30."}]
  }'

A response that conforms to the schema returns normally. A response that doesn't returns HTTP 422:

{
  "error": {
    "code": "schema_validation_failed",
    "message": "Output schema validation failed: $.age: integer expected, found string"
  }
}

| Field | Required | Description |
|---|---|---|
| id | yes | Stable identifier; referenced in error messages and used as the cache key for the compiled schema. |
| modelPattern | one of modelPattern / routeId is required | Glob against the request's model (e.g. gpt-4o*, claude-*). The schema applies to every response whose model matches. |
| routeId | one of modelPattern / routeId is required | Bind the schema to a specific configured route (matched against routes.id). Use this when the policy is "every response coming out of route X must validate against this schema" regardless of which model the request landed on. Combinable with modelPattern for both-must-match scoping. |
| schema | yes | A JSON Schema Draft 7 document. Compiled once per schema id and cached. |
| maxRetries | no (default 2) | Reserved for a forthcoming retry-on-mismatch flow; currently the gateway returns 422 immediately on validation failure. |
| correctionPrompt | no | Reserved for the same retry flow. |
| enabled | no (default true) | Soft-disable a schema without deleting it. |

Creating a schema with neither routeId nor modelPattern set fails fast with HTTP 400 invalid_output_schema_scope, so a schema with no scope cannot sit in the registry and silently never match a request.
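The scoping rules can be mirrored client-side before a registration is submitted. This is a rough sketch under stated assumptions: `check_scope` and `schema_applies` are hypothetical helpers, and Python's `fnmatchcase` stands in for the gateway's glob matcher, whose exact dialect is not specified here (for example, `fnmatch` also treats `?` and `[...]` as wildcards).

```python
from fnmatch import fnmatchcase
from typing import Optional


def check_scope(registration: dict) -> Optional[str]:
    """Return the documented error code if no scope is set, else None."""
    if not registration.get("modelPattern") and not registration.get("routeId"):
        return "invalid_output_schema_scope"
    return None


def schema_applies(registration: dict, model: str,
                   route_id: Optional[str] = None) -> bool:
    """Both-must-match scoping: every scope that is set has to match."""
    if check_scope(registration) is not None:
        return False
    if not registration.get("enabled", True):  # soft-disabled schema
        return False
    pattern = registration.get("modelPattern")
    if pattern and not fnmatchcase(model, pattern):
        return False
    bound_route = registration.get("routeId")
    if bound_route and bound_route != route_id:
        return False
    return True


person_v1 = {"id": "person-v1", "modelPattern": "gpt-4o*"}
print(schema_applies(person_v1, "gpt-4o-mini"))
print(schema_applies(person_v1, "claude-sonnet-4-5"))
```

A check like this is only a convenience for catching scope mistakes early; the Admin API remains the authority on what is accepted.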

Registered schemas can be managed from the Schemas page in the DVARA Flightdeck or through the Admin API. Both edit the same registry — pick whichever fits your workflow.

When to use which:

  • Most callers should use inline response_format — the schema lives next to the call site, no extra config to keep in sync.
  • Reach for server-side validation when the policy needs to be enforced regardless of caller (e.g. a published API contract you want the gateway to police), or when you want JSON-schema validation on responses from providers that don't support response_format natively (Ollama, Cohere) — the validation runs after the upstream call, so it works on any provider.

Provider Translation

The gateway translates response_format to each provider's native mechanism. Your application always sends the same OpenAI-format request.

| Provider | json_object | json_schema | strict Support |
|---|---|---|---|
| OpenAI | Native passthrough | Native passthrough | Native |
| Azure OpenAI | Native passthrough | Native passthrough | Native |
| Mistral | Native passthrough | Native passthrough | Native |
| Grok | Native passthrough | Native passthrough | Native |
| Anthropic | System prompt injection | Tool-use rewrite (structured_output tool) | Downgraded |
| Gemini | generationConfig.responseMimeType set to "application/json" | generationConfig.responseMimeType + responseSchema | Native * |
| Bedrock | System message injection | toolConfig rewrite (structured_output tool) | Downgraded |
| DeepSeek | Native passthrough | Filtered out by capability-aware routing | N/A |
| Moonshot | Native passthrough | Filtered out by capability-aware routing | N/A |
| ChatGLM | Native passthrough | Filtered out by capability-aware routing | N/A |
| Groq | Native passthrough | Filtered out by capability-aware routing | N/A |
| Qwen | Filtered out by capability-aware routing | Filtered out by capability-aware routing | N/A |
| Cohere | Filtered out by capability-aware routing | Filtered out by capability-aware routing | N/A |
| Ollama | Filtered out by capability-aware routing | Filtered out by capability-aware routing | N/A |
| Mock | Wraps response in {"result": …} | Wraps response in {"result": …} | N/A |

Gemini and the strict flag

Gemini's API doesn't accept a strict field directly, but responseSchema always enforces structure when json_schema is used. So although the gateway doesn't forward strict: true to Gemini, the net behaviour is the same as a natively-strict provider — schema conformance is enforced by the upstream.
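As an illustration of the Gemini row, the translation might look roughly like the sketch below. It is not the gateway's actual code: `to_gemini_generation_config` is a hypothetical name, and the sketch passes the client's schema through unchanged, whereas a real translation may need to adapt it to the schema subset Gemini accepts.

```python
def to_gemini_generation_config(response_format: dict) -> dict:
    """Map an OpenAI-style response_format to Gemini generationConfig fields."""
    rf_type = response_format.get("type", "text")
    if rf_type == "text":
        return {}  # no constraint requested
    config = {"responseMimeType": "application/json"}
    if rf_type == "json_schema":
        # `strict` is deliberately not forwarded: Gemini has no such field,
        # and responseSchema enforces structure on its own.
        config["responseSchema"] = response_format["json_schema"]["schema"]
    return config


schema = {"type": "object", "properties": {"name": {"type": "string"}}}
print(to_gemini_generation_config({"type": "json_object"}))
print(to_gemini_generation_config(
    {"type": "json_schema",
     "json_schema": {"name": "person", "schema": schema, "strict": True}}
))
```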

OpenAI-compatible group (OpenAI, Azure OpenAI, Mistral, Grok) shares the same pass-through code path: every field of the client's response_format — including the name, schema, and strict fields of json_schema — is forwarded verbatim in the upstream request body. Use any of these four without special handling on the client side.

The mechanism-rewrite group (Anthropic, Bedrock) does not speak response_format natively, so the gateway rewrites the request into each provider's own structured-output dialect. For json_object, the gateway injects a system-prompt instruction on both providers:

Respond with valid JSON only. Do not include any text outside the JSON object.

Anthropic appends this instruction (with a leading blank-line break) to the existing system prompt; Bedrock adds the trimmed text as a fresh entry in the system messages list. Either way, the client never sees the rewrite — the assistant response arrives with a normal content field.
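The two injection styles described above can be sketched as follows. The function names are hypothetical, the handling of a missing system prompt is an assumption, and the Bedrock shape assumes the Converse API's list of `{"text": ...}` system blocks.

```python
from typing import List, Optional

JSON_INSTRUCTION = (
    "Respond with valid JSON only. "
    "Do not include any text outside the JSON object."
)


def inject_anthropic(system: Optional[str]) -> str:
    """Append the instruction to the system prompt after a blank line."""
    if not system:
        # Assumption: with no existing prompt, the instruction stands alone.
        return JSON_INSTRUCTION
    return system + "\n\n" + JSON_INSTRUCTION


def inject_bedrock(system: Optional[List[dict]]) -> List[dict]:
    """Add the instruction as a fresh entry in the system messages list."""
    return list(system or []) + [{"text": JSON_INSTRUCTION}]


print(inject_anthropic("You are terse."))
print(inject_bedrock([{"text": "You are terse."}]))
```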

json_schema rewrite on Anthropic and Bedrock — the gateway registers a tool named structured_output with input_schema set to the client's schema, sets tool_choice to force the model to call that specific tool, and extracts the tool-use response block back into a plain content text field. The client still receives a standard chat completion envelope; the tool-use round-trip is internal.
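For the Anthropic side, the round-trip might be sketched like this. It is an illustration of the technique rather than the gateway's implementation: the helper names and the tool description text are hypothetical, while the `tools`, `input_schema`, `tool_choice`, and `tool_use` shapes follow Anthropic's Messages API. Bedrock uses a different envelope (toolConfig) that is not shown.

```python
import json

TOOL_NAME = "structured_output"


def rewrite_for_anthropic(response_format: dict) -> dict:
    """Turn a json_schema response_format into forced tool use."""
    schema = response_format["json_schema"]["schema"]
    return {
        "tools": [{
            "name": TOOL_NAME,
            "description": "Return the answer as structured data.",  # hypothetical wording
            "input_schema": schema,
        }],
        # Force the model to call this specific tool.
        "tool_choice": {"type": "tool", "name": TOOL_NAME},
    }


def extract_content(anthropic_message: dict) -> str:
    """Flatten the tool_use block back into a plain JSON text string."""
    for block in anthropic_message.get("content", []):
        if block.get("type") == "tool_use" and block.get("name") == TOOL_NAME:
            return json.dumps(block["input"])
    raise ValueError("no structured_output tool_use block in response")


sample_reply = {
    "content": [
        {"type": "tool_use", "id": "toolu_example", "name": TOOL_NAME,
         "input": {"name": "John", "age": 30}}
    ]
}
print(extract_content(sample_reply))
```

The string returned by `extract_content` is what would end up in the content field of the chat completion envelope the client receives.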

The partial-support group (DeepSeek, Moonshot, ChatGLM, Groq) accepts json_object natively but doesn't support json_schema. For a request with response_format: json_schema, these providers are filtered out of the candidate pool before any upstream call.

Unsupported providers (Qwen, Cohere, Ollama) are filtered out of the candidate pool for both json_object and json_schema. On a route that contains only unsupported providers, the request fails fast with no_capable_provider (HTTP 400). On a mixed route, a capable provider is selected automatically. The filtering is deterministic, so you won't accidentally hit a provider that can't honor the format.

Strict Mode Downgrade

strict: true on a json_schema request asks the provider to use constrained decoding — output is guaranteed to conform to the schema. Not every provider supports it. When the request lands on one that doesn't, the gateway downgrades to best-effort and tells you about it.

| Provider | What happens with strict: true |
|---|---|
| OpenAI, Azure OpenAI, Mistral, Grok | Forwarded verbatim. Constrained decoding is on; the upstream guarantees the shape. |
| Gemini | responseSchema always enforces structure. Net behaviour matches strict mode (see the note above). |
| Anthropic, Bedrock | Downgraded. No native constrained decoding; the gateway uses the structured_output tool-use rewrite, which is best-effort. The model usually obeys the schema, but loose type coercion or a missing required field is possible. |

How to detect a downgrade

When the request lands on Anthropic or Bedrock with strict: true, the response carries:

X-Gateway-Strict-Downgraded: true

The header is set by the gateway's response handler for the upstream provider and lifted onto the HTTP response, so you check it the same way no matter which provider actually handled the call. If the header is absent on a strict: true request, the upstream guaranteed strict and no client-side check is needed.

Example: validate client-side when downgraded

import json
import os

import jsonschema
import requests

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}

r = requests.post(
    "http://localhost:8080/v1/chat/completions",
    # Read the key from the environment; Python does not expand "$VAR" in strings.
    headers={"Authorization": f"Bearer {os.environ['DVARA_API_KEY']}"},
    json={
        "model": "claude-sonnet-4-5",
        "messages": [{"role": "user", "content": "John is 30 years old."}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "person", "schema": schema, "strict": True},
        },
    },
    timeout=60,
)
# Surface gateway errors (400, 422, 503, ...) before touching the body.
r.raise_for_status()

content = r.json()["choices"][0]["message"]["content"]

# If the gateway downgraded strict, validate the body ourselves so we
# fail loudly instead of letting a malformed response leak downstream.
if r.headers.get("X-Gateway-Strict-Downgraded") == "true":
    jsonschema.validate(json.loads(content), schema)

# Otherwise the upstream guaranteed schema conformance, so no extra check is needed.

When strict really matters, route around the downgrade

If the response feeds a downstream pipeline that breaks on bad shape (a structured ETL step, a regulated audit record, a contract with a partner), don't rely on the downgraded best-effort path. Pin the route to a natively-strict provider:

- id: contract-strict
  model-pattern: "claude-strict"
  pinned-model-version: "gpt-4o-2024-08-06"
  providers:
    - provider: openai

The strict flag is then handled by a provider that guarantees it.

Capability-Aware Routing

When response_format is present, the gateway filters the provider pool before routing to ensure only capable providers are considered:

| response_format type | Required capability |
|---|---|
| json_schema | supportsStructuredOutputs = true |
| json_object | supportsJsonMode = true |
| text / absent | No filtering applied |
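The filter in the table above amounts to a simple predicate per format type. The sketch below is illustrative only: the `Provider` class and `capable_providers` function are hypothetical, the snake_case attributes stand in for the supportsJsonMode and supportsStructuredOutputs flags, and the example flag values are read off the provider translation table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    supports_json_mode: bool = False           # supportsJsonMode
    supports_structured_outputs: bool = False  # supportsStructuredOutputs


class NoCapableProvider(Exception):
    """Stands in for the no_capable_provider (HTTP 400) response."""


def capable_providers(route: List[Provider],
                      response_format: Optional[dict]) -> List[Provider]:
    """Drop providers that cannot honor the requested response_format."""
    rf_type = (response_format or {}).get("type", "text")
    if rf_type == "json_schema":
        pool = [p for p in route if p.supports_structured_outputs]
    elif rf_type == "json_object":
        pool = [p for p in route if p.supports_json_mode]
    else:
        return list(route)  # text / absent: no filtering applied
    if not pool:
        raise NoCapableProvider(f"no provider supports response_format: {rf_type}")
    return pool


route = [
    Provider("openai", supports_json_mode=True, supports_structured_outputs=True),
    Provider("deepseek", supports_json_mode=True),
    Provider("ollama"),
]
print([p.name for p in capable_providers(route, {"type": "json_schema"})])
print([p.name for p in capable_providers(route, {"type": "json_object"})])
```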

No Capable Provider

If no provider on the route supports the requested format, the gateway returns HTTP 400:

{
  "error": {
    "code": "no_capable_provider",
    "type": "invalid_request_error",
    "message": "No provider supports response_format: json_schema. Providers on route: [ollama]. Capable providers: []",
    "trace_id": "a1b2c3d4e5f6789012345678abcdef01"
  }
}

Failover Blocked

If the primary provider fails and no capable fallback exists, the gateway returns HTTP 503 with an X-Gateway-Failover-Blocked: capability_mismatch header:

{
  "error": {
    "code": "failover_capability_mismatch",
    "type": "provider_unavailable",
    "message": "Failover blocked: no fallback provider supports response_format: json_schema. Primary provider [openai] failed: HTTP 500 from upstream",
    "trace_id": "a1b2c3d4e5f6789012345678abcdef01"
  }
}

The text after failed: echoes the primary provider's own error message, so it varies per outage ("HTTP 500 from upstream", "Connection timeout after 30s", "Circuit breaker open", etc.). Don't grep for an exact string; check the code field for routing logic.

The X-Gateway-Failover-Blocked header lets clients distinguish a normal provider outage (retry later) from a capability gap (reconfigure the route or drop the response_format requirement).
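A client could encode that distinction along the following lines. This is a sketch, not a prescribed retry policy: `next_action` and its return labels are hypothetical, and treating every other 5xx as retryable is an assumption.

```python
def next_action(status: int, headers: dict, body: dict) -> str:
    """Decide what to do with a failed gateway response."""
    # Header names are case-insensitive; normalise in case a plain dict is passed.
    lowered = {k.lower(): v for k, v in headers.items()}
    code = body.get("error", {}).get("code")
    blocked = lowered.get("x-gateway-failover-blocked")

    if status == 503 and (code == "failover_capability_mismatch"
                          or blocked == "capability_mismatch"):
        return "reconfigure"  # capability gap: retrying will not help
    if status == 400 and code == "no_capable_provider":
        return "reconfigure"
    if status >= 500:
        return "retry_later"  # assumed: ordinary provider outage
    return "fail"


print(next_action(
    503,
    {"X-Gateway-Failover-Blocked": "capability_mismatch"},
    {"error": {"code": "failover_capability_mismatch"}},
))
```

The function switches on the code field and the header only, never on the message text, in line with the advice above.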

Error envelope fields

Every error response carries the same four fields under error:

  • code — machine-readable identifier (lowercase snake_case). Stable across releases; the field to switch on for routing logic.
  • type — broad category (invalid_request_error, provider_unavailable, policy_violation, etc.). Useful when grouping errors at the call-site.
  • message — human-readable description. Free-form, not stable across releases.
  • trace_id — request trace id (hex). Echoes the X-Trace-ID response header. Use it to find the request in logs.

A param field also appears on validation errors that point at a specific request field; it's omitted otherwise.
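On the client, the envelope maps naturally onto a small value type. The sketch below uses a hypothetical `GatewayError` class; it treats type and trace_id as optional purely as a defensive choice, since a client may prefer not to crash on a sparse error body.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayError:
    code: str                      # stable; switch on this
    message: str                   # free-form; do not parse
    type: Optional[str] = None     # broad category
    trace_id: Optional[str] = None
    param: Optional[str] = None    # only on field-level validation errors

    @classmethod
    def from_body(cls, body: dict) -> "GatewayError":
        err = body["error"]
        return cls(
            code=err["code"],
            message=err.get("message", ""),
            type=err.get("type"),
            trace_id=err.get("trace_id"),
            param=err.get("param"),
        )


err = GatewayError.from_body({
    "error": {
        "code": "no_capable_provider",
        "type": "invalid_request_error",
        "message": "No provider supports response_format: json_schema.",
        "trace_id": "a1b2c3d4e5f6789012345678abcdef01",
    }
})
print(err.code, err.trace_id)
```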

Order of operations

A request that uses both inline response_format and a registered server-side schema runs through these stages, in order:

  1. Capability filter: providers that can't honor the requested response_format are dropped from the candidate pool. If the pool ends up empty, the request fails fast with no_capable_provider (HTTP 400).
  2. Provider call: the gateway rewrites response_format per the translation table and dispatches to the chosen provider.
  3. Failover: on provider_error or provider_circuit_open, an alternative on the same route is tried. The capability filter and any data-residency constraints re-apply; an incapable fallback is never picked. If no capable healthy fallback exists, the gateway returns failover_capability_mismatch (HTTP 503) with an X-Gateway-Failover-Blocked: capability_mismatch header.
  4. Server-side schema validation: if a registered schema's scope (modelPattern, routeId, or both) matches the request, the response body is parsed as JSON and validated against the schema. A mismatch returns schema_validation_failed (HTTP 422); the upstream call is not retried (see the maxRetries note above).
  5. Response delivered: including any X-Gateway-Strict-Downgraded: true header lifted from the provider response.

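The format-specific parts of stages 1–3 (the capability filter, the response_format rewrite, and the capability check on failover) apply only when response_format is on the request. Stage 4 fires only when a registered schema matches the request. Stage 5 always runs.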

Validation Errors

| Condition | HTTP | Error Code |
|---|---|---|
| Missing response_format.type | 400 | invalid_request |
| Unknown type (e.g., xml) | 400 | invalid_request |
| json_schema without schema | 400 | invalid_request |
| Unsupported provider (e.g., Ollama) | 400 | unsupported_response_format |
| No capable provider on route | 400 | no_capable_provider |
| Failover blocked by capability | 503 | failover_capability_mismatch |
| Registered schema mismatched the response body | 422 | schema_validation_failed |
| Schema registration with neither routeId nor modelPattern scope | 400 | invalid_output_schema_scope |
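The first three rows of the table can be caught before a request leaves the client. The following is a minimal pre-flight sketch with a hypothetical `preflight` helper; it mirrors the documented conditions only and is not a substitute for the gateway's own validation.

```python
from typing import Optional

KNOWN_TYPES = {"text", "json_object", "json_schema"}


def preflight(response_format: Optional[dict]) -> Optional[str]:
    """Return a reason the gateway would answer 400 invalid_request, or None."""
    if response_format is None:
        return None  # absent response_format is valid
    rf_type = response_format.get("type")
    if rf_type is None:
        return "missing response_format.type"
    if rf_type not in KNOWN_TYPES:
        return f"unknown type: {rf_type}"
    if rf_type == "json_schema":
        spec = response_format.get("json_schema") or {}
        if "schema" not in spec:
            return "json_schema without schema"
    return None


print(preflight({"type": "xml"}))
print(preflight({"type": "json_schema", "json_schema": {"name": "person"}}))
```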