SDK & Framework Integrations
Dvara is OpenAI API-compatible — any SDK or framework that supports the OpenAI chat completions API works with Dvara by simply changing the base URL. No custom SDK required.
┌─────────────────────────────┐
│ Your Existing Code │
│ (OpenAI SDK, LangChain, │
│ Spring AI, etc.) │
│ │
│ base_url = "http:// │
│ dvara:8080/v1" │
└──────────┬──────────────────┘
│
▼
┌─────────────────────────────┐
│ Dvara AI Gateway │
│ (routing, policy, audit, │
│ caching, rate limiting) │
└──────────┬──────────────────┘
│
┌─────┼──────┐
▼ ▼ ▼
OpenAI Claude Gemini ...
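Because every integration ultimately sends the same JSON to POST /v1/chat/completions, it is easy to see what actually crosses the wire. A standard-library sketch of the request payload (the model name and the Bearer-header convention are the usual OpenAI-compatible defaults, shown here as an illustration):

```python
import json

# The JSON body every OpenAI-compatible SDK builds for a chat completion.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 256,
}

# Serialized, this is what goes over the wire to
# POST http://localhost:8080/v1/chat/completions, with the API key
# sent as an "Authorization: Bearer <your-dvara-api-key>" header.
body = json.dumps(payload)
print(body[:72])
```

Every SDK below is, in the end, a convenience wrapper around this request.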
Quick Reference
| SDK / Framework | Language | Config Change |
|---|---|---|
| OpenAI SDK | Python | base_url |
| OpenAI SDK | JavaScript/TS | baseURL |
| Anthropic SDK | Python | base_url + model prefix |
| Anthropic SDK | JavaScript/TS | baseURL + model prefix |
| AWS Bedrock SDK | Python | Switch to OpenAI SDK |
| Google GenAI SDK | Python | Switch to OpenAI SDK |
| LiteLLM | Python | api_base |
| LangChain | Python | openai_api_base |
| Pydantic AI | Python | base_url on OpenAI provider |
| LangChain4j | Java | baseUrl |
| Spring AI | Java | spring.ai.openai.base-url |
| Chinese Model SDKs | Python | base_url |
Prerequisites
All examples assume:
- Dvara is running at http://localhost:8080
- A tenant and API key have been created (see Quickstart)
- The target provider is configured in Dvara (e.g., OPENAI_API_KEY set)

Replace your-dvara-api-key with your actual Dvara API key throughout.
Python SDKs
OpenAI Python
The most common integration. Works with any model routed through Dvara.
Install:
pip install openai
Basic chat completion:
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain quantum computing in one paragraph."},
],
max_tokens=256,
)
print(response.choices[0].message.content)
Streaming:
stream = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Write a haiku about APIs."}],
stream=True,
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="")
Structured output (JSON schema):
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "List 3 planets with their diameter in km."}],
response_format={
"type": "json_schema",
"json_schema": {
"name": "planets",
"strict": True,
"schema": {
"type": "object",
"properties": {
"planets": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"diameter_km": {"type": "integer"},
},
"required": ["name", "diameter_km"],
},
}
},
"required": ["planets"],
},
},
},
)
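With strict mode enabled, the returned message content is guaranteed to be valid JSON conforming to the schema, so it can be parsed directly. A parsing sketch using a sample string in place of a live response:

```python
import json

# Stand-in for response.choices[0].message.content from the call above.
content = (
    '{"planets": ['
    '{"name": "Mercury", "diameter_km": 4879}, '
    '{"name": "Venus", "diameter_km": 12104}, '
    '{"name": "Mars", "diameter_km": 6779}]}'
)

data = json.loads(content)
for planet in data["planets"]:
    print(f'{planet["name"]}: {planet["diameter_km"]} km')
```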
Tool calls / function calling:
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a city",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
},
"required": ["city"],
},
},
}
]
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
tools=tools,
tool_choice="auto",
)
# Check if the model wants to call a function
if response.choices[0].message.tool_calls:
tool_call = response.choices[0].message.tool_calls[0]
print(f"Function: {tool_call.function.name}")
print(f"Arguments: {tool_call.function.arguments}")
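The arguments field arrives as a JSON string. The usual next step is to parse it, run the matching function, and append the result as a tool message on a follow-up request. A dispatch sketch with stubbed values in place of a live tool_call (the tool_call_id here is a placeholder):

```python
import json

def get_weather(city: str) -> str:
    # Stub implementation; replace with a real weather lookup.
    return f"Sunny, 22°C in {city}"

# Stand-ins for tool_call.function.name and tool_call.function.arguments.
name, arguments = "get_weather", '{"city": "Tokyo"}'

available = {"get_weather": get_weather}
result = available[name](**json.loads(arguments))

# Append this to the message list and call chat.completions.create again
# to get the model's final, tool-informed answer.
tool_message = {"role": "tool", "tool_call_id": "call_abc", "content": result}
print(tool_message["content"])
```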
Embeddings:
response = client.embeddings.create(
model="text-embedding-3-small",
input="The quick brown fox jumps over the lazy dog.",
)
print(f"Dimensions: {len(response.data[0].embedding)}")
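Embeddings are usually compared with cosine similarity. A self-contained sketch using toy vectors in place of response.data[i].embedding:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors; in practice these come from client.embeddings.create(...).
fox = [0.8, 0.1, 0.3]
dog = [0.7, 0.2, 0.4]
car = [0.1, 0.9, 0.0]

print(f"fox vs dog: {cosine_similarity(fox, dog):.3f}")
print(f"fox vs car: {cosine_similarity(fox, car):.3f}")
```

Real embedding vectors have hundreds or thousands of dimensions, but the comparison works the same way.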
Using Claude via Dvara (same SDK):
# No SDK change — just switch the model name
response = client.chat.completions.create(
model="claude-sonnet-4-20250514",
messages=[{"role": "user", "content": "Hello from Anthropic via Dvara!"}],
)
Using Gemini via Dvara (same SDK):
response = client.chat.completions.create(
model="gemini-2.0-flash",
messages=[{"role": "user", "content": "Hello from Google via Dvara!"}],
)
Async client:
import asyncio
from openai import AsyncOpenAI
async def main():
client = AsyncOpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = await client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello async!"}],
)
print(response.choices[0].message.content)
asyncio.run(main())
Environment variable approach:
export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"
# No base_url needed — SDK reads from environment
client = OpenAI()
Anthropic Python
If you're already using the Anthropic SDK and want to route through Dvara, point it at Dvara's base URL. Dvara translates the OpenAI-compatible request to the Anthropic API format internally.
Use the OpenAI SDK with model="claude-*" for the simplest integration. Use the Anthropic SDK approach only if you have existing Anthropic SDK code you don't want to change.
Install:
pip install anthropic
Using OpenAI SDK (recommended):
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="claude-sonnet-4-20250514",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain the Rust ownership model."},
],
max_tokens=512,
)
AWS Bedrock
Instead of using boto3 with Bedrock's native API, use the OpenAI SDK pointed at Dvara. Dvara handles SigV4 signing and Bedrock API translation internally.
Before (boto3):
# ❌ Complex — requires AWS credentials, region config, Bedrock-specific API
import boto3
client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
modelId="anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
After (via Dvara):
# ✅ Simple — standard OpenAI SDK, Dvara handles Bedrock translation
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{"role": "user", "content": "Hello from Bedrock via Dvara!"}],
)
Google GenAI
Use the OpenAI SDK pointed at Dvara instead of the Google GenAI SDK. Dvara translates to the Gemini API format internally.
Before (google-genai):
# ❌ Google-specific API
from google import genai
client = genai.Client(api_key="...")
response = client.models.generate_content(
model="gemini-2.0-flash",
contents="Explain how AI works",
)
After (via Dvara):
# ✅ Standard OpenAI SDK
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="gemini-2.0-flash",
messages=[{"role": "user", "content": "Explain how AI works"}],
)
LiteLLM
LiteLLM already supports custom API bases. Point it at Dvara to centralize all provider calls.
Install:
pip install litellm
Basic usage:
import litellm
response = litellm.completion(
model="openai/gpt-4o", # provider/model format
api_base="http://localhost:8080/v1",
api_key="your-dvara-api-key",
messages=[{"role": "user", "content": "Hello via LiteLLM!"}],
)
print(response.choices[0].message.content)
Using different models through Dvara:
# All requests go through Dvara regardless of model
for model in ["openai/gpt-4o", "openai/claude-sonnet-4-20250514", "openai/gemini-2.0-flash"]:
response = litellm.completion(
model=model,
api_base="http://localhost:8080/v1",
api_key="your-dvara-api-key",
messages=[{"role": "user", "content": "Hello!"}],
)
print(f"{model}: {response.choices[0].message.content[:50]}")
Environment variable approach:
export OPENAI_API_BASE="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"
LangChain
LangChain's ChatOpenAI class accepts a custom base URL.
Install:
pip install langchain-openai
Basic chat:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
model="gpt-4o",
openai_api_base="http://localhost:8080/v1",
openai_api_key="your-dvara-api-key",
temperature=0.7,
)
response = llm.invoke("What is the capital of France?")
print(response.content)
With chains:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
llm = ChatOpenAI(
model="gpt-4o",
openai_api_base="http://localhost:8080/v1",
openai_api_key="your-dvara-api-key",
)
prompt = ChatPromptTemplate.from_messages([
("system", "You are a {role}. Answer concisely."),
("human", "{question}"),
])
chain = prompt | llm
response = chain.invoke({"role": "historian", "question": "Who built the pyramids?"})
print(response.content)
Streaming with chains:
for chunk in chain.stream({"role": "poet", "question": "Write about the moon"}):
print(chunk.content, end="")
With structured output:
from pydantic import BaseModel
class MovieReview(BaseModel):
title: str
rating: float
summary: str
structured_llm = llm.with_structured_output(MovieReview)
review = structured_llm.invoke("Review the movie Inception")
print(f"{review.title}: {review.rating}/10 - {review.summary}")
RAG example:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
llm = ChatOpenAI(
model="gpt-4o",
openai_api_base="http://localhost:8080/v1",
openai_api_key="your-dvara-api-key",
)
embeddings = OpenAIEmbeddings(
model="text-embedding-3-small",
openai_api_base="http://localhost:8080/v1",
openai_api_key="your-dvara-api-key",
)
# Use embeddings with your vector store of choice
# e.g., Chroma, Pinecone, Weaviate, etc.
Agent with tools:
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
@tool
def calculator(expression: str) -> str:
"""Evaluate a math expression."""
# Demo only: eval() executes arbitrary code; use a safe math parser in production.
return str(eval(expression))
llm = ChatOpenAI(
model="gpt-4o",
openai_api_base="http://localhost:8080/v1",
openai_api_key="your-dvara-api-key",
)
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
("human", "{input}"),
("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, [calculator], prompt)
executor = AgentExecutor(agent=agent, tools=[calculator])
result = executor.invoke({"input": "What is 42 * 17 + 3?"})
print(result["output"])
Pydantic AI
Pydantic AI supports custom OpenAI-compatible endpoints.
Install:
pip install pydantic-ai
Basic usage:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel(
"gpt-4o",
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
agent = Agent(model)
result = agent.run_sync("What is the capital of Japan?")
print(result.data)
With structured output:
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
class CityInfo(BaseModel):
name: str
country: str
population: int
famous_for: list[str]
model = OpenAIModel(
"gpt-4o",
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
agent = Agent(model, result_type=CityInfo)
result = agent.run_sync("Tell me about Tokyo")
city = result.data
print(f"{city.name}, {city.country} — pop: {city.population:,}")
print(f"Famous for: {', '.join(city.famous_for)}")
With tools:
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel(
"gpt-4o",
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
agent = Agent(model, system_prompt="You help with math.")
@agent.tool_plain
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
result = agent.run_sync("What is 123 times 456?")
print(result.data)
JavaScript / TypeScript SDKs
OpenAI JavaScript
Install:
npm install openai
Basic chat completion:
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://localhost:8080/v1",
apiKey: "your-dvara-api-key",
});
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain microservices in one paragraph." },
],
});
console.log(response.choices[0].message.content);
Streaming:
const stream = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Write a poem about code." }],
stream: true,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
Structured output:
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";
const Planet = z.object({
name: z.string(),
diameter_km: z.number(),
has_rings: z.boolean(),
});
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Tell me about Saturn." }],
response_format: zodResponseFormat(Planet, "planet"),
});
const planet = JSON.parse(response.choices[0].message.content!);
console.log(`${planet.name}: ${planet.diameter_km} km, rings: ${planet.has_rings}`);
Tool calls:
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "What's the weather in London?" }],
tools: [
{
type: "function",
function: {
name: "get_weather",
description: "Get weather for a city",
parameters: {
type: "object",
properties: { city: { type: "string" } },
required: ["city"],
},
},
},
],
});
const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
console.log(`Call: ${toolCall.function.name}(${toolCall.function.arguments})`);
}
Environment variable approach:
export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"
Anthropic JavaScript
Same approach as Python — use the OpenAI SDK with Claude model names.
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://localhost:8080/v1",
apiKey: "your-dvara-api-key",
});
const response = await client.chat.completions.create({
model: "claude-sonnet-4-20250514",
messages: [{ role: "user", content: "Hello from TypeScript!" }],
});
Vercel AI SDK
Install:
npm install ai @ai-sdk/openai
Usage:
import { createOpenAI } from "@ai-sdk/openai";
import { generateText, streamText } from "ai";
const dvara = createOpenAI({
baseURL: "http://localhost:8080/v1",
apiKey: "your-dvara-api-key",
});
// Non-streaming
const { text } = await generateText({
model: dvara("gpt-4o"),
prompt: "Explain REST APIs briefly.",
});
// Streaming
const result = streamText({
model: dvara("gpt-4o"),
prompt: "Write a short story about a robot.",
});
for await (const chunk of result.textStream) {
process.stdout.write(chunk);
}
Java SDKs
LangChain4j
Maven dependency:
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai</artifactId>
<version>1.0.0-beta1</version>
</dependency>
Basic chat:
import dev.langchain4j.model.openai.OpenAiChatModel;
OpenAiChatModel model = OpenAiChatModel.builder()
.baseUrl("http://localhost:8080/v1")
.apiKey("your-dvara-api-key")
.modelName("gpt-4o")
.build();
String response = model.chat("What is the capital of Germany?");
System.out.println(response);
Streaming:
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
OpenAiStreamingChatModel model = OpenAiStreamingChatModel.builder()
.baseUrl("http://localhost:8080/v1")
.apiKey("your-dvara-api-key")
.modelName("gpt-4o")
.build();
model.chat("Write a haiku about Java.", new StreamingChatResponseHandler() {
@Override
public void onPartialResponse(String partialResponse) {
System.out.print(partialResponse);
}
@Override
public void onCompleteResponse(ChatResponse completeResponse) {
System.out.println("\n--- Done ---");
}
@Override
public void onError(Throwable error) {
error.printStackTrace();
}
});
With AI Services (structured output):
import dev.langchain4j.service.AiServices;
interface MovieExpert {
@dev.langchain4j.service.UserMessage("Review the movie: {{movie}}")
MovieReview review(@dev.langchain4j.service.V("movie") String movie);
}
record MovieReview(String title, double rating, String summary) {}
OpenAiChatModel model = OpenAiChatModel.builder()
.baseUrl("http://localhost:8080/v1")
.apiKey("your-dvara-api-key")
.modelName("gpt-4o")
.responseFormat("json")
.build();
MovieExpert expert = AiServices.create(MovieExpert.class, model);
MovieReview review = expert.review("Inception");
System.out.printf("%s: %.1f/10 - %s%n", review.title(), review.rating(), review.summary());
Using Claude via Dvara:
OpenAiChatModel claude = OpenAiChatModel.builder()
.baseUrl("http://localhost:8080/v1")
.apiKey("your-dvara-api-key")
.modelName("claude-sonnet-4-20250514")
.build();
String response = claude.chat("Hello from LangChain4j via Dvara!");
Spring AI
Spring AI has native OpenAI support with configurable base URL.
Maven dependency:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
application.yml:
spring:
ai:
openai:
base-url: http://localhost:8080/v1
api-key: ${DVARA_API_KEY:your-dvara-api-key}
chat:
options:
model: gpt-4o
temperature: 0.7
Service class:
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;
@Service
public class ChatService {
private final ChatClient chatClient;
public ChatService(ChatClient.Builder builder) {
this.chatClient = builder.build();
}
public String chat(String message) {
return chatClient.prompt()
.user(message)
.call()
.content();
}
// Streaming
public Flux<String> streamChat(String message) {
return chatClient.prompt()
.user(message)
.stream()
.content();
}
}
Structured output:
record CityInfo(String name, String country, int population, List<String> landmarks) {}
CityInfo city = chatClient.prompt()
.user("Tell me about Paris")
.call()
.entity(CityInfo.class);
System.out.printf("%s, %s — pop: %,d%n", city.name(), city.country(), city.population());
Switch models at runtime:
// Use GPT-4o
String gptResponse = chatClient.prompt()
.user("Hello!")
.options(OpenAiChatOptions.builder().model("gpt-4o").build())
.call()
.content();
// Use Claude — same client, different model
String claudeResponse = chatClient.prompt()
.user("Hello!")
.options(OpenAiChatOptions.builder().model("claude-sonnet-4-20250514").build())
.call()
.content();
Function calling:
@Bean
@Description("Get the current weather for a location")
public Function<WeatherRequest, WeatherResponse> getWeather() {
return request -> new WeatherResponse(request.city(), 22.5, "Sunny");
}
record WeatherRequest(String city) {}
record WeatherResponse(String city, double temperature, String condition) {}
String response = chatClient.prompt()
.user("What's the weather in Tokyo?")
.functions("getWeather")
.call()
.content();
Chinese Model SDKs
Many Chinese LLM providers offer OpenAI-compatible endpoints. Once a provider is configured in Dvara, route requests through the gateway using the same pattern as any other model.
Alibaba Qwen / DashScope
Using OpenAI SDK (recommended):
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="qwen-plus",
messages=[{"role": "user", "content": "你好,请介绍一下你自己。"}],
)
print(response.choices[0].message.content)
Baidu ERNIE / Wenxin
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="ernie-4.0",
messages=[{"role": "user", "content": "请用中文解释机器学习。"}],
)
Zhipu AI / ChatGLM
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="glm-4",
messages=[{"role": "user", "content": "What are the applications of deep learning?"}],
)
DeepSeek
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="deepseek-chat",
messages=[{"role": "user", "content": "Explain transformer architecture."}],
)
Moonshot AI (Kimi)
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="your-dvara-api-key",
)
response = client.chat.completions.create(
model="moonshot-v1-8k",
messages=[{"role": "user", "content": "Write a poem about the Great Wall."}],
)
Chinese model providers must be configured in Dvara's provider configuration before they can be used. Dvara routes requests based on model prefix — add a custom provider configuration for each provider you want to use.
Using Dvara Headers
All SDKs can pass custom headers for Dvara-specific features:
Trace ID
Pass X-Trace-ID for distributed tracing correlation:
# Python (OpenAI SDK)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello"}],
extra_headers={"X-Trace-ID": "my-trace-123"},
)
// TypeScript (OpenAI SDK)
const response = await client.chat.completions.create(
{
model: "gpt-4o",
messages: [{ role: "user", content: "Hello" }],
},
{ headers: { "X-Trace-ID": "my-trace-123" } },
);
// Java (LangChain4j)
OpenAiChatModel model = OpenAiChatModel.builder()
.baseUrl("http://localhost:8080/v1")
.apiKey("your-dvara-api-key")
.modelName("gpt-4o")
.customHeaders(Map.of("X-Trace-ID", "my-trace-123"))
.build();
Session ID (Agentic)
Pass X-Session-Id for agent session tracking:
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Continue our conversation"}],
extra_headers={"X-Session-Id": "agent-session-456"},
)
Migration Guide
Migrating from Direct Provider Access
- Deploy Dvara (see Docker Quickstart)
- Configure providers in Dvara (API keys, routes)
- Change base URL in your SDK client — no other code changes needed
- Create a Dvara API key and use it instead of provider-specific keys
| Before | After |
|---|---|
| base_url="https://api.openai.com/v1" | base_url="http://dvara:8080/v1" |
| api_key="sk-..." (OpenAI key) | api_key="mk-..." (Dvara key) |
Migrating from Another Gateway
If migrating from LiteLLM proxy, Portkey, or similar:
- Update the base URL to point to Dvara
- Replace the gateway-specific API key with a Dvara API key
- Review model names — Dvara uses standard provider model names (e.g., gpt-4o, claude-sonnet-4-20250514)
- Dvara returns standard OpenAI-compatible responses, so no response parsing changes needed
Supported Endpoints
All SDKs can use these Dvara endpoints:
| Endpoint | Purpose |
|---|---|
| POST /v1/chat/completions | Chat (all models) |
| POST /v1/completions | Legacy text completion |
| POST /v1/embeddings | Embeddings |
| GET /v1/models | List available models |
Troubleshooting
Connection Refused
openai.APIConnectionError: Connection error.
Ensure Dvara is running and accessible at the configured base URL.
400 — NO_PROVIDER
{"error": {"code": "NO_PROVIDER", "message": "No provider supports model: xyz"}}
The requested model doesn't match any configured provider. Check GET /v1/models for available models.
401 — Unauthorized
{"error": {"code": "UNAUTHORIZED", "message": "Invalid API key"}}
The Dvara API key is invalid or missing. Create one via the admin UI or API.
403 — POLICY_DENIED
{"error": {"code": "POLICY_DENIED", "message": "Request blocked by policy"}}
A governance policy is blocking the request. Check your tenant's active policies.
502 — PROVIDER_ERROR
{"error": {"code": "PROVIDER_ERROR", "message": "Upstream provider error"}}
The upstream provider (OpenAI, Anthropic, etc.) returned an error. Dvara will attempt failover if configured.
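For transient 502s, a small client-side retry with exponential backoff can complement whatever failover Dvara performs. A generic sketch; the call argument stands in for any SDK request:

```python
import time

def with_retries(call, attempts: int = 3, base_delay: float = 0.5):
    """Retry call() on any exception, doubling the delay between attempts."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Demo with a stubbed flaky call that fails twice, then succeeds.
state = {"calls": 0}

def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("502 PROVIDER_ERROR")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # prints "ok" on the third attempt
```

In real code, catch only the SDK's retryable exception types rather than bare Exception.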