
SDK & Framework Integrations

Dvara is OpenAI API-compatible — any SDK or framework that supports the OpenAI chat completions API works with Dvara by simply changing the base URL. No custom SDK required.

┌─────────────────────────────┐
│     Your Existing Code      │
│   (OpenAI SDK, LangChain,   │
│     Spring AI, etc.)        │
│                             │
│  base_url =                 │
│   "http://dvara:8080/v1"    │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│      Dvara AI Gateway       │
│  (routing, policy, audit,   │
│   caching, rate limiting)   │
└──────────────┬──────────────┘
               │
      ┌────────┼────────┐
      ▼        ▼        ▼
   OpenAI   Claude   Gemini  ...

Quick Reference

| SDK / Framework | Language | Config Change |
|---|---|---|
| OpenAI SDK | Python | `base_url` |
| OpenAI SDK | JavaScript/TS | `baseURL` |
| Anthropic SDK | Python | `base_url` + model prefix |
| Anthropic SDK | JavaScript/TS | `baseURL` + model prefix |
| AWS Bedrock SDK | Python | Switch to OpenAI SDK |
| Google GenAI SDK | Python | Switch to OpenAI SDK |
| LiteLLM | Python | `api_base` |
| LangChain | Python | `openai_api_base` |
| Pydantic AI | Python | `base_url` on OpenAI provider |
| LangChain4j | Java | `baseUrl` |
| Spring AI | Java | `spring.ai.openai.base-url` |
| Chinese Model SDKs | Python | `base_url` |

Prerequisites

All examples assume:

  • Dvara is running at http://localhost:8080
  • A tenant and API key have been created (see Quickstart)
  • The target provider is configured in Dvara (e.g., OPENAI_API_KEY set)

Replace your-dvara-api-key with your actual Dvara API key throughout.
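A quick way to verify all three prerequisites at once is to hit the gateway's model list endpoint (`GET /v1/models`, listed under Supported Endpoints below). This sketch uses only the standard library and assumes Dvara accepts the usual `Authorization: Bearer` header that the OpenAI SDKs send:

```python
# Sanity check: can we reach Dvara and list models with our key?
import json
import urllib.request

DVARA_BASE = "http://localhost:8080/v1"
API_KEY = "your-dvara-api-key"

req = urllib.request.Request(
    f"{DVARA_BASE}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)

try:
    with urllib.request.urlopen(req, timeout=5) as resp:
        data = json.load(resp)
        print("Dvara reachable; models:", [m["id"] for m in data.get("data", [])])
        reachable = True
except OSError as exc:  # covers connection errors and HTTP errors alike
    print("Check failed:", exc)
    reachable = False
```

If this prints an error, fix connectivity and the API key before debugging SDK-level code.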


Python SDKs

OpenAI Python

The most common integration. Works with any model routed through Dvara.

Install:

pip install openai

Basic chat completion:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one paragraph."},
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)

Streaming:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about APIs."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Structured output (JSON schema):

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "List 3 planets with their diameter in km."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "planets",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "planets": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "diameter_km": {"type": "integer"},
                            },
                            "required": ["name", "diameter_km"],
                        },
                    }
                },
                "required": ["planets"],
            },
        },
    },
)

Tool calls / function calling:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)

# Check if the model wants to call a function
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")

Embeddings:

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog.",
)

print(f"Dimensions: {len(response.data[0].embedding)}")

Using Claude via Dvara (same SDK):

# No SDK change — just switch the model name
response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello from Anthropic via Dvara!"}],
)

Using Gemini via Dvara (same SDK):

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello from Google via Dvara!"}],
)

Async client:

import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        base_url="http://localhost:8080/v1",
        api_key="your-dvara-api-key",
    )

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello async!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())

Environment variable approach:

export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"

# No base_url needed — SDK reads from environment
client = OpenAI()

Anthropic Python

If you're already using the Anthropic SDK and want to route through Dvara, point it at Dvara's base URL. Dvara translates the OpenAI-compatible request to the Anthropic API format internally.

Recommended

Use the OpenAI SDK with model="claude-*" for the simplest integration. Use the Anthropic SDK approach only if you have existing Anthropic SDK code you don't want to change.

Install:

pip install anthropic

Using OpenAI SDK (recommended):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the Rust ownership model."},
    ],
    max_tokens=512,
)

AWS Bedrock

Instead of using boto3 with Bedrock's native API, use the OpenAI SDK pointed at Dvara. Dvara handles SigV4 signing and Bedrock API translation internally.

Before (boto3):

# ❌ Complex — requires AWS credentials, region config, Bedrock-specific API
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)

After (via Dvara):

# ✅ Simple — standard OpenAI SDK, Dvara handles Bedrock translation
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Hello from Bedrock via Dvara!"}],
)

Google GenAI

Use the OpenAI SDK pointed at Dvara instead of the Google GenAI SDK. Dvara translates to the Gemini API format internally.

Before (google-genai):

# ❌ Google-specific API
from google import genai

client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Explain how AI works",
)

After (via Dvara):

# ✅ Standard OpenAI SDK
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Explain how AI works"}],
)

LiteLLM

LiteLLM already supports custom API bases. Point it at Dvara to centralize all provider calls.

Install:

pip install litellm

Basic usage:

import litellm

response = litellm.completion(
    model="openai/gpt-4o",  # provider/model format
    api_base="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
    messages=[{"role": "user", "content": "Hello via LiteLLM!"}],
)

print(response.choices[0].message.content)

Using different models through Dvara:

# All requests go through Dvara regardless of model
for model in ["openai/gpt-4o", "openai/claude-sonnet-4-20250514", "openai/gemini-2.0-flash"]:
    response = litellm.completion(
        model=model,
        api_base="http://localhost:8080/v1",
        api_key="your-dvara-api-key",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(f"{model}: {response.choices[0].message.content[:50]}")

Environment variable approach:

export OPENAI_API_BASE="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"

LangChain

LangChain's ChatOpenAI class accepts a custom base URL.

Install:

pip install langchain-openai

Basic chat:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
    temperature=0.7,
)

response = llm.invoke("What is the capital of France?")
print(response.content)

With chains:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}. Answer concisely."),
    ("human", "{question}"),
])

chain = prompt | llm
response = chain.invoke({"role": "historian", "question": "Who built the pyramids?"})
print(response.content)

Streaming with chains:

for chunk in chain.stream({"role": "poet", "question": "Write about the moon"}):
    print(chunk.content, end="")

With structured output:

from pydantic import BaseModel

class MovieReview(BaseModel):
    title: str
    rating: float
    summary: str

structured_llm = llm.with_structured_output(MovieReview)
review = structured_llm.invoke("Review the movie Inception")
print(f"{review.title}: {review.rating}/10 - {review.summary}")

RAG example:

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

# Use embeddings with your vector store of choice
# e.g., Chroma, Pinecone, Weaviate, etc.

Agent with tools:

from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    # Fine for a demo; never eval() untrusted input in production.
    return str(eval(expression))

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, [calculator], prompt)
executor = AgentExecutor(agent=agent, tools=[calculator])
result = executor.invoke({"input": "What is 42 * 17 + 3?"})
print(result["output"])

Pydantic AI

Pydantic AI supports custom OpenAI-compatible endpoints.

Install:

pip install pydantic-ai

Basic usage:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "gpt-4o",
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

agent = Agent(model)

result = agent.run_sync("What is the capital of Japan?")
print(result.data)

With structured output:

from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

class CityInfo(BaseModel):
    name: str
    country: str
    population: int
    famous_for: list[str]

model = OpenAIModel(
    "gpt-4o",
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

agent = Agent(model, result_type=CityInfo)

result = agent.run_sync("Tell me about Tokyo")
city = result.data
print(f"{city.name}, {city.country} — pop: {city.population:,}")
print(f"Famous for: {', '.join(city.famous_for)}")

With tools:

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "gpt-4o",
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

agent = Agent(model, system_prompt="You help with math.")

@agent.tool_plain
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

result = agent.run_sync("What is 123 times 456?")
print(result.data)

JavaScript / TypeScript SDKs

OpenAI JavaScript

Install:

npm install openai

Basic chat completion:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-dvara-api-key",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain microservices in one paragraph." },
  ],
});

console.log(response.choices[0].message.content);

Streaming:

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a poem about code." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Structured output:

import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

const Planet = z.object({
  name: z.string(),
  diameter_km: z.number(),
  has_rings: z.boolean(),
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me about Saturn." }],
  response_format: zodResponseFormat(Planet, "planet"),
});

const planet = JSON.parse(response.choices[0].message.content!);
console.log(`${planet.name}: ${planet.diameter_km} km, rings: ${planet.has_rings}`);

Tool calls:

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What's the weather in London?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
});

const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
  console.log(`Call: ${toolCall.function.name}(${toolCall.function.arguments})`);
}

Environment variable approach:

export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"

Anthropic JavaScript

Same approach as Python — use the OpenAI SDK with Claude model names.

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-dvara-api-key",
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Hello from TypeScript!" }],
});

Vercel AI SDK

Install:

npm install ai @ai-sdk/openai

Usage:

import { createOpenAI } from "@ai-sdk/openai";
import { generateText, streamText } from "ai";

const dvara = createOpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-dvara-api-key",
});

// Non-streaming
const { text } = await generateText({
  model: dvara("gpt-4o"),
  prompt: "Explain REST APIs briefly.",
});

// Streaming
const result = streamText({
  model: dvara("gpt-4o"),
  prompt: "Write a short story about a robot.",
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Java SDKs

LangChain4j

Maven dependency:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

Basic chat:

import dev.langchain4j.model.openai.OpenAiChatModel;

OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .build();

String response = model.chat("What is the capital of Germany?");
System.out.println(response);

Streaming:

import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;

OpenAiStreamingChatModel model = OpenAiStreamingChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .build();

model.chat("Write a haiku about Java.", new StreamingChatResponseHandler() {
    @Override
    public void onPartialResponse(String partialResponse) {
        System.out.print(partialResponse);
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        System.out.println("\n--- Done ---");
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});

With AI Services (structured output):

import dev.langchain4j.service.AiServices;

interface MovieExpert {
    @dev.langchain4j.service.UserMessage("Review the movie: {{movie}}")
    MovieReview review(@dev.langchain4j.service.V("movie") String movie);
}

record MovieReview(String title, double rating, String summary) {}

OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .responseFormat("json")
    .build();

MovieExpert expert = AiServices.create(MovieExpert.class, model);
MovieReview review = expert.review("Inception");
System.out.printf("%s: %.1f/10 - %s%n", review.title(), review.rating(), review.summary());

Using Claude via Dvara:

OpenAiChatModel claude = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("claude-sonnet-4-20250514")
    .build();

String response = claude.chat("Hello from LangChain4j via Dvara!");

Spring AI

Spring AI has native OpenAI support with configurable base URL.

Maven dependency:

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

application.yml:

spring:
  ai:
    openai:
      base-url: http://localhost:8080/v1
      api-key: ${DVARA_API_KEY:your-dvara-api-key}
      chat:
        options:
          model: gpt-4o
          temperature: 0.7

Service class:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

@Service
public class ChatService {

    private final ChatClient chatClient;

    public ChatService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String chat(String message) {
        return chatClient.prompt()
            .user(message)
            .call()
            .content();
    }

    // Streaming
    public Flux<String> streamChat(String message) {
        return chatClient.prompt()
            .user(message)
            .stream()
            .content();
    }
}

Structured output:

record CityInfo(String name, String country, int population, List<String> landmarks) {}

CityInfo city = chatClient.prompt()
    .user("Tell me about Paris")
    .call()
    .entity(CityInfo.class);

System.out.printf("%s, %s — pop: %,d%n", city.name(), city.country(), city.population());

Switch models at runtime:

// Use GPT-4o
String gptResponse = chatClient.prompt()
    .user("Hello!")
    .options(OpenAiChatOptions.builder().model("gpt-4o").build())
    .call()
    .content();

// Use Claude — same client, different model
String claudeResponse = chatClient.prompt()
    .user("Hello!")
    .options(OpenAiChatOptions.builder().model("claude-sonnet-4-20250514").build())
    .call()
    .content();

Function calling:

@Bean
@Description("Get the current weather for a location")
public Function<WeatherRequest, WeatherResponse> getWeather() {
    return request -> new WeatherResponse(request.city(), 22.5, "Sunny");
}

record WeatherRequest(String city) {}
record WeatherResponse(String city, double temperature, String condition) {}

String response = chatClient.prompt()
    .user("What's the weather in Tokyo?")
    .functions("getWeather")
    .call()
    .content();

Chinese Model SDKs

Many Chinese LLM providers offer SDKs that support OpenAI-compatible endpoints. When you configure these providers in Dvara, route through the gateway using the same pattern.

Alibaba Qwen / DashScope

Using OpenAI SDK (recommended):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "你好,请介绍一下你自己。"}],
)
print(response.choices[0].message.content)

Baidu ERNIE / Wenxin

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="ernie-4.0",
    messages=[{"role": "user", "content": "请用中文解释机器学习。"}],
)

Zhipu AI / ChatGLM

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="glm-4",
    messages=[{"role": "user", "content": "What are the applications of deep learning?"}],
)

DeepSeek

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Explain transformer architecture."}],
)

Moonshot AI (Kimi)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[{"role": "user", "content": "Write a poem about the Great Wall."}],
)

Note

Chinese model providers must be configured in Dvara's provider configuration before they can be used. Dvara routes requests based on model prefix — add a custom provider configuration for each provider you want to use.


Using Dvara Headers

All SDKs can pass custom headers for Dvara-specific features:

Trace ID

Pass X-Trace-ID for distributed tracing correlation:

# Python (OpenAI SDK)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Trace-ID": "my-trace-123"},
)

// TypeScript (OpenAI SDK)
const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
  },
  { headers: { "X-Trace-ID": "my-trace-123" } },
);

// Java (LangChain4j)
OpenAiChatModel model = OpenAiChatModel.builder()
    .baseUrl("http://localhost:8080/v1")
    .apiKey("your-dvara-api-key")
    .modelName("gpt-4o")
    .customHeaders(Map.of("X-Trace-ID", "my-trace-123"))
    .build();

Session ID (Agentic)

Pass X-Session-Id for agent session tracking:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Continue our conversation"}],
    extra_headers={"X-Session-Id": "agent-session-456"},
)

Migration Guide

Migrating from Direct Provider Access

  1. Deploy Dvara (see Docker Quickstart)
  2. Configure providers in Dvara (API keys, routes)
  3. Change base URL in your SDK client — no other code changes needed
  4. Create a Dvara API key and use it instead of provider-specific keys
| Before | After |
|---|---|
| `base_url="https://api.openai.com/v1"` | `base_url="http://dvara:8080/v1"` |
| `api_key="sk-..."` (OpenAI key) | `api_key="mk-..."` (Dvara key) |

Migrating from Another Gateway

If migrating from LiteLLM proxy, Portkey, or similar:

  1. Update the base URL to point to Dvara
  2. Replace the gateway-specific API key with a Dvara API key
  3. Review model names — Dvara uses standard provider model names (e.g., gpt-4o, claude-sonnet-4-20250514)
  4. Dvara returns standard OpenAI-compatible responses, so no response parsing changes needed

Supported Endpoints

All SDKs can use these Dvara endpoints:

| Endpoint | Purpose |
|---|---|
| POST /v1/chat/completions | Chat (all models) |
| POST /v1/completions | Legacy text completion |
| POST /v1/embeddings | Embeddings |
| GET /v1/models | List available models |
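These endpoints accept standard OpenAI request bodies, so they can also be exercised without any SDK. A minimal raw request sketch (standard library only, assuming Bearer auth as used throughout this page):

```python
# Raw POST to /v1/chat/completions with the standard OpenAI body shape;
# the same pattern works for /v1/embeddings and /v1/completions.
import json
import urllib.request

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer your-dvara-api-key",
        "Content-Type": "application/json",
    },
)

try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
except OSError as exc:  # connection or HTTP error
    reply = None
    print("Request failed:", exc)
```

This is handy for isolating problems: if the raw request works but an SDK call does not, the issue is in the client configuration rather than the gateway.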

Troubleshooting

Connection Refused

openai.APIConnectionError: Connection error.

Ensure Dvara is running and accessible at the configured base URL.

400 — NO_PROVIDER

{"error": {"code": "NO_PROVIDER", "message": "No provider supports model: xyz"}}

The requested model doesn't match any configured provider. Check GET /v1/models for available models.

401 — Unauthorized

{"error": {"code": "UNAUTHORIZED", "message": "Invalid API key"}}

The Dvara API key is invalid or missing. Create one via the admin UI or API.

403 — POLICY_DENIED

{"error": {"code": "POLICY_DENIED", "message": "Request blocked by policy"}}

A governance policy is blocking the request. Check your tenant's active policies.

502 — PROVIDER_ERROR

{"error": {"code": "PROVIDER_ERROR", "message": "Upstream provider error"}}

The upstream provider (OpenAI, Anthropic, etc.) returned an error. Dvara will attempt failover if configured.