Python SDK integrations

DVARA is an AI governance platform with an OpenAI-compatible data plane. Every Python SDK below integrates by changing the base URL, and governance (policy, audit, PII, cost attribution) applies from the first call. The examples assume DVARA is running at http://localhost:8080 and that you have created a tenant and an API key via the Quickstart.

OpenAI Python

The most common integration. Works with any model routed through DVARA.

Install:

pip install openai

Basic chat completion:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one paragraph."},
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)

Streaming:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about APIs."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
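When the full text is also needed after streaming (for logging or post-processing), the deltas can be accumulated as they arrive. This is a minimal sketch, assuming chunks shaped like the ones in the loop above. `collect_stream` is a hypothetical helper, and the stand-in chunks built with `SimpleNamespace` exist only so the example runs without a network call. The guard for an empty `choices` list is defensive, since some OpenAI-compatible streams end with a usage-only chunk.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Print each text delta as it arrives and return the full text."""
    parts = []
    for chunk in stream:
        # Defensive: skip chunks that carry no choices (e.g. usage-only chunks).
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


def _fake_chunk(text):
    """Stand-in for a streamed chunk; for offline illustration only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [_fake_chunk("Hello"), _fake_chunk(None), _fake_chunk(", world")]
fake_stream.append(SimpleNamespace(choices=[]))  # usage-only style chunk
full_text = collect_stream(fake_stream)
```

With a live client, the `stream` object returned by `client.chat.completions.create(..., stream=True)` would be passed in place of `fake_stream`.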

Structured output (JSON schema):

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "List 3 planets with their diameter in km."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "planets",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "planets": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "diameter_km": {"type": "integer"},
                            },
                            "required": ["name", "diameter_km"],
                            # Strict mode requires closed objects.
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["planets"],
                "additionalProperties": False,
            },
        },
    },
)
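The reply to a `json_schema` request arrives as a JSON string in `message.content`, so it still has to be decoded before use. The sketch below assumes that shape. `parse_planets` is a hypothetical helper, and the sample string is written by hand for illustration rather than captured from a model.

```python
import json


def parse_planets(content):
    """Decode the JSON reply and return a list of (name, diameter_km) tuples."""
    payload = json.loads(content)
    planets = []
    for item in payload["planets"]:
        # Mirror the schema's required fields; a missing key raises KeyError.
        planets.append((item["name"], int(item["diameter_km"])))
    return planets


# Hand-written sample shaped like the schema above. With a live client you
# would pass response.choices[0].message.content instead.
sample = (
    '{"planets": ['
    '{"name": "Earth", "diameter_km": 12742}, '
    '{"name": "Mars", "diameter_km": 6779}]}'
)
for name, diameter in parse_planets(sample):
    print(f"{name}: {diameter:,} km")
```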

Tool calls / function calling:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)

if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
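Printing the call is only half of the loop: the function has to run locally, and its result goes back to the model as a `tool` message. The sketch below covers that dispatch step, assuming `arguments` is a JSON string as in the OpenAI chat completions format. `get_weather` here is a stub with a placeholder result, and `TOOL_REGISTRY` and `run_tool_call` are hypothetical names.

```python
import json


def get_weather(city):
    """Stub implementation; a real one would call a weather service."""
    return {"city": city, "forecast": "placeholder"}


# Maps the tool names advertised to the model onto local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(call_id, name, arguments_json):
    """Execute one tool call and build the follow-up `tool` message."""
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Unknown tool: {name}")
    kwargs = json.loads(arguments_json)  # arguments arrive as a JSON string
    result = TOOL_REGISTRY[name](**kwargs)
    return {
        "role": "tool",
        "tool_call_id": call_id,
        "content": json.dumps(result),
    }


# With a live response, the three arguments would come from tool_call.id,
# tool_call.function.name and tool_call.function.arguments.
message = run_tool_call("call_123", "get_weather", '{"city": "Tokyo"}')
print(message)
```

The returned dict is appended to `messages` after the assistant message that requested the call, and the conversation is sent again so the model can produce its final answer.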

Embeddings:

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog.",
)

print(f"Dimensions: {len(response.data[0].embedding)}")
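Embedding vectors are usually compared rather than inspected. Below is a minimal cosine-similarity sketch in plain Python. The short vectors are made-up stand-ins for `response.data[i].embedding`, which would be much longer, and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cannot compare a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v2))  # identical direction, close to 1
print(cosine_similarity(v1, v3))  # orthogonal vectors
```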

Using Claude via DVARA (same SDK):

# No SDK change — just switch the model name
response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello from Anthropic via DVARA!"}],
)

Using Gemini via DVARA (same SDK):

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello from Google via DVARA!"}],
)

Async client:

import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        base_url="http://localhost:8080/v1",
        api_key="your-dvara-api-key",
    )

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello async!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
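The async client pays off when several requests run concurrently. The sketch below shows bounded fan-out with `asyncio.Semaphore`. `run_bounded`, `make_job` and `fake_completion` are hypothetical names, and the fake coroutine stands in for an awaited `client.chat.completions.create(...)` call so the example runs offline.

```python
import asyncio


async def run_bounded(jobs, limit=4):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:
            return await job()

    # gather returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_completion(prompt):
    """Stand-in for an awaited chat completion call."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


def make_job(prompt):
    # Bind the prompt now so each job keeps its own value.
    return lambda: fake_completion(prompt)


prompts = ["one", "two", "three"]
results = asyncio.run(run_bounded([make_job(p) for p in prompts], limit=2))
print(results)
```

Capping concurrency on the client side keeps a burst of requests from arriving at the gateway all at once. A suitable limit depends on your deployment.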

Environment variable approach:

export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"

from openai import OpenAI

# No base_url or api_key needed: the SDK reads both from the environment
client = OpenAI()
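Reading the settings explicitly can make a misconfigured environment fail early rather than at the first request. The sketch below uses the same two variable names as above. `dvara_settings` is a hypothetical helper, and the fallback URL is the local address assumed throughout this page.

```python
import os

DEFAULT_BASE_URL = "http://localhost:8080/v1"  # local DVARA assumed by this page


def dvara_settings(env=None):
    """Return keyword arguments suitable for OpenAI(**settings)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    base_url = env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL)
    return {"base_url": base_url, "api_key": api_key}


# A plain dict stands in for os.environ so the example is reproducible.
settings = dvara_settings({"OPENAI_API_KEY": "your-dvara-api-key"})
print(settings)
```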

Anthropic Python

If you're already using the Anthropic SDK and want to route through DVARA, point it at DVARA's base URL. DVARA translates the OpenAI-compatible request to the Anthropic API format internally.

Recommended

Use the OpenAI SDK with model="claude-*" for the simplest integration. Use the Anthropic SDK approach only if you have existing Anthropic SDK code you don't want to change.

Install:

pip install anthropic

Using OpenAI SDK (recommended):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the Rust ownership model."},
    ],
    max_tokens=512,
)

AWS Bedrock

Instead of using boto3 with Bedrock's native API, use the OpenAI SDK pointed at DVARA. DVARA handles SigV4 signing and Bedrock API translation internally.

Before (boto3):

# Complex — requires AWS credentials, region config, Bedrock-specific API
import boto3
client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)

After (via DVARA):

# Simple — standard OpenAI SDK, DVARA handles Bedrock translation
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Hello from Bedrock via DVARA!"}],
)
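Migrating existing boto3 call sites also means reshaping messages: the Converse API nests text in a list of content blocks, while the OpenAI format takes a plain string. The sketch below converts text-only messages. `converse_to_openai` is a hypothetical helper, and non-text blocks (images, tool use) are deliberately rejected rather than guessed at.

```python
def converse_to_openai(messages):
    """Convert text-only Bedrock Converse messages to OpenAI chat messages."""
    converted = []
    for message in messages:
        texts = []
        for block in message["content"]:
            # Only plain {"text": ...} blocks are handled in this sketch.
            if set(block) != {"text"}:
                raise ValueError(f"Unsupported content block: {sorted(block)}")
            texts.append(block["text"])
        converted.append({"role": message["role"], "content": "\n".join(texts)})
    return converted


bedrock_messages = [{"role": "user", "content": [{"text": "Hello"}]}]
print(converse_to_openai(bedrock_messages))
```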

Google GenAI

Use the OpenAI SDK pointed at DVARA instead of the Google GenAI SDK. DVARA translates to the Gemini API format internally.

Before (google-genai):

from google import genai
client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Explain how AI works",
)

After (via DVARA):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Explain how AI works"}],
)

LiteLLM

LiteLLM already supports custom API bases. Point it at DVARA to centralize governance across every provider.

Install:

pip install litellm

Basic usage:

import litellm

response = litellm.completion(
    model="openai/gpt-4o",  # provider/model format
    api_base="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
    messages=[{"role": "user", "content": "Hello via LiteLLM!"}],
)

print(response.choices[0].message.content)

Using different models through DVARA:

# LiteLLM's "openai/" prefix only declares the SDK transport.
# The bare model name (gpt-4o, claude-sonnet-4-5, gemini-2.5-pro)
# is forwarded to DVARA verbatim — DVARA's own model-prefix
# routing then sends each to the right upstream.
for model in ["openai/gpt-4o", "openai/claude-sonnet-4-5", "openai/gemini-2.5-pro"]:
    response = litellm.completion(
        model=model,
        api_base="http://localhost:8080/v1",
        api_key="your-dvara-api-key",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(f"{model}: {response.choices[0].message.content[:50]}")
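The prefixing rule described in the comments can be wrapped in a small helper so call sites keep using bare model names. This is a sketch only, and `via_dvara` is a hypothetical name.

```python
def via_dvara(model):
    """Prefix a bare model name with LiteLLM's "openai/" transport marker."""
    prefix = "openai/"
    # Leave names that already carry the prefix untouched.
    return model if model.startswith(prefix) else prefix + model


for name in ["gpt-4o", "claude-sonnet-4-5", "openai/gemini-2.5-pro"]:
    print(via_dvara(name))
```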

Environment variable approach:

export OPENAI_API_BASE="http://localhost:8080/v1"
export OPENAI_API_KEY="your-dvara-api-key"

LangChain

LangChain's ChatOpenAI class accepts a custom base URL.

Install:

pip install langchain-openai

Basic chat:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
    temperature=0.7,
)

response = llm.invoke("What is the capital of France?")
print(response.content)

With chains:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}. Answer concisely."),
    ("human", "{question}"),
])

chain = prompt | llm
response = chain.invoke({"role": "historian", "question": "Who built the pyramids?"})
print(response.content)

Streaming with chains:

for chunk in chain.stream({"role": "poet", "question": "Write about the moon"}):
    print(chunk.content, end="")

With structured output:

from pydantic import BaseModel

class MovieReview(BaseModel):
    title: str
    rating: float
    summary: str

structured_llm = llm.with_structured_output(MovieReview)
review = structured_llm.invoke("Review the movie Inception")
print(f"{review.title}: {review.rating}/10 - {review.summary}")

RAG example (embeddings through DVARA):

from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

# Use embeddings with your vector store of choice (Chroma, Pinecone, Weaviate, etc.)

Agent with tools:

from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    # eval() runs arbitrary code; acceptable for a local demo only.
    return str(eval(expression))

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="your-dvara-api-key",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, [calculator], prompt)
executor = AgentExecutor(agent=agent, tools=[calculator])
result = executor.invoke({"input": "What is 42 * 17 + 3?"})
print(result["output"])
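`eval` executes arbitrary Python, and here the expression comes from model output. One possible replacement is a small arithmetic evaluator built on the standard `ast` module. This is a sketch of an alternative, not what the example above uses, and `safe_eval` is a hypothetical name.

```python
import ast
import operator

# Only these operators are allowed; anything else raises ValueError.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression):
    """Evaluate +, -, *, / over numeric literals without calling eval()."""

    def visit(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        raise ValueError("Unsupported expression")

    return visit(ast.parse(expression, mode="eval").body)


print(safe_eval("42 * 17 + 3"))
```

The tool body would then return `str(safe_eval(expression))`, and function calls, attribute access, or names in the expression are rejected with `ValueError`.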

Pydantic AI

Pydantic AI supports custom OpenAI-compatible endpoints.

Install:

pip install pydantic-ai

Basic usage:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "gpt-4o",
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

agent = Agent(model)

result = agent.run_sync("What is the capital of Japan?")
print(result.data)

With structured output:

from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

class CityInfo(BaseModel):
    name: str
    country: str
    population: int
    famous_for: list[str]

model = OpenAIModel(
    "gpt-4o",
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

agent = Agent(model, result_type=CityInfo)

result = agent.run_sync("Tell me about Tokyo")
city = result.data
print(f"{city.name}, {city.country} — pop: {city.population:,}")
print(f"Famous for: {', '.join(city.famous_for)}")

With tools:

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "gpt-4o",
    base_url="http://localhost:8080/v1",
    api_key="your-dvara-api-key",
)

agent = Agent(model, system_prompt="You help with math.")

@agent.tool_plain
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

result = agent.run_sync("What is 123 times 456?")
print(result.data)

Next steps