Getting Started with DVARA: Drop-In OpenAI Compatibility in 5 Minutes
DVARA is an AI governance platform. It governs every LLM call your teams make — policy, PII, budgets, audit — under one control plane, and the LLM Gateway component is fully OpenAI-compatible so governance kicks in on the very first request. Every OpenAI SDK — Python, Node, Go, Java, Rust — lets you override the base URL. That single line of configuration is all it takes to route your LLM traffic through DVARA and unlock governance, multi-provider routing, and observability without touching your application code.
This guide walks you through starting DVARA, sending your first governed request, switching providers, and verifying that everything works — all in under five minutes.
What you get by pointing your SDK at DVARA
Before diving into setup, here's what changes when your SDK talks to DVARA instead of directly to OpenAI:
- Policy enforcement on every call: DENY and WARN_AGENT rules across models, budgets, time-of-day, and data residency, enforced in-line before any provider call. Policies are scoped per-tenant (or platform-global) at the entity level; SHADOW policy status lets you test a candidate policy against real traffic before promoting to ACTIVE.
- PII scanning and redaction: every prompt is scanned for sensitive data (SSNs, credit cards, emails, medical record numbers) before it leaves your network, with per-tenant BLOCK / REDACT / LOG actions.
- Immutable audit trail: every request is logged with tenant, model, provider, token counts, and latency, HMAC-signed and hash-chained for tamper-evidence.
- Budget caps that enforce: hard and soft limits in dollars, not just requests per second. Hard breaches reject the request with 402; soft breaches trigger automatic model downgrade.
- Multi-provider routing: send model claude-sonnet-4-5 and DVARA routes to Anthropic, model gpt-4o routes to OpenAI, model gemini-2.5-pro routes to Google, all through the same endpoint.
- Failover: if a provider returns a 5xx, DVARA retries with a healthy alternative automatically.
And none of this requires a single line of application code change. Your SDK still sees standard OpenAI-format responses.
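Because a hard budget breach surfaces as HTTP 402, calling code can treat it differently from other failures. The sketch below shows one way to do that using only the Python standard library, assuming a gateway is already running at http://localhost:8080 (see Step 1). The BudgetExceeded class and the classify_status helper are hypothetical names introduced for illustration, and nothing here assumes a particular shape for the error body DVARA returns.

```python
import json
import urllib.error
import urllib.request

# Assumed local gateway address, as used throughout this guide.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"


class BudgetExceeded(Exception):
    """Hypothetical error raised when the gateway answers with HTTP 402."""


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse outcome label (illustrative)."""
    if status == 402:
        return "budget_exceeded"  # hard budget cap, as described above
    if 500 <= status <= 599:
        return "server_error"
    if 400 <= status <= 499:
        return "client_error"
    return "ok"


def chat(prompt: str, model: str = "gpt-4o", api_key: str = "any-key") -> str:
    """Send one chat completion request and return the reply text."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    request = urllib.request.Request(
        GATEWAY_URL,
        data=payload,  # supplying data makes this a POST
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        if classify_status(err.code) == "budget_exceeded":
            raise BudgetExceeded(f"hard budget cap reached for {model}") from err
        raise  # leave every other HTTP error untouched
    return body["choices"][0]["message"]["content"]
```

A caller could catch BudgetExceeded to show a clear message or queue the work for later, while letting other errors propagate as usual.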
Step 1: Start the Gateway
docker run -d --name dvara \
-p 8080:8080 \
-e OPENAI_API_KEY=sk-your-key \
dvarahq/dvara-gateway:latest
The gateway starts on port 8080. Verify it's running:
curl http://localhost:8080/actuator/health
You should see {"status":"UP"}.
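In scripts or CI, polling the health endpoint tends to be more reliable than a fixed wait. The following is a minimal sketch using only the standard library; the function names, timeout, and polling interval are illustrative choices and not part of DVARA.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8080/actuator/health"


def is_up(raw_body: bytes) -> bool:
    """Return True if a health response body reports status UP."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def wait_for_gateway(url: str = HEALTH_URL, timeout_s: float = 60.0,
                     interval_s: float = 2.0) -> bool:
    """Poll the health endpoint until it reports UP or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if is_up(response.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # gateway not ready yet; try again after the interval
        time.sleep(interval_s)
    return False
```

Calling wait_for_gateway() right after docker run returns True once the gateway answers with {"status":"UP"}, or False if it never does within the timeout.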
Step 2: Send Your First Request
Since DVARA is fully OpenAI-compatible, point any OpenAI SDK at it.
:::tip Why api_key is still needed in the SDK
The provider API keys you set on the gateway (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) are used by the gateway to authenticate with upstream providers. Your application never sees them.
The api_key in the SDK client is a separate concern — it's the key your application uses to authenticate with the gateway itself. In production, you'd create a tenant and issue a DVARA API key (prefixed gw_) through the DVARA Flightdeck. For local development without tenants configured, the gateway accepts any value, so "any-key" works as a placeholder.
:::
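One way to keep the placeholder out of production code is to read the gateway key from the environment and fall back to the placeholder only for local development. The environment variable name DVARA_API_KEY is an assumption made for this sketch, not a documented DVARA setting; the gw_ prefix check follows the description in the tip above.

```python
import os

PLACEHOLDER_KEY = "any-key"  # only suitable when no tenants are configured


def resolve_gateway_key(env=None) -> str:
    """Pick the key the application presents to the gateway.

    DVARA_API_KEY is a hypothetical variable name chosen for this example.
    """
    env = os.environ if env is None else env
    key = env.get("DVARA_API_KEY", "").strip()
    return key or PLACEHOLDER_KEY


def looks_like_tenant_key(key: str) -> bool:
    """Tenant keys issued through Flightdeck are described as prefixed gw_."""
    return key.startswith("gw_")
```

The result of resolve_gateway_key() can be passed as the api_key argument of the SDK client in the examples that follow.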
Python
First, install the OpenAI Python SDK (requires Python 3.8+):
pip install openai
Create a file called dvara_test.py:
from openai import OpenAI
client = OpenAI(
api_key="any-key", # DVARA API key — use a real gw_ key in production
base_url="http://localhost:8080/v1"
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "What is the capital of India?"}]
)
print(response.choices[0].message.content)
Run it:
python dvara_test.py
You should see the model's response printed to your terminal.
TypeScript / Node.js
First, install the OpenAI Node.js SDK (requires Node.js 18+):
npm install openai
Create a file called dvara_test.mjs:
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "any-key", // DVARA API key — use a real gw_ key in production
baseURL: "http://localhost:8080/v1",
});
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "What is the capital of India?" }],
});
console.log(response.choices[0].message.content);
Run it:
node dvara_test.mjs
You should see the model's response printed to your terminal.
curl
curl http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer any-key" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "What is the capital of India?"}]
}'
The response is a standard OpenAI chat completion object. Your application code doesn't know it's talking to a gateway.
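Since the response follows the OpenAI chat completion schema, the same fields can be read regardless of which provider served the request. The sketch below pulls the reply text and token usage out of a parsed response. The sample dictionary is invented illustrative data in the standard schema, not captured gateway output, and summarize_completion is a hypothetical helper name.

```python
def summarize_completion(completion: dict) -> dict:
    """Extract the commonly used fields from a chat completion object."""
    choice = completion["choices"][0]
    usage = completion.get("usage") or {}
    return {
        "model": completion.get("model"),
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


# Illustrative sample in the OpenAI chat completion shape (values invented).
sample = {
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "New Delhi."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17},
}

print(summarize_completion(sample))
```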
Step 3: Add More Providers
The real power of an AI gateway appears when you add a second provider. The same SDK client, the same endpoint, the same code — just a different model name.
3.1 Stop the running gateway
docker stop dvara && docker rm dvara
3.2 Restart with both provider keys
docker run -d --name dvara \
-p 8080:8080 \
-e OPENAI_API_KEY=sk-your-key \
-e ANTHROPIC_API_KEY=sk-ant-your-key \
dvarahq/dvara-gateway:latest
Wait ~15 seconds for the gateway to start, then verify it's up:
curl -s http://localhost:8080/actuator/health
You should see {"status":"UP"}. To inspect the registered providers, use the live /v1/models listing — it queries each registered provider in turn:
curl -s http://localhost:8080/v1/models -H "Authorization: Bearer gw_..."
(For the operator-only structured status payload, which includes providers[], routes[], and the license envelope, see /actuator/gateway-status, which requires Authorization: Bearer $DVARA_ACTUATOR_API_KEY.)
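The /v1/models listing can also be fetched from Python. This sketch assumes the endpoint returns the OpenAI list shape (an object with a data array whose entries carry an id). That shape is inferred from the gateway's OpenAI compatibility rather than confirmed here, and the helper names are illustrative.

```python
import json
import urllib.request

MODELS_URL = "http://localhost:8080/v1/models"


def model_ids(payload: dict) -> list:
    """Return sorted model ids from an OpenAI-style model list payload."""
    return sorted(
        entry["id"] for entry in payload.get("data", []) if "id" in entry
    )


def fetch_model_ids(api_key: str, url: str = MODELS_URL) -> list:
    """Call the models endpoint and return the ids it reports."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```

With both provider keys set, the returned ids would be expected to include models from each registered provider.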
3.3 Send the same prompt to both providers
Create a file called multi_provider.py:
from openai import OpenAI
client = OpenAI(
api_key="any-key",
base_url="http://localhost:8080/v1"
)
# Routes to OpenAI
print("=== OpenAI (gpt-4o) ===")
openai_response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Explain virtual threads in Java in 2 sentences"}]
)
print(openai_response.choices[0].message.content)
# Routes to Anthropic — same client, same endpoint, same format
print("\n=== Anthropic (claude-sonnet-4-5) ===")
anthropic_response = client.chat.completions.create(
model="claude-sonnet-4-5",
messages=[{"role": "user", "content": "Explain virtual threads in Java in 2 sentences"}]
)
print(anthropic_response.choices[0].message.content)
Run it:
python multi_provider.py
Output:
=== OpenAI (gpt-4o) ===
Virtual threads in Java, introduced in Project Loom, are lightweight threads
managed by the JVM rather than the OS. They allow applications to handle
millions of concurrent tasks without the memory overhead of platform threads.
=== Anthropic (claude-sonnet-4-5) ===
Virtual threads are lightweight threads introduced in Java 21 through Project
Loom that are managed by the JVM rather than the operating system. They enable
writing simple blocking code that scales to millions of concurrent operations
without the complexity of reactive programming.
Notice: your code didn't change between providers. DVARA routes by model prefix — gpt goes to OpenAI, claude goes to Anthropic — and translates the request/response format automatically.
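Because only the model name varies, the comparison can also be written as a loop over model names. The sketch below assumes a client object exposing the OpenAI SDK's chat.completions.create method; ask_each is an illustrative helper name, and the SDK import sits inside main() so the helper can be exercised without the SDK installed.

```python
def ask_each(client, models, prompt):
    """Send the same prompt to each model and collect replies by model name."""
    replies = {}
    for model in models:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        replies[model] = response.choices[0].message.content
    return replies


def main():
    # Imported here so ask_each can be reused or tested without the SDK.
    from openai import OpenAI

    client = OpenAI(api_key="any-key", base_url="http://localhost:8080/v1")
    prompt = "Explain virtual threads in Java in 2 sentences"
    replies = ask_each(client, ["gpt-4o", "claude-sonnet-4-5"], prompt)
    for model, text in replies.items():
        print(f"=== {model} ===\n{text}\n")
```

With the gateway running and both provider keys set, calling main() prints one section per model. Adding another provider to the comparison is then a matter of appending its model name to the list.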
Supported providers
| Model prefix | Provider | Environment variable |
|---|---|---|
| gpt | OpenAI | OPENAI_API_KEY |
| claude | Anthropic | ANTHROPIC_API_KEY |
| gemini | Google Gemini | GEMINI_API_KEY |
| azure/ | Azure OpenAI | AZURE_OPENAI_API_KEY + AZURE_OPENAI_BASE_URL |
| mistral | Mistral | MISTRAL_API_KEY |
| command | Cohere | COHERE_API_KEY |
| groq/ | Groq | GROQ_API_KEY |
| ollama/ | Ollama (local) | OLLAMA_ENABLED=true |
Add any provider by setting its env var and restarting the container. No config files needed.
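The prefix rule in the table can be pictured as a simple lookup. The sketch below is a way to reason about which environment settings a given model name needs. It is not DVARA's actual routing code, and the function name and longest-prefix-first ordering are assumptions of this example.

```python
# Prefix -> (provider, required environment settings), copied from the table.
PROVIDERS = {
    "gpt": ("OpenAI", ["OPENAI_API_KEY"]),
    "claude": ("Anthropic", ["ANTHROPIC_API_KEY"]),
    "gemini": ("Google Gemini", ["GEMINI_API_KEY"]),
    "azure/": ("Azure OpenAI", ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_BASE_URL"]),
    "mistral": ("Mistral", ["MISTRAL_API_KEY"]),
    "command": ("Cohere", ["COHERE_API_KEY"]),
    "groq/": ("Groq", ["GROQ_API_KEY"]),
    "ollama/": ("Ollama (local)", ["OLLAMA_ENABLED"]),
}


def provider_for(model: str):
    """Return (provider, env settings) for a model name, or None if no prefix matches."""
    # Check longer prefixes first so a more specific prefix would win.
    for prefix in sorted(PROVIDERS, key=len, reverse=True):
        if model.startswith(prefix):
            return PROVIDERS[prefix]
    return None
```

For example, provider_for("claude-sonnet-4-5") points at Anthropic and ANTHROPIC_API_KEY, which is the variable to set before restarting the container.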
What's Next
You now have a running AI gateway that routes to multiple providers through a single SDK.
From here, jump straight to the docs:
- Configure routing — round-robin, weighted, latency-aware, or cost-aware routing across providers.
- Enable PII scanning — detect and redact sensitive data in prompts before they leave your network.
- Set budget caps — prevent bill shock with per-tenant token budgets and automatic model downgrade.
- Deploy with Docker Compose or Helm — production-ready deployment with PostgreSQL persistence.
- Browse all SDK examples — Python, TypeScript, Java integration walkthroughs.