Getting Started with DVARA: Drop-In OpenAI Compatibility in 5 Minutes

April 6, 2026 · 7 min read

DVARA is an AI governance platform. It governs every LLM call your teams make — policy, PII, budgets, audit — under one control plane, and the LLM Gateway component is fully OpenAI-compatible so governance kicks in on the very first request. Every OpenAI SDK — Python, Node, Go, Java, Rust — lets you override the base URL. That single line of configuration is all it takes to route your LLM traffic through DVARA and unlock governance, multi-provider routing, and observability without touching your application code.

This guide walks you through starting DVARA, sending your first governed request, switching providers, and verifying that everything works — all in under five minutes.

What you get by pointing your SDK at DVARA

Before diving into setup, here's what changes when your SDK talks to DVARA instead of directly to OpenAI:

Policy enforcement on every call — DENY and WARN_AGENT rules across models, budgets, time-of-day, and data residency, enforced in-line before any provider call. Policies are scoped per-tenant (or platform-global) at the entity level; SHADOW policy status lets you test a candidate policy against real traffic before promoting to ACTIVE.
PII scanning and redaction — every prompt is scanned for sensitive data (SSNs, credit cards, emails, medical record numbers) before it leaves your network, with per-tenant BLOCK / REDACT / LOG actions.
Immutable audit trail — every request is logged with tenant, model, provider, token counts, and latency, HMAC-signed and hash-chained for tamper-evidence.
Budget caps that enforce — hard and soft limits in dollars, not just requests per second. Hard breaches reject the request with 402; soft breaches trigger automatic model downgrade.
Multi-provider routing — send model: claude-sonnet-4-5 and DVARA routes to Anthropic, model: gpt-4o routes to OpenAI, model: gemini-2.5-pro routes to Google, all through the same endpoint.
Failover — if a provider returns a 5xx, DVARA retries with a healthy alternative automatically.

And none of this requires a single line of application code change. Your SDK still sees standard OpenAI-format responses.
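Application code stays the same, but a caller may still want to tell a hard budget rejection apart from other failures. The following is a minimal stdlib-only sketch that assumes only what the list above states, namely that a hard breach is rejected with HTTP 402. The `BudgetExceeded` class and both helper names are hypothetical, and the 402 response body is not inspected because its shape is not described here.

```python
import json
import urllib.error
import urllib.request


class BudgetExceeded(Exception):
    """Hypothetical error type for a 402 (hard budget breach) from the gateway."""


def build_chat_request(base_url, api_key, model, prompt):
    """Build an OpenAI-format chat completion request aimed at the gateway."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(request, opener=urllib.request.urlopen):
    """Send the request and translate HTTP 402 into BudgetExceeded.

    `opener` is injectable so the error path can be exercised without a network.
    """
    try:
        with opener(request) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise BudgetExceeded("hard budget cap reached") from err
        raise  # any other status is left for the caller to handle
```

With a local gateway, `send_chat(build_chat_request("http://localhost:8080/v1", "any-key", "gpt-4o", "hello"))` would return the parsed completion, or raise `BudgetExceeded` once a hard cap is hit.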

Step 1: Start the Gateway

docker run -d --name dvara \
  -p 8080:8080 \
  -e OPENAI_API_KEY=sk-your-key \
  ghcr.io/dvarahq/dvara/dvara-llm-gateway:1.0.0

The gateway listens on port 8080. Startup can take around 15 seconds; once it's up, verify it's running:

curl http://localhost:8080/actuator/health

You should see {"status":"UP"}.
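A container started with `docker run -d` returns before the service inside it is ready, so scripts often poll the health endpoint instead of sleeping for a fixed time. The sketch below uses only the Python standard library and the `/actuator/health` URL shown above. The function names, the attempt count, and the delay are illustrative choices rather than anything DVARA prescribes.

```python
import json
import time
import urllib.request


def fetch_health(url):
    """Return the raw body of the health endpoint, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.read().decode("utf-8")
    except OSError:  # URLError, refused connections, and timeouts are OSError subclasses
        return None


def is_up(body):
    """True when the body is JSON whose "status" field equals "UP"."""
    if body is None:
        return False
    try:
        parsed = json.loads(body)
    except ValueError:
        return False
    return isinstance(parsed, dict) and parsed.get("status") == "UP"


def wait_for_gateway(url, attempts=30, delay=1.0, fetch=fetch_health, sleep=time.sleep):
    """Poll until the gateway reports UP; return False if it never does."""
    for _ in range(attempts):
        if is_up(fetch(url)):
            return True
        sleep(delay)
    return False
```

Calling `wait_for_gateway("http://localhost:8080/actuator/health")` returns `True` as soon as the gateway answers with `{"status":"UP"}`, and `False` if all attempts are used up first.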

Step 2: Send Your First Request

Since DVARA is fully OpenAI-compatible, point any OpenAI SDK at it.

:::tip Why api_key is still needed in the SDK

The provider API keys you set on the gateway (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) are used by the gateway to authenticate with upstream providers. Your application never sees them.

The api_key in the SDK client is a separate concern: it's the key your application uses to authenticate with the gateway itself. In production, you'd create a tenant and issue a DVARA API key (prefixed gw_) through the DVARA Flightdeck. For local development without tenants configured, the gateway accepts any value, so "any-key" works as a placeholder.

:::
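One way to keep the placeholder out of production is to resolve the client key from the environment and check the `gw_` prefix before use. This is only a sketch: `DVARA_API_KEY` is a hypothetical variable name chosen for illustration, and the `production` flag stands in for however your application distinguishes environments.

```python
import os

PLACEHOLDER_KEY = "any-key"  # accepted only when no tenants are configured


def resolve_gateway_key(env=None, production=False):
    """Pick the key the SDK sends to the gateway.

    DVARA_API_KEY is a hypothetical variable name used for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get("DVARA_API_KEY", "")
    if production:
        # The post states that real DVARA keys are prefixed gw_.
        if not key.startswith("gw_"):
            raise ValueError("production needs a DVARA key prefixed gw_")
        return key
    return key or PLACEHOLDER_KEY
```

The returned value can be passed as `api_key` to the SDK clients in the examples that follow.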

Python

First, install the OpenAI Python SDK (requires Python 3.8+):

pip install openai

Create a file called dvara_test.py:

from openai import OpenAI

client = OpenAI(
    api_key="any-key",  # DVARA API key — use a real gw_ key in production
    base_url="http://localhost:8080/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of India?"}]
)
print(response.choices[0].message.content)

Run it:

python dvara_test.py

You should see the model's response printed to your terminal.

TypeScript / Node.js

First, install the OpenAI Node.js SDK (requires Node.js 18+):

npm install openai

Create a file called dvara_test.mjs:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "any-key", // DVARA API key — use a real gw_ key in production
  baseURL: "http://localhost:8080/v1",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What is the capital of India?" }],
});
console.log(response.choices[0].message.content);

Run it:

node dvara_test.mjs

You should see the model's response printed to your terminal.

curl

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer any-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of India?"}]
  }'

The response is a standard OpenAI chat completion object. Your application code doesn't know it's talking to a gateway.
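Because the body follows the OpenAI chat completion format, the same parsing works whichever provider served the request. The sketch below pulls out the reply text and token counts using the standard field names (`choices`, `message`, `usage`). The `SAMPLE` body is hand-written for illustration and is not captured gateway output; `summarize_completion` is a hypothetical helper name.

```python
import json


def summarize_completion(raw):
    """Extract the reply text and token usage from a chat completion body."""
    data = json.loads(raw)
    usage = data.get("usage") or {}
    return {
        "model": data.get("model"),
        "text": data["choices"][0]["message"]["content"],
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
    }


# Hand-written illustrative body, not real gateway output.
SAMPLE = json.dumps({
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "New Delhi."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 14, "completion_tokens": 3, "total_tokens": 17},
})

print(summarize_completion(SAMPLE)["text"])  # prints: New Delhi.
```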

Step 3: Add More Providers

The real power of an AI gateway appears when you add a second provider. The same SDK client, the same endpoint, the same code — just a different model name.

3.1 Stop the running gateway

docker stop dvara && docker rm dvara

3.2 Restart with both provider keys

docker run -d --name dvara \
  -p 8080:8080 \
  -e OPENAI_API_KEY=sk-your-key \
  -e ANTHROPIC_API_KEY=sk-ant-your-key \
  ghcr.io/dvarahq/dvara/dvara-llm-gateway:1.0.0

Wait ~15 seconds for the gateway to start, then verify it's up:

curl -s http://localhost:8080/actuator/health

You should see {"status":"UP"}. To inspect the registered providers, use the live /v1/models listing — it queries each registered provider in turn:

curl -s http://localhost:8080/v1/models -H "Authorization: Bearer gw_..."

(For the operator-only structured status payload, which covers providers[], routes[], and the license envelope, see /actuator/gateway-status. It requires Authorization: Bearer $DVARA_ACTUATOR_API_KEY.)
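To confirm that both providers registered, a script can check the listing for the models it intends to call. This sketch assumes the gateway returns the standard OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`) and that the listed ids match the names used in requests; neither detail is confirmed above, and both function names are hypothetical.

```python
import json


def listed_model_ids(raw):
    """Return the sorted model ids from a /v1/models response body."""
    payload = json.loads(raw)
    return sorted(entry["id"] for entry in payload.get("data", []))


def missing_models(raw, expected):
    """Return the expected model ids that the listing does not contain."""
    present = set(listed_model_ids(raw))
    return [model for model in expected if model not in present]
```

For example, passing the saved curl output and `["gpt-4o", "claude-sonnet-4-5"]` to `missing_models` yields an empty list when both providers are serving those models.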

3.3 Send the same prompt to both providers

Create a file called multi_provider.py:

from openai import OpenAI

client = OpenAI(
    api_key="any-key",
    base_url="http://localhost:8080/v1"
)

# Routes to OpenAI
print("=== OpenAI (gpt-4o) ===")
openai_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain virtual threads in Java in 2 sentences"}]
)
print(openai_response.choices[0].message.content)

# Routes to Anthropic — same client, same endpoint, same format
print("\n=== Anthropic (claude-sonnet-4-5) ===")
anthropic_response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Explain virtual threads in Java in 2 sentences"}]
)
print(anthropic_response.choices[0].message.content)

Run it:

python multi_provider.py

Example output (model responses vary from run to run):

=== OpenAI (gpt-4o) ===
Virtual threads in Java, introduced in Project Loom, are lightweight threads
managed by the JVM rather than the OS. They allow applications to handle
millions of concurrent tasks without the memory overhead of platform threads.

=== Anthropic (claude-sonnet-4-5) ===
Virtual threads are lightweight threads introduced in Java 21 through Project
Loom that are managed by the JVM rather than the operating system. They enable
writing simple blocking code that scales to millions of concurrent operations
without the complexity of reactive programming.

Notice: your code didn't change between providers. DVARA routes by model prefix — gpt goes to OpenAI, claude goes to Anthropic — and translates the request/response format automatically.

Supported providers

Model prefix	Provider	Environment variable
`gpt`, `o1`, `o3`, `o4`, `chatgpt`	OpenAI	`OPENAI_API_KEY`
`claude`	Anthropic	`ANTHROPIC_API_KEY`
`gemini`	Google Gemini	`GEMINI_API_KEY`
`bedrock/`	AWS Bedrock	`BEDROCK_ENABLED=true` + `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`
`azure/`	Azure OpenAI	`AZURE_OPENAI_API_KEY` + `AZURE_OPENAI_BASE_URL`
`mistral`	Mistral	`MISTRAL_API_KEY`
`command`	Cohere	`COHERE_API_KEY`
`groq/`	Groq	`GROQ_API_KEY`
`qwen`	Alibaba Qwen	`QWEN_API_KEY`
`deepseek`	DeepSeek	`DEEPSEEK_API_KEY`
`moonshot`	Moonshot (Kimi)	`MOONSHOT_API_KEY`
`glm`	Zhipu ChatGLM	`ZHIPU_API_KEY`
`grok`	xAI Grok	`XAI_API_KEY`
`ollama/`	Ollama (local)	`OLLAMA_ENABLED=true`

Add any provider by setting its env var and restarting the container. No config files needed.
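As a mental model of prefix routing, the table can be written as a small client-side lookup. This is a sketch built from the table rows only, not the gateway's implementation; how DVARA handles unknown or overlapping prefixes is not described here, so the function simply returns `None` for anything it does not recognize. `ROUTES` and `provider_for` are hypothetical names.

```python
# Prefix groups copied from the supported-providers table.
ROUTES = [
    (("gpt", "o1", "o3", "o4", "chatgpt"), "OpenAI"),
    (("claude",), "Anthropic"),
    (("gemini",), "Google Gemini"),
    (("bedrock/",), "AWS Bedrock"),
    (("azure/",), "Azure OpenAI"),
    (("mistral",), "Mistral"),
    (("command",), "Cohere"),
    (("groq/",), "Groq"),
    (("qwen",), "Alibaba Qwen"),
    (("deepseek",), "DeepSeek"),
    (("moonshot",), "Moonshot (Kimi)"),
    (("glm",), "Zhipu ChatGLM"),
    (("grok",), "xAI Grok"),
    (("ollama/",), "Ollama (local)"),
]


def provider_for(model):
    """Return the provider whose prefix matches the model name, else None."""
    for prefixes, provider in ROUTES:
        if model.startswith(prefixes):  # str.startswith accepts a tuple of prefixes
            return provider
    return None


print(provider_for("claude-sonnet-4-5"))  # prints: Anthropic
```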

What's Next

You now have a running AI gateway that routes to multiple providers through a single SDK client. To go further, jump straight to the docs:

LLM Gateway overview — the pillar guide: one API in front of every model, with routing, fallback, cost control, and observability.
Configure routing — round-robin, weighted, latency-aware, or cost-aware routing across providers.
Enable PII scanning — detect and redact sensitive data in prompts before they leave your network.
Set budget caps — prevent bill shock with per-tenant token budgets and automatic model downgrade.
Deploy with Docker Compose or Helm — production-ready deployment with PostgreSQL persistence.
Browse all SDK examples — Python, TypeScript, Java integration walkthroughs.
