Version: Latest (1.7.x dev)

What is DVARA?

DVARA is an AI governance platform. It governs every AI call your teams make — model calls to LLM providers and tool calls from AI agents — under a single control plane with Policy-as-Code, immutable audit, PII redaction, human-in-the-loop approval gates, and per-tenant cost attribution.

Where most AI infrastructure treats governance as a feature added to a proxy, DVARA treats governance as the architecture. The DVARA LLM Gateway at the network edge, the DVARA MCP Proxy inside the perimeter, and the opt-in DVARA A2A Proxy for agent-to-agent hops are governed data planes for one governance model — the same policy discipline, audit trail, RBAC, and compliance reports cover all of them.

Drop-in OpenAI compatibility means governance kicks in on call one. Point any OpenAI SDK at the DVARA LLM Gateway and every request is policy-evaluated, PII-scanned, budget-enforced, audited, and cost-attributed before it leaves your perimeter. Provider failover, multi-region routing, structured outputs, and semantic caching are all there too — but they exist to serve the governance story, not the other way round.

Who this is for

DVARA is for teams where AI governance is a compliance obligation, not a nice-to-have:

Regulated enterprises (HIPAA, SOC2, PCI-DSS, GDPR, public sector) — PII in prompts is a breach event and every AI action needs an audit trail.
Startups landing enterprise contracts — where SOC 2 / HIPAA / SSO turn up as buying criteria in the first serious deal, and shipping a tamper-evident audit trail + Policy-as-Code beats explaining you'll add governance once you have headcount.
Multi-tenant SaaS vendors — embedding AI into products they sell, where each customer needs its own policy, cost attribution, and isolation.
Teams running agentic workflows — where MCP-style tool calls and autonomous loops need policy enforcement, loop detection with kill switches, and human approval gates at execution, not after the fact.
Platform engineers — owning the sprawl of teams calling LLMs and agents directly with no governance, no cost visibility, and no audit trail.
Compliance officers and CISOs — who need a real answer to "prove PII never left the network" and "show me an immutable record of every AI action".

Key capabilities

Governance and guardrails

Policy-as-Code — YAML DENY / WARN_AGENT rules with conditions on models, tools, time-of-day, data residency, budget utilization, and MCP operations. For patterns the closed-form conditions can't compose (OR, NOT, arithmetic), rules can use a per-rule CEL expression: field instead — sandboxed by construction, type-checked at compile, the same expression language Kubernetes admission policies use. Policies are scoped per-tenant (or platform-global) at the entity level, with load-time conflict detection and a dry-run mode that evaluates the candidate policy against a provided context before activating. SHADOW policy status runs a policy against live traffic to compare its decisions against the active set before promotion to ACTIVE.
Immutable audit trail — every event is cryptographically signed and hash-chained at write time, with SIEM export (Kafka, Splunk HEC, CloudWatch) and on-demand SOC2 / HIPAA / GDPR / India RBI / SEBI reports.
PII detection and redaction — checksum-validated regex always on (Luhn-validated credit cards, SSN, email, phone, IP, etc.) plus the in-process embedded scanner with 22 deterministic filter types auto-enabled (and 8 additional dictionary-based filters opt-in per tenant), and optional Microsoft Presidio NER as an additive layer; per-tenant BLOCK / REDACT / LOG and authorised de-tokenization.
Input and output guardrails — coverage across 7 of the 10 OWASP LLM Top 10 categories: prompt injection (LLM01), sensitive information disclosure (LLM02), improper output handling (LLM05), excessive agency (LLM06), system-prompt leakage (LLM07), misinformation (LLM09 — grounding / hallucination checks), and unbounded consumption (LLM10), plus content filters and per-tenant denylists. Extensible with your own scanner via a signed webhook plugin.
Data residency as routing enforcement — per-tenant region pinning enforced at the routing layer, so a non-compliant request fails before it leaves — with an audit record of the decision.

Agentic governance

MCP Proxy — a governed component inside your perimeter that policy-checks every tool call, PII-scans the arguments and the response, audits before and after, and gates high-risk tools for human approval. The agent can't bypass it — DVARA holds the credential to the MCP server.
Agent loop detection — a per-session detector with three patterns (same-tool repetition, A→B cycles, calls-per-minute rate) terminates runaway loops before they burn budget, with the reason logged to the audit trail.
End-to-end agent trace — one trace ID spans every LLM turn and tool call, showing which turn triggered which call and which policy fired.

FinOps and cost enforcement

Per-tenant cost attribution — real-time token counting and USD conversion against a versioned pricing table; every request is a cost line item tagged by tenant, API key, model, and provider.
Budget caps that enforce — hard and soft limits at the global, tenant, and API-key level; hard breaches reject the request, soft breaches alert and auto-downgrade the model. Forecasts, anomaly detection, and monthly chargeback reports are included.

Multi-tenancy and access control

DVARA is multi-tenant from the ground up: all tenants share one fleet — no per-tenant pods, sidecars, or runtime sharding — with isolated API keys, policies, budgets, audit trails, usage data, and credentials enforced at every layer.

Two auth boundaries — the data plane authenticates each request by API key (which resolves the tenant, with no extra headers); the admin API uses built-in email/password (default), OIDC, or SAML 2.0 SSO.
Six-role RBAC — three platform roles (owner, policy-admin, billing-admin) and three tenant roles (admin, developer, viewer).
BYOK provider credentials — tenants bring their own provider keys, encrypted per tenant and rotatable without restart; cross-tenant leakage is impossible by construction.
Network controls — per-tenant IP allowlists, per-provider mTLS, TLS 1.3 on every outbound connection, and vault integration (HashiCorp, AWS Secrets Manager, Azure Key Vault).

Drop-in compatibility and routing

These mechanisms make the governance possible — the LLM Gateway speaks OpenAI, so the governance layer lands in front of existing code with zero changes.

OpenAI-compatible API — one endpoint for every provider; switch models by changing a single field, no SDK rewrites. Beyond chat completions and embeddings, the data plane serves the OpenAI Responses API (/v1/responses), the Batch API (/v1/batches + /v1/files), and function-calling passthrough — all through the same governance pipeline.
Multi-provider routing + failover — fourteen providers with many routing strategies (round-robin, weighted, latency-aware, cost-aware, capability-aware, canary, A/B, shadow, geo-aware) and circuit-breaker failover to a healthy alternative.
Structured outputs — response_format: json_schema works identically across OpenAI, Anthropic, Gemini, Bedrock, Mistral, Azure, and Grok, even where the provider has no native support. Semantic caching and shadow traffic round out the routing layer.

Platform and operations

DVARA Flightdeck and tenant portal — a web console for platform admins and a self-service portal where tenant users manage their own keys, credentials, usage, and team.
GitOps config-as-code — export the full configuration as JSON and import it with merge / replace and a dry-run preview.
Minimal infrastructure — PostgreSQL is the only external dependency; rate limiting, caching, and cross-node coordination are built in (no Redis, no message broker).

Supported providers

DVARA routes to fourteen first-class providers, each registering automatically when its credentials are present and matched by model prefix. Tenants can also bring their own keys (BYOK) through the Flightdeck.

OpenAI — gpt-*, o1-*, o3-*, o4-*, chatgpt-*
Anthropic — claude-*
Google Gemini — gemini-*
AWS Bedrock — bedrock/*
Azure OpenAI — azure/*
Mistral — mistral*
Cohere — command*
Groq — groq/*
xAI Grok — grok*
Alibaba Qwen — qwen*
DeepSeek — deepseek*
Moonshot (Kimi) — moonshot*
Zhipu ChatGLM — glm*
Ollama (local / self-hosted) — ollama/*

Plus a built-in Mock provider (mock/*) for CI and local development — dev / CI only; a startup WARN fires if it's enabled on a production profile.

See Providers for the full activation and capability matrix, Mock Provider for the file-backed scenario system, and GET /v1/models for the live list of models in your deployment.

Architecture

The platform ships as three services plus PostgreSQL — the LLM Gateway and the MCP Proxy are governed data planes, the DVARA Flightdeck is the control plane (REST API, platform dashboard, and tenant self-service portal). Every request walks the same policy / PII / guardrail / budget / routing / audit pipeline before the single upstream hop.

See Platform architecture for the system diagram, the request lifecycle, and a table mapping every governance feature to the pipeline stage it runs in.

What's next

Quickstart — run DVARA locally in under five minutes.
Platform architecture — the system diagram, request lifecycle, and where each governance feature runs.
DVARA Flightdeck — the platform console, tenant portal, and BYOK walkthrough.
API reference — interactive OpenAPI reference with a Try it panel on every operation.

The sidebar carries the rest — providers, routing, the individual governance features, deployment topologies, and the security checklist.

Who this is for​

Key capabilities​

Governance and guardrails​

Agentic governance​

FinOps and cost enforcement​

Multi-tenancy and access control​

Drop-in compatibility and routing​

Platform and operations​

Supported providers​

Architecture​

What's next​