One drop-in OpenAI-compatible endpoint in front of 14 LLM providers. Built on the open standards your platform team already runs.
Everything You Need to Govern AI at Scale
Policy-as-Code, immutable audit, PII redaction, and cost control on every plan; agentic AI and MCP governance on Growth — from your first API call to enterprise audit and compliance.
YAML DSL for model allowlists, token limits, MCP tool restrictions, budget-based downgrades, and time-of-day rules. Dry-run mode before activation. Draft → Active → Shadow lifecycle with versioning and rollback.
In-process PII detection (regex always-on + Phileas opt-in, 22 filter types) and prompt-injection / jailbreak detection on every call. Content filters, output-schema enforcement, and grounding/hallucination checks round out coverage of 7 of the 10 OWASP LLM Top 10 categories (see below). Optional ML classifiers (Lakera, Shield Gemini) and HTTP plugins.
Every request HMAC-SHA256 signed with hash-chain integrity verification. Tamper detection built in. SOC2, HIPAA, and GDPR evidence packages generated on demand.
MCP server registry and governed tool-call proxy with PII scanning on arguments and responses, human approval gates, and credential centralisation. Agent loop detection with auto-kill, session tracking across LLM and MCP calls, full session timeline.
Row-level tenant isolation on every query. Each tenant brings their own provider keys (AES-256-GCM at rest or vault-reference). Strict-BYOK rejects platform fallback when enabled. 3-level config hierarchy: global → tenant → API key.
Real-time per-request cost calculation. Budget caps per tenant, team, or API key. Automatic model downgrade at soft limit. Chargeback reports and cost forecasting.
Drop-in OpenAI-compatible endpoint. Route to OpenAI, Anthropic, Gemini, Azure, Bedrock, Ollama and 8 more with zero code changes. Streaming, structured outputs, and vision built-in.
Round-robin, weighted, capability-aware, latency-aware, and cost-aware routing. Circuit breakers, failover, and retry with exponential backoff. Canary and A/B testing.
Prometheus metrics, OpenTelemetry distributed tracing, structured JSON logging. Reference Grafana dashboards in dvara-examples. Unified traces spanning LLM turns and MCP tool calls.
Two Governed Data Planes. One Control Plane.
The LLM Gateway sits at the edge for model traffic. The MCP Proxy sits inside your perimeter for tool calls. Both share the same policies, audit trail, and API keys.
Built for Production Scale
Governance with near-zero performance tax. Every layer is optimised for throughput — lightweight concurrency, tamper-evident audit, multi-layer caching, and zero-copy streaming.
Governance Is the Architecture, Not a Feature
In every competitor, governance is bolted onto a proxy. In DVARA, governance is the design constraint that shaped every layer.
Full YAML DSL with version control, conflict detection at load time, and dry-run mode that evaluates the policy against a provided context before activation. Your auditors get proof that policies were tested before deployment.
Every record HMAC-SHA256 signed and hash-chained on the response path. Chain continuity is verifiable end-to-end — any gap or alteration is detectable. Append-only by application invariant, not by hopeful convention.
Every MCP tool call is HMAC-SHA256 signed and hash-chained into the same append-only audit as your LLM traffic — one tamper-evident record across model calls and tool calls, policy-checked and PII-scanned on a single control plane.
Full OpenTelemetry trace spanning LLM turns + MCP calls. Agent loop detection with auto-kill. Human-in-the-loop approval for high-risk tool calls. Session-level cost and compliance summary.
Audit-Ready from Day One
Generate compliance evidence packages on demand. Every request is immutably logged, every policy decision recorded, every PII event tracked.


OWASP LLM Top 10 Coverage
DVARA addresses 7 of the 10 OWASP categories at the gateway layer, with 2 partial-coverage items and 1 outside the gateway boundary. Every detection writes a signed audit event you can hand to your auditor.
Pre-filter pipeline with regex patterns + ML classifier plugins (Lakera, Shield Gemini). Per-tenant custom injection patterns. Detected requests blocked or flagged with policy decision recorded.
In-process PII detection (regex always-on + Phileas opt-in, 22 filter types). BLOCK / REDACT / LOG actions per tenant. Response scanning before delivery. Reversible tokenisation for downstream re-identification.
BYOK credential model with AES-256-GCM at rest, vault integration (HashiCorp / AWS / Azure), rotation with grace period, and audit of every credential resolution. Provider model selection is operator-controlled, not auto-pulled.
Out of scope at the gateway layer — training-time concern handled by the model provider. DVARA enforces provider allowlists so unvetted models cannot be used.
Output schema validation (JSON Schema enforcement across providers), structured-output translation, post-filter pipeline for XSS / SQLi / SSRF patterns, and response sanitisation before delivery.
MCP tool-call proxy with per-tool policy rules, human approval gates enforced at execution time, agent loop detection with auto-kill, session-level tool catalogue, and per-tool cost attribution.
Guardrail patterns detect prompt-extraction attempts. Output-side scanning catches leaked system prompts before they leave the gateway. Per-tenant denylist for sensitive prompt fragments.
Semantic cache uses tenant-isolated vector stores so a poisoned cache entry from one tenant cannot affect another. Signed audit of every cache hit + miss; cache configurations are operator-managed.
Grounding / hallucination detection via embedding-similarity scoring of model output against operator-supplied source documents. Below-threshold claims flagged, logged, or blocked per policy.
Hard + soft budget caps per tenant / team / API key, automatic model downgrade on soft breach, rate limiting, max-tokens enforcement, context-window governance with auto-prune, and request size limits.
How DVARA Compares
Most gateways now cover the basics. DVARA's depth is the gap that remains — tamper-evident signed audit, argument-level MCP policy, PII on tool-call arguments, and agentic kill-switches. We grade DVARA against 10 gateways across 50 capabilities.
Developers Love It. Compliance Requires It.
Every stakeholder gets exactly what they need from the same platform.
One endpoint. Drop-in compatibility. Route to any provider with automatic failover. Add streaming, rate limiting, and observability in minutes.
Two governed data planes, one control plane. LLM Gateway at the edge, MCP Proxy inside the perimeter. Adopt incrementally — add tool governance when agents go to production.
Immutable audit trail with tamper-evident signatures. PII detected and redacted before reaching providers. Compliance evidence packages generated on demand.
Hard policy enforcement at the gateway. PII blocked before it leaves your network. Role-based access to authorised models only. Every request logged and auditable.
Real-time cost tracking per request. Budget caps that automatically enforce. Monthly chargeback reports. Semantic caching eliminates repeated token spend on duplicate prompts.
Two governed data planes — AI at the edge, tools inside the perimeter — managed as one. Production-grade performance. Enterprise-ready from day one.
Flat monthly pricing. No per-request fees.
Plans scale by monthly token volume, not by metering every call. Annual billing includes two months free.
Questions teams ask before they deploy.
Is DVARA open source?
No — DVARA is a commercial AI governance platform, not open source. You can self-host it in your own infrastructure or run it as a managed service, and every plan starts with a free 30-day trial — no credit card.
Can we self-host DVARA?
Yes. DVARA runs in your own infrastructure as container images, with bring-your-own-key credentials so provider keys and request data never leave your perimeter. A fully managed option is available if you would rather not operate it yourself.
How is DVARA different from LiteLLM?
LiteLLM is an open-source library and proxy for multi-provider routing. DVARA is an AI governance platform — Policy-as-Code, immutable signed audit, PII and injection guardrails, and argument-level MCP tool-call governance are built into the request path, not bolted on. See the full comparison →
What is the latency overhead?
Governance runs in-process on the request path — there is no extra network hop to a separate policy service. Policy checks add single-digit milliseconds in typical configurations, the tenant lookup resolves in under a millisecond, and optional caches add no overhead when disabled. The Flightdeck dashboard reports live P95 latency for your own traffic.
Which LLM providers are supported?
14 today — OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Mistral, Cohere, Groq, and more — behind one OpenAI-compatible endpoint, plus self-hosted models via Ollama.
Do we need to change our application code?
No. DVARA is drop-in OpenAI-compatible — point any OpenAI SDK or tool at the gateway URL and governance applies on the first call. Tool calls from AI agents are governed the same way through the MCP Proxy.
The evidence your auditor asks for — built into every request, not bolted on after.
Preparing for a SOC 2 audit? See the evidence DVARA generates →
See It in Action.
Start Your Free Trial Today.
Full access for 30 days. No credit card. Deploy in under 2 minutes and see governance working on your first request.