Skip to main content
30-Day Free Trial · No Credit Card Required

Govern, audit, and route every LLM and MCP call.

Agents are reaching production faster than governance can keep up. DVARA is the AI governance platform that closes the gap — policy, immutable audit, and cost control on every LLM and MCP call.

dvara-gatewayzsh
$docker run -p 8080:8080 -e OPENAI_API_KEY=sk-... \
> ghcr.io/dvarahq/dvara/dvara-llm-gateway:1.0.0
DVARA Gateway started on :8080
LLM Proxy ready :8080 · MCP Proxy :8070 → dvara-mcp-gateway
$ curl -X POST localhost:8080/v1/chat/completions \
> -H "Authorization: Bearer $TOKEN" \
> -d '{"model":"gpt-4o","messages":[...]}'
policy: PASS pii-scan: CLEAN audit: SIGNED cost: $0.0032
{"choices":[{"message":{"content":"..."}}],"usage":{"total_tokens":214}}
14
LLM providers
$0
Per-request fees
7/10
OWASP LLM Top 10 covered
<2min
Zero to first request
Works With

One drop-in OpenAI-compatible endpoint in front of 14 LLM providers. Built on the open standards your platform team already runs.

OpenAI
Anthropic
Google Gemini
AWS Bedrock
Azure OpenAI
Mistral
Cohere
Groq
Ollama
Qwen
DeepSeek
Moonshot
ChatGLM
Grok
Built on Open Standards
OpenAI APIModel Context ProtocolOpenTelemetryPrometheusHelm + KubernetesOWASP LLM Top 10SOC 2 / HIPAA / GDPR evidenceOIDC / SAML 2.0
Capabilities dashboard illustration

Everything You Need to Govern AI at Scale

Policy-as-Code, immutable audit, PII redaction, and cost control on every plan; agentic AI and MCP governance on Growth — from your first API call to enterprise audit and compliance.

🛡
Policy-as-Code Engine

YAML DSL for model allowlists, token limits, MCP tool restrictions, budget-based downgrades, and time-of-day rules. Dry-run mode before activation. Draft → Active → Shadow lifecycle with versioning and rollback.

🚨
Guardrails & Safety

In-process PII detection (regex always-on + Phileas opt-in, 22 filter types) and prompt-injection / jailbreak detection on every call. Content filters, output-schema enforcement, and grounding/hallucination checks round out coverage of 7 of the 10 OWASP LLM Top 10 categories (see below). Optional ML classifiers (Lakera, Shield Gemini) and HTTP plugins.

🔏
Immutable Audit Trail

Every request HMAC-SHA256 signed with hash-chain integrity verification. Tamper detection built in. SOC2, HIPAA, and GDPR evidence packages generated on demand.

🤖
Agentic AI + MCP ProxyGrowth

MCP server registry and governed tool-call proxy with PII scanning on arguments and responses, human approval gates, and credential centralisation. Agent loop detection with auto-kill, session tracking across LLM and MCP calls, full session timeline.

🔐
Multi-Tenant + BYOK

Row-level tenant isolation on every query. Each tenant brings their own provider keys (AES-256-GCM at rest or vault-reference). Strict-BYOK rejects platform fallback when enabled. 3-level config hierarchy: global → tenant → API key.

💰
FinOps & Cost Control

Real-time per-request cost calculation. Budget caps per tenant, team, or API key. Automatic model downgrade at soft limit. Chargeback reports and cost forecasting.

🔄
Multi-Provider Unified API

Drop-in OpenAI-compatible endpoint. Route to OpenAI, Anthropic, Gemini, Azure, Bedrock, Ollama and 8 more with zero code changes. Streaming, structured outputs, and vision built-in.

Intelligent Routing

Round-robin, weighted, capability-aware, latency-aware, and cost-aware routing. Circuit breakers, failover, and retry with exponential backoff. Canary and A/B testing.

🔍
Full Observability

Prometheus metrics, OpenTelemetry distributed tracing, structured JSON logging. Reference Grafana dashboards in dvara-examples. Unified traces spanning LLM turns and MCP tool calls.

Route planning illustration

Two Governed Data Planes. One Control Plane.

The LLM Gateway sits at the edge for model traffic. The MCP Proxy sits inside your perimeter for tool calls. Both share the same policies, audit trail, and API keys.

LLM Data Plane
</>
Your App
Agent / API Client
{}
LLM Gateway
:8080
AI
Providers
OpenAI · Anthropic · Gemini
Control Plane
PoliciesAuthAuditBudgetsPII
MCP Data Plane
</>
Your App
Agent / IDE
{}
MCP Proxy
:8070
DB
Tools
DB · FS · Slack · APIs
Performance illustration

Built for Production Scale

Governance with near-zero performance tax. Every layer is optimised for throughput — lightweight concurrency, tamper-evident audit, multi-layer caching, and zero-copy streaming.

Single-digitms
added gateway latency
Policy checks run in single-digit milliseconds in typical configurations. Governance adds near-zero overhead to your AI calls; exact numbers vary by enabled guardrails and your provider mix.
Tamper-evident
audit chain
Every event HMAC-SHA256 signed and hash-chained on the response path. Chain continuity is verifiable end-to-end — any gap or alteration is detectable.
Active-active
multi-region
Run multiple regions active-active with automatic failover. Data residency is preserved during failover — EU tenant traffic stays in the EU.
Lightweight Concurrency
Virtual-thread concurrency handles thousands of concurrent connections per node with minimal memory per request. No thread pool tuning required.
Zero-Copy Streaming
SSE tokens flushed immediately with no buffering or rewriting. First token arrives the instant the provider emits it.
Tamper-Evident Audit
Every event HMAC-SHA256 signed and hash-chained on the response path. Append-only by application invariant; chain continuity verifiable end-to-end.
Multi-Layer Caching
In-memory distributed cache delivers sub-millisecond policy, auth, and config lookups. Automatic invalidation on every mutation — fleet-wide consistency in seconds, not minutes.
Automatic Failover
Per-provider health monitoring with instant failover on degradation. Requests reroute seamlessly — your users never notice.
Stateless Data Plane
Restart any pod without coordination — durable state lives in Postgres; the hot-path cache replicates across the cluster. Lightweight database footprint per pod; scale horizontally behind your HPA.

Governance Is the Architecture, Not a Feature

In every competitor, governance is bolted onto a proxy. In DVARA, governance is the design constraint that shaped every layer.

D1
Policy-as-Code with Dry-Run
Competitors: config flags or basic allowlists.

Full YAML DSL with version control, conflict detection at load time, and dry-run mode that evaluates the policy against a provided context before activation. Your auditors get proof that policies were tested before deployment.

D2
Immutable, HMAC-Signed Audit
Competitors: log to stdout or a mutable database.

Every record HMAC-SHA256 signed and hash-chained on the response path. Chain continuity is verifiable end-to-end — any gap or alteration is detectable. Append-only by application invariant, not by hopeful convention.

D3
Tamper-Evident Audit, Down to the MCP Tool Call
Competitors: MCP tool calls land in a mutable log, separate from LLM audit.

Every MCP tool call is HMAC-SHA256 signed and hash-chained into the same append-only audit as your LLM traffic — one tamper-evident record across model calls and tool calls, policy-checked and PII-scanned on a single control plane.

D4
Agentic AI Governance
Competitors: nothing meaningful for agent loops.

Full OpenTelemetry trace spanning LLM turns + MCP calls. Agent loop detection with auto-kill. Human-in-the-loop approval for high-risk tool calls. Session-level cost and compliance summary.

Compliance analysis illustration

Audit-Ready from Day One

Generate compliance evidence packages on demand. Every request is immutably logged, every policy decision recorded, every PII event tracked.

SOC 2 evidence
HIPAA evidence
GDPR evidence
The immutable audit log in DVARA Flightdeck — signed, hash-chained events filterable by event type, actor, and tenantThe immutable audit log in DVARA Flightdeck — signed, hash-chained events filterable by event type, actor, and tenant
The immutable audit log in DVARA Flightdeck — filter by event type, actor, or tenant, and export SOC 2, HIPAA, or GDPR evidence.
Immutable Audit Trail
HMAC-SHA256 signed records with hash-chain integrity verification. Tamper detection built in.
Scheduled Reports
Weekly or monthly compliance report delivery in PDF and JSON.
PII Detection Log
Every redaction event logged with action taken, pattern matched, and tenant context.
Policy Decision Records
Every request logs ALLOW/DENY with rule ID and reason. Full dry-run history.
Right to Erasure
GDPR tenant data purge pipeline with confirmation and audit trail.
Data Residency
Per-tenant region pinning. EU tenant traffic never leaves EU — enforced during failover.

OWASP LLM Top 10 Coverage

DVARA addresses 7 of the 10 OWASP categories at the gateway layer, with 2 partial-coverage items and 1 outside the gateway boundary. Every detection writes a signed audit event you can hand to your auditor.

LLM01
Prompt Injection
Full coverage

Pre-filter pipeline with regex patterns + ML classifier plugins (Lakera, Shield Gemini). Per-tenant custom injection patterns. Detected requests blocked or flagged with policy decision recorded.

LLM02
Sensitive Information Disclosure
Full coverage

In-process PII detection (regex always-on + Phileas opt-in, 22 filter types). BLOCK / REDACT / LOG actions per tenant. Response scanning before delivery. Reversible tokenisation for downstream re-identification.

LLM03
Supply Chain
Partial

BYOK credential model with AES-256-GCM at rest, vault integration (HashiCorp / AWS / Azure), rotation with grace period, and audit of every credential resolution. Provider model selection is operator-controlled, not auto-pulled.

LLM04
Data and Model Poisoning
Out of scope

Out of scope at the gateway layer — training-time concern handled by the model provider. DVARA enforces provider allowlists so unvetted models cannot be used.

LLM05
Improper Output Handling
Full coverage

Output schema validation (JSON Schema enforcement across providers), structured-output translation, post-filter pipeline for XSS / SQLi / SSRF patterns, and response sanitisation before delivery.

LLM06
Excessive Agency
Full coverage

MCP tool-call proxy with per-tool policy rules, human approval gates enforced at execution time, agent loop detection with auto-kill, session-level tool catalogue, and per-tool cost attribution.

LLM07
System Prompt Leakage
Full coverage

Guardrail patterns detect prompt-extraction attempts. Output-side scanning catches leaked system prompts before they leave the gateway. Per-tenant denylist for sensitive prompt fragments.

LLM08
Vector and Embedding Weaknesses
Partial

Semantic cache uses tenant-isolated vector stores so a poisoned cache entry from one tenant cannot affect another. Signed audit of every cache hit + miss; cache configurations are operator-managed.

LLM09
Misinformation
Full coverage

Grounding / hallucination detection via embedding-similarity scoring of model output against operator-supplied source documents. Below-threshold claims flagged, logged, or blocked per policy.

LLM10
Unbounded Consumption
Full coverage

Hard + soft budget caps per tenant / team / API key, automatic model downgrade on soft breach, rate limiting, max-tokens enforcement, context-window governance with auto-prune, and request size limits.

How DVARA Compares

Most gateways now cover the basics. DVARA's depth is the gap that remains — tamper-evident signed audit, argument-level MCP policy, PII on tool-call arguments, and agentic kill-switches. We grade DVARA against 10 gateways across 50 capabilities.

Coding assistant illustration

Developers Love It. Compliance Requires It.

Every stakeholder gets exactly what they need from the same platform.

Developer
"I'm calling five LLM providers with five different SDKs and no fallback."

One endpoint. Drop-in compatibility. Route to any provider with automatic failover. Add streaming, rate limiting, and observability in minutes.

Platform Engineer
"12 teams calling LLMs directly. Agents hitting databases with no governance."

Two governed data planes, one control plane. LLM Gateway at the edge, MCP Proxy inside the perimeter. Adopt incrementally — add tool governance when agents go to production.

Compliance Officer
"Auditors want proof of what AI systems were used and that no PII was leaked."

Immutable audit trail with tamper-evident signatures. PII detected and redacted before reaching providers. Compliance evidence packages generated on demand.

CISO
"Developers are calling GPT-4 from their laptops. We have no idea what data is leaving."

Hard policy enforcement at the gateway. PII blocked before it leaves your network. Role-based access to authorised models only. Every request logged and auditable.

CFO / FinOps
"AI spend is $40K/month and we can't tell which team is spending what."

Real-time cost tracking per request. Budget caps that automatically enforce. Monthly chargeback reports. Semantic caching eliminates repeated token spend on duplicate prompts.

CTO
"We need the governance layer we'd build in 18 months — but we need it now."

Two governed data planes — AI at the edge, tools inside the perimeter — managed as one. Production-grade performance. Enterprise-ready from day one.

Flat monthly pricing. No per-request fees.

Plans scale by monthly token volume, not by metering every call. Annual billing includes two months free.

Trial
Free
30 days. No credit card.
Solo
$59/mo
Flat. No per-request overage.
Recommended
Starter
$299/mo
Small teams in production.
Growth
$499/mo
Scale. Adds Agentic AI & MCP.

Compare full plans & what each tier ships →

Questions teams ask before they deploy.

Is DVARA open source?

No — DVARA is a commercial AI governance platform, not open source. You can self-host it in your own infrastructure or run it as a managed service, and every plan starts with a free 30-day trial — no credit card.

Can we self-host DVARA?

Yes. DVARA runs in your own infrastructure as container images, with bring-your-own-key credentials so provider keys and request data never leave your perimeter. A fully managed option is available if you would rather not operate it yourself.

How is DVARA different from LiteLLM?

LiteLLM is an open-source library and proxy for multi-provider routing. DVARA is an AI governance platform — Policy-as-Code, immutable signed audit, PII and injection guardrails, and argument-level MCP tool-call governance are built into the request path, not bolted on. See the full comparison →

What is the latency overhead?

Governance runs in-process on the request path — there is no extra network hop to a separate policy service. Policy checks add single-digit milliseconds in typical configurations, the tenant lookup resolves in under a millisecond, and optional caches add no overhead when disabled. The Flightdeck dashboard reports live P95 latency for your own traffic.

Which LLM providers are supported?

14 today — OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Mistral, Cohere, Groq, and more — behind one OpenAI-compatible endpoint, plus self-hosted models via Ollama.

Do we need to change our application code?

No. DVARA is drop-in OpenAI-compatible — point any OpenAI SDK or tool at the gateway URL and governance applies on the first call. Tool calls from AI agents are governed the same way through the MCP Proxy.

The evidence your auditor asks for — built into every request, not bolted on after.

Tamper-evident audit
Every LLM and MCP call is HMAC-signed and hash-chained into an append-only trail.
Compliance evidence on demand
Generate SOC 2, HIPAA, and GDPR reports from real traffic.
OWASP LLM Top 10
7 of 10 risks covered — the remaining gaps documented in the open.
Your perimeter, your keys
Self-host with bring-your-own-key. Credentials and data never leave your control.

Preparing for a SOC 2 audit? See the evidence DVARA generates →

See It in Action.
Start Your Free Trial Today.

Full access for 30 days. No credit card. Deploy in under 2 minutes and see governance working on your first request.