The AI Governance Platform for LLM and MCP Traffic

Works With

One drop-in OpenAI-compatible endpoint in front of 14 LLM providers. Built on the open standards your platform team already runs.

OpenAI

Anthropic

Google Gemini

AWS Bedrock

Azure OpenAI

Mistral

Cohere

Groq

Ollama

Qwen

DeepSeek

Moonshot

ChatGLM

Grok

Drops Into Your Framework

OpenAI SDK →LangChain →Spring AI →Vercel AI SDK →Pydantic AI →

Built on Open Standards

OpenAI APIModel Context ProtocolOpenTelemetryPrometheusHelm + KubernetesOWASP LLM Top 10SOC 2 / HIPAA / GDPR evidenceOIDC / SAML 2.0

Capabilities

Everything You Need to Govern AI at Scale

Policy-as-Code, immutable audit, PII redaction, and cost control on every plan; agentic AI and MCP governance on Growth — from your first API call to enterprise audit and compliance.

🛡

Policy-as-Code Engine

YAML DSL for model allowlists, token limits, MCP tool restrictions, budget-based downgrades, and time-of-day rules. Dry-run mode before activation. Draft → Active → Shadow lifecycle with versioning and rollback.

🚨

Guardrails & Safety

In-process PII detection (regex always-on + Phileas opt-in, 22 filter types) and prompt-injection / jailbreak detection on every call. Content filters, output-schema enforcement, and grounding/hallucination checks round out coverage of 7 of the 10 OWASP LLM Top 10 categories (see below). Optional ML classifiers (Lakera, Shield Gemini) and HTTP plugins.

🔏

Immutable Audit Trail

Every request HMAC-SHA256 signed with hash-chain integrity verification. Tamper detection built in. SOC2, HIPAA, and GDPR evidence packages generated on demand.

🤖

Agentic AI + MCP ProxyGrowth

MCP server registry and governed tool-call proxy with PII scanning on arguments and responses, human approval gates, and credential centralisation. Agent loop detection with auto-kill, session tracking across LLM and MCP calls, full session timeline.

🔐

Multi-Tenant + BYOK

Row-level tenant isolation on every query. Each tenant brings their own provider keys (AES-256-GCM at rest or vault-reference). Strict-BYOK rejects platform fallback when enabled. 3-level config hierarchy: global → tenant → API key.

💰

FinOps & Cost Control

Real-time per-request cost calculation. Budget caps per tenant, team, or API key. Automatic model downgrade at soft limit. Chargeback reports and cost forecasting.

🔄

Multi-Provider Unified API

Drop-in OpenAI-compatible endpoint. Route to OpenAI, Anthropic, Gemini, Azure, Bedrock, Ollama and 8 more with zero code changes. Streaming, structured outputs, and vision built-in.

⚡

Intelligent Routing

Round-robin, weighted, capability-aware, latency-aware, and cost-aware routing. Circuit breakers, failover, and retry with exponential backoff. Canary and A/B testing.

🔍

Full Observability

Prometheus metrics, OpenTelemetry distributed tracing, structured JSON logging. Reference Grafana dashboards in dvara-examples. Unified traces spanning LLM turns and MCP tool calls.

Architecture

Two Governed Data Planes. One Control Plane.

The LLM Gateway sits at the edge for model traffic. The MCP Proxy sits inside your perimeter for tool calls. Both share the same policies, audit trail, and API keys.

LLM Data Plane

</>

Your App

Agent / API Client

{ }

LLM Gateway

:8080

Providers

OpenAI · Anthropic · Gemini

Control Plane

PoliciesAuthAuditBudgetsPII

MCP Data Plane

</>

Your App

Agent / IDE

{ }

MCP Proxy

:8070

Tools

DB · FS · Slack · APIs

Performance

Built for Production Scale

Governance with near-zero performance tax. Every layer is optimised for throughput — lightweight concurrency, tamper-evident audit, multi-layer caching, and zero-copy streaming.

Single-digitms

added gateway latency

Policy checks run in single-digit milliseconds in typical configurations. Governance adds near-zero overhead to your AI calls; exact numbers vary by enabled guardrails and your provider mix.

Tamper-evident

audit chain

Every event HMAC-SHA256 signed and hash-chained on the response path. Chain continuity is verifiable end-to-end — any gap or alteration is detectable.

Active-active

multi-region

Run multiple regions active-active with automatic failover. Data residency is preserved during failover — EU tenant traffic stays in the EU.

Lightweight Concurrency

Virtual-thread concurrency handles thousands of concurrent connections per node with minimal memory per request. No thread pool tuning required.

Zero-Copy Streaming

SSE tokens flushed immediately with no buffering or rewriting. First token arrives the instant the provider emits it.

Tamper-Evident Audit

Every event HMAC-SHA256 signed and hash-chained on the response path. Append-only by application invariant; chain continuity verifiable end-to-end.

Multi-Layer Caching

In-memory distributed cache delivers sub-millisecond policy, auth, and config lookups. Automatic invalidation on every mutation — fleet-wide consistency in seconds, not minutes.

Automatic Failover

Per-provider health monitoring with instant failover on degradation. Requests reroute seamlessly — your users never notice.

Stateless Data Plane

Restart any pod without coordination — durable state lives in Postgres; the hot-path cache replicates across the cluster. Lightweight database footprint per pod; scale horizontally behind your HPA.

Why DVARA

Governance Is the Architecture, Not a Feature

In every competitor, governance is bolted onto a proxy. In DVARA, governance is the design constraint that shaped every layer.

Policy-as-Code with Dry-Run

Competitors: config flags or basic allowlists.

Full YAML DSL with version control, conflict detection at load time, and dry-run mode that evaluates the policy against a provided context before activation. Your auditors get proof that policies were tested before deployment.

Immutable, HMAC-Signed Audit

Competitors: log to stdout or a mutable database.

Every record HMAC-SHA256 signed and hash-chained on the response path. Chain continuity is verifiable end-to-end — any gap or alteration is detectable. Append-only by application invariant, not by hopeful convention.

Tamper-Evident Audit, Down to the MCP Tool Call

Competitors: MCP tool calls land in a mutable log, separate from LLM audit.

Every MCP tool call is HMAC-SHA256 signed and hash-chained into the same append-only audit as your LLM traffic — one tamper-evident record across model calls and tool calls, policy-checked and PII-scanned on a single control plane.

Agentic AI Governance

Competitors: nothing meaningful for agent loops.

Full OpenTelemetry trace spanning LLM turns + MCP calls. Agent loop detection with auto-kill. Human-in-the-loop approval for high-risk tool calls. Session-level cost and compliance summary.

Compliance

Audit-Ready from Day One

Generate compliance evidence packages on demand. Every request is immutably logged, every policy decision recorded, every PII event tracked.

SOC 2 evidence

HIPAA evidence

GDPR evidence

The immutable audit log in DVARA Flightdeck — signed, hash-chained events filterable by event type, actor, and tenant — The immutable audit log in DVARA Flightdeck — filter by event type, actor, or tenant, and export SOC 2, HIPAA, or GDPR evidence.

Immutable Audit Trail

HMAC-SHA256 signed records with hash-chain integrity verification. Tamper detection built in.

Scheduled Reports

Weekly or monthly compliance report delivery in PDF and JSON.

PII Detection Log

Every redaction event logged with action taken, pattern matched, and tenant context.

Policy Decision Records

Every request logs ALLOW/DENY with rule ID and reason. Full dry-run history.

Right to Erasure

GDPR tenant data purge pipeline with confirmation and audit trail.

Data Residency

Per-tenant region pinning. EU tenant traffic never leaves EU — enforced during failover.

Security Coverage

OWASP LLM Top 10 Coverage

DVARA addresses 7 of the 10 OWASP categories at the gateway layer, with 2 partial-coverage items and 1 outside the gateway boundary. Every detection writes a signed audit event you can hand to your auditor.

LLM01

Prompt Injection

Full coverage

Pre-filter pipeline with regex patterns + ML classifier plugins (Lakera, Shield Gemini). Per-tenant custom injection patterns. Detected requests blocked or flagged with policy decision recorded.

LLM02

Sensitive Information Disclosure

Full coverage

In-process PII detection (regex always-on + Phileas opt-in, 22 filter types). BLOCK / REDACT / LOG actions per tenant. Response scanning before delivery. Reversible tokenisation for downstream re-identification.

LLM03

Supply Chain

Partial

BYOK credential model with AES-256-GCM at rest, vault integration (HashiCorp / AWS / Azure), rotation with grace period, and audit of every credential resolution. Provider model selection is operator-controlled, not auto-pulled.

LLM04

Data and Model Poisoning

Out of scope

Out of scope at the gateway layer — training-time concern handled by the model provider. DVARA enforces provider allowlists so unvetted models cannot be used.

LLM05

Improper Output Handling

Full coverage

Output schema validation (JSON Schema enforcement across providers), structured-output translation, post-filter pipeline for XSS / SQLi / SSRF patterns, and response sanitisation before delivery.

LLM06

Excessive Agency

Full coverage

MCP tool-call proxy with per-tool policy rules, human approval gates enforced at execution time, agent loop detection with auto-kill, session-level tool catalogue, and per-tool cost attribution.

LLM07

System Prompt Leakage

Full coverage

Guardrail patterns detect prompt-extraction attempts. Output-side scanning catches leaked system prompts before they leave the gateway. Per-tenant denylist for sensitive prompt fragments.

LLM08

Vector and Embedding Weaknesses

Partial

Semantic cache uses tenant-isolated vector stores so a poisoned cache entry from one tenant cannot affect another. Signed audit of every cache hit + miss; cache configurations are operator-managed.

LLM09

Misinformation

Full coverage

Grounding / hallucination detection via embedding-similarity scoring of model output against operator-supplied source documents. Below-threshold claims flagged, logged, or blocked per policy.

LLM10

Unbounded Consumption

Full coverage

Hard + soft budget caps per tenant / team / API key, automatic model downgrade on soft breach, rate limiting, max-tokens enforcement, context-window governance with auto-prune, and request size limits.

Reference: OWASP LLM Top 10 (genai.owasp.org/llm-top-10)

Comparison

How DVARA Compares

Most gateways now cover the basics. DVARA's depth is the gap that remains — tamper-evident signed audit, argument-level MCP policy, PII on tool-call arguments, and agentic kill-switches. We grade DVARA against 10 gateways across 50 capabilities.

See how DVARA compares across 10 gateways →

Built For Your Team

Developers Love It. Compliance Requires It.

Every stakeholder gets exactly what they need from the same platform.

Developer

"I'm calling five LLM providers with five different SDKs and no fallback."

One endpoint. Drop-in compatibility. Route to any provider with automatic failover. Add streaming, rate limiting, and observability in minutes.

Platform Engineer

"12 teams calling LLMs directly. Agents hitting databases with no governance."

Two governed data planes, one control plane. LLM Gateway at the edge, MCP Proxy inside the perimeter. Adopt incrementally — add tool governance when agents go to production.

Compliance Officer

"Auditors want proof of what AI systems were used and that no PII was leaked."

Immutable audit trail with tamper-evident signatures. PII detected and redacted before reaching providers. Compliance evidence packages generated on demand.

CISO

"Developers are calling GPT-4 from their laptops. We have no idea what data is leaving."

Hard policy enforcement at the gateway. PII blocked before it leaves your network. Role-based access to authorised models only. Every request logged and auditable.

CFO / FinOps

"AI spend is $40K/month and we can't tell which team is spending what."

Real-time cost tracking per request. Budget caps that automatically enforce. Monthly chargeback reports. Semantic caching eliminates repeated token spend on duplicate prompts.

CTO

"We need the governance layer we'd build in 18 months — but we need it now."

Two governed data planes — AI at the edge, tools inside the perimeter — managed as one. Production-grade performance. Enterprise-ready from day one.

Pricing

Flat monthly pricing. No per-request fees.

Plans scale by monthly token volume, not by metering every call. Annual billing includes two months free.

Trial

Free

30 days. No credit card.

Solo

$59/mo

Flat. No per-request overage.

Recommended

Starter

$299/mo

Small teams in production.

Growth

$499/mo

Scale. Adds Agentic AI & MCP.

Compare full plans & what each tier ships →

FAQ

Questions teams ask before they deploy.

Is DVARA open source?

No — DVARA is a commercial AI governance platform, not open source. You can self-host it in your own infrastructure or run it as a managed service, and every plan starts with a free 30-day trial — no credit card.

Can we self-host DVARA?

Yes. DVARA runs in your own infrastructure as container images, with bring-your-own-key credentials so provider keys and request data never leave your perimeter. A fully managed option is available if you would rather not operate it yourself.

How is DVARA different from LiteLLM?

LiteLLM is an open-source library and proxy for multi-provider routing. DVARA is an AI governance platform — Policy-as-Code, immutable signed audit, PII and injection guardrails, and argument-level MCP tool-call governance are built into the request path, not bolted on. See the full comparison →

What is the latency overhead?

Governance runs in-process on the request path — there is no extra network hop to a separate policy service. Policy checks add single-digit milliseconds in typical configurations, the tenant lookup resolves in under a millisecond, and optional caches add no overhead when disabled. The Flightdeck dashboard reports live P95 latency for your own traffic.

Which LLM providers are supported?

14 today — OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Mistral, Cohere, Groq, and more — behind one OpenAI-compatible endpoint, plus self-hosted models via Ollama.

Do we need to change our application code?

No. DVARA is drop-in OpenAI-compatible — point any OpenAI SDK or tool at the gateway URL and governance applies on the first call. Tool calls from AI agents are governed the same way through the MCP Proxy.

Built for regulated teams

The evidence your auditor asks for — built into every request, not bolted on after.

Tamper-evident audit

Every LLM and MCP call is HMAC-signed and hash-chained into an append-only trail.

Compliance evidence on demand

Generate SOC 2, HIPAA, and GDPR reports from real traffic.

OWASP LLM Top 10

7 of 10 risks covered — the remaining gaps documented in the open.

Your perimeter, your keys

Self-host with bring-your-own-key. Credentials and data never leave your control.

Preparing for a SOC 2 audit? See the evidence DVARA generates →

See It in Action.
Start Your Free Trial Today.

Full access for 30 days. No credit card. Deploy in under 2 minutes and see governance working on your first request.

Start Free Trial Talk to Sales

Everything You Need to Govern AI at Scale

Two Governed Data Planes. One Control Plane.

Built for Production Scale

Governance Is the Architecture, Not a Feature

Audit-Ready from Day One

OWASP LLM Top 10 Coverage

How DVARA Compares

Developers Love It. Compliance Requires It.

Flat monthly pricing. No per-request fees.

Questions teams ask before they deploy.

Is DVARA open source?

Can we self-host DVARA?

How is DVARA different from LiteLLM?

What is the latency overhead?

Which LLM providers are supported?

Do we need to change our application code?

See It in Action.Start Your Free Trial Today.

See It in Action.
Start Your Free Trial Today.