What is Dvara?
Dvara is an AI gateway that governs both LLM calls and MCP tool calls from a single control plane. Point any OpenAI SDK at Dvara to route requests across OpenAI, Anthropic, Google Gemini, AWS Bedrock, or a local Ollama instance — with automatic failover, policy enforcement, PII protection, cost controls, and a full audit trail. For agentic workloads, the MCP Proxy enforces the same policies on every tool call before it reaches your internal MCP servers.
Key Value Propositions
- Unified API — One OpenAI-compatible endpoint for all providers. Switch models by changing a single field; no SDK or code changes required.
- Automatic Failover — When a provider fails, the gateway retries on a healthy alternative with health tracking and circuit-breaker logic.
- Rate Limiting — Per-key request/token budgets enforced at the gateway layer, with optional Redis-backed distributed limiting.
- Structured Outputs — Send
response_formatonce; the gateway translates to each provider's native mechanism (tool-use rewrite for Anthropic,responseSchemafor Gemini, etc.). - Routing Control — Model-prefix, round-robin, weighted, latency-aware (EWMA), cost-aware, canary A/B testing, shadow traffic, and geo-aware routing with versioned configuration and live rollback.
- MCP Tool Governance — A dedicated proxy for MCP server calls with policy evaluation, PII scanning on arguments and responses, approval gates, loop detection, and session tracking.
- Policy Engine — YAML DSL for deny/warn rules across models, tools, budgets, time-of-day, data residency, and MCP operations. Shadow mode for safe rollout.
- PII Detection & Redaction — 14 built-in regex patterns with Luhn/DEA checksum validation. Per-tenant BLOCK / REDACT / LOG actions with tokenized redaction and de-tokenization.
- Guardrails — Injection detection (32 patterns), content filtering (profanity, violence, competitor mentions), output sanitization (XSS, SQLi, SSRF), and system prompt leak detection.
- FinOps & Cost Management — Per-model pricing, real-time cost attribution, budget caps (daily/weekly/monthly) with soft and hard limits, automatic model downgrade, cost forecasting, anomaly detection, and chargeback reports (PDF/CSV).
- Immutable Audit Trail — Every event is HMAC-SHA256 signed and hash-chained. Compliance report generation for SOC2 Type II, HIPAA, and GDPR.
- Security — OIDC/JWT authentication, fine-grained RBAC (37 permissions, 5 built-in roles), per-provider mTLS, IP allowlist/denylist, and vault integration (HashiCorp, AWS Secrets Manager, Azure Key Vault).
- Semantic Cache — Embedding-based similarity caching to reduce redundant LLM calls and cost.
- Admin Dashboard — Web console for real-time traffic monitoring, provider health, tenant/route/policy management, audit log exploration, and session management.
- Production Persistence — PostgreSQL (31 tables, Flyway migrations) and Redis (distributed caching, rate limiting) for production durability.
Supported Providers
| Provider | Model Prefix | Example Models | Activation |
|---|---|---|---|
| OpenAI | gpt | gpt-4o, gpt-4o-mini | OPENAI_API_KEY |
| Anthropic | claude | claude-sonnet-4-5, claude-3-haiku | ANTHROPIC_API_KEY |
| Gemini | gemini | gemini-2.0-flash, gemini-1.5-pro | GEMINI_API_KEY |
| Bedrock | bedrock/ | bedrock/anthropic.claude-3-sonnet-... | BEDROCK_ENABLED=true |
| Ollama | ollama/ | ollama/llama3.2, ollama/mistral | OLLAMA_ENABLED=true |
| Mock | mock/ | mock/test-model | MOCK_PROVIDER_ENABLED=true |
Providers register automatically when their credentials are set. Missing credentials means the provider is simply not available — no errors at startup.
Architecture Overview
Two data paths, one control plane:
- LLM Gateway (port 8080) — handles
/v1/*traffic. Applications send OpenAI-format requests; the gateway selects a provider based on themodelfield, applies the governance filter chain (policy, PII, guardrails, budget), and returns a normalized response. - MCP Proxy (port 8070) — handles
/mcp/*traffic. AI agents send tool calls through the proxy; it enforces policy, scans arguments for PII, checks approval gates, and forwards to internal MCP servers. The agent never holds MCP server credentials. - Admin UI (port 8090) — Thymeleaf + HTMX dashboard for configuration, monitoring, audit, and session management. Talks to the gateway's admin API.
- Control Plane — embedded in
gateway-server. Distributes configuration to data plane instances via HTTP polling. Manages the fleet registry, audit trail, and all admin APIs.
Both data planes share the same auth tokens, policy DSL, audit infrastructure, PII engine, and budget rules. They are deployed and scaled independently but governed uniformly.
Technology Stack
- Java 21 with Project Loom virtual threads for high-concurrency blocking I/O
- Spring Boot 4 (Spring MVC, not WebFlux)
- Maven multi-module build (17 modules)
- RestClient for all upstream HTTP calls (providers, config sync, webhooks)
- PostgreSQL for enterprise persistence (31 tables, Flyway migrations)
- Redis for distributed caching and rate limiting (enterprise)
- Micrometer + OpenTelemetry for metrics and distributed tracing
What's Next
- Quickstart — Run Dvara locally in under 5 minutes
- Configuration — Environment variables and YAML reference
- Providers — Provider-specific setup and capabilities
- Admin UI — Dashboard walkthrough
- Docker Quickstart — Container deployment
- Kubernetes & Helm — Production deployment