Production Security Checklist
This guide covers the essential security hardening steps before deploying Dvara Gateway to production. Each item includes the relevant configuration property and recommended action.
1. Audit HMAC Secret
The audit subsystem signs every event with HMAC-SHA256 and links events into a hash chain, so any modification or deletion of an event is detectable. The default secret (default-dev-secret-change-in-production) is intended for development only.
Action: Generate a cryptographically strong random secret (minimum 32 bytes) and set it via environment variable.
# Generate a 256-bit random secret
openssl rand -base64 32
# Set in your deployment
export GATEWAY_AUDIT_HMAC_SECRET="<generated-secret>"
Property: gateway.audit.hmac-secret
An attacker who knows the default secret can forge audit events and break chain integrity verification, undermining compliance reports (SOC2, HIPAA, GDPR).
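To make the risk concrete, here is a minimal sketch of an HMAC-SHA256 hash chain. The record layout (`event`, `prev`, `sig`), the genesis value, and the function names are assumptions for illustration; Dvara's actual audit format is not described in this guide. The sketch shows why anyone who holds the secret can produce a chain that verifies.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical "previous signature" for the first event


def sign_event(secret: bytes, event: dict, prev_sig: str) -> str:
    """Hex HMAC-SHA256 over the previous signature plus the canonical event JSON."""
    payload = prev_sig.encode() + json.dumps(event, sort_keys=True).encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def build_chain(secret: bytes, events: list) -> list:
    """Sign each event and link it to the signature of the one before it."""
    chain, prev = [], GENESIS
    for event in events:
        sig = sign_event(secret, event, prev)
        chain.append({"event": event, "prev": prev, "sig": sig})
        prev = sig
    return chain


def verify_chain(secret: bytes, chain: list) -> bool:
    """Recompute every signature; any edited or reordered record fails."""
    prev = GENESIS
    for record in chain:
        expected = sign_event(secret, record["event"], prev)
        if record["prev"] != prev or not hmac.compare_digest(expected, record["sig"]):
            return False
        prev = record["sig"]
    return True
```

A chain built with the well-known default secret verifies for anyone who also knows that default, which is why the secret must be replaced before go-live.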
2. Enterprise License Key
The enterprise license key unlocks all enterprise modules (policy engine, PII detection, guardrails, FinOps, agentic governance, semantic cache, RBAC, and more). Without it, the gateway runs unlicensed and the enterprise modules fall back to no-op defaults.
Action: Obtain a signed JWT license key and set it at startup.
export GATEWAY_ENTERPRISE_LICENSE_KEY="<signed-jwt-token>"
Property: gateway.enterprise.license-key
3. OIDC/JWT Authentication
By default, all admin endpoints are unauthenticated. Enabling security requires an OIDC-compliant identity provider (Keycloak, Auth0, Okta, Azure AD, etc.).
Action: Enable security and configure your OIDC issuer.
gateway:
  security:
    enabled: true
    oidc:
      issuer-uri: https://your-idp.example.com/realms/dvara
      audience: dvara-gateway
      role-claim: realm_access.roles # adjust for your IdP
      tenant-claim: tenant_id
    rbac:
      enabled: true # enforce URL-pattern RBAC
    session:
      timeout-seconds: 3600
Environment variables:
GATEWAY_SECURITY_ENABLED=true
GATEWAY_OIDC_ISSUER_URI=https://your-idp.example.com/realms/dvara
GATEWAY_OIDC_AUDIENCE=dvara-gateway
If security is left disabled, anyone with network access can read and modify tenants, routes, policies, budgets, and audit logs.
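The role-claim value above reads like a dotted path into the token payload (Keycloak, for example, nests realm roles under `realm_access.roles`). The following sketch shows how such a path could be resolved against already-verified claims. The function names are hypothetical, and signature, issuer, and audience validation are deliberately out of scope here.

```python
from typing import Any


def resolve_claim(claims: dict, path: str) -> Any:
    """Walk a dotted claim path; return None if any segment is missing."""
    node: Any = claims
    for segment in path.split("."):
        if not isinstance(node, dict) or segment not in node:
            return None
        node = node[segment]
    return node


def extract_roles(claims: dict, role_claim: str = "realm_access.roles") -> list:
    """Return the role list found at role_claim, or an empty list."""
    roles = resolve_claim(claims, role_claim)
    return list(roles) if isinstance(roles, list) else []
```

If your IdP places roles elsewhere (a flat `roles` claim, for instance), the same lookup works with a one-segment path, which is what "adjust for your IdP" amounts to.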
4. Vault-Backed Secret Management
Storing API keys and credentials in environment variables is acceptable for development but not for production. Dvara supports three vault backends.
Action: Configure one of the supported vault backends.
HashiCorp Vault
export GATEWAY_VAULT_BACKEND=hashicorp
export VAULT_ADDR=https://vault.internal:8200
export VAULT_AUTH_METHOD=approle
export VAULT_ROLE_ID=<role-id>
export VAULT_SECRET_ID=<secret-id>
export VAULT_SECRET_PATH=secret/data/dvara
AWS Secrets Manager
export GATEWAY_VAULT_BACKEND=aws-secrets-manager
export AWS_VAULT_REGION=us-east-1
export AWS_SECRET_NAME=dvara/provider-credentials
Azure Key Vault
export GATEWAY_VAULT_BACKEND=azure-key-vault
export AZURE_VAULT_URL=https://dvara-kv.vault.azure.net
export AZURE_CLIENT_ID=<client-id>
export AZURE_CLIENT_SECRET=<client-secret>
export AZURE_TENANT_ID=<tenant-id>
Property: gateway.vault.backend
Without a vault backend, provider API keys live in plaintext environment variables, which may be exposed via process inspection, container metadata, or log leaks.
5. IP Access Control
Restrict which IP addresses and CIDR ranges can access the gateway.
Action: Enable IP access control and configure allowlists/denylists.
gateway:
  ip-access:
    enabled: true
    scope: all # 'all' or 'data-plane'
    global-allowlist:
      - 10.0.0.0/8
      - 172.16.0.0/12
    global-denylist:
      - 0.0.0.0/0 # deny all not in allowlist
Per-tenant overrides are configured via tenant metadata:
ip-access.allowlist (comma-separated CIDRs)
ip-access.denylist (comma-separated CIDRs)
Environment variable: GATEWAY_IP_ACCESS_ENABLED=true
Without IP access control, the gateway is accessible from any IP address, increasing exposure to brute-force and reconnaissance attacks.
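The example configuration implies that an allowlist match takes precedence over the catch-all denylist entry. Under that assumption (it is inferred from the "deny all not in allowlist" comment, not from a documented evaluation order), the decision can be sketched as follows. The function name is hypothetical.

```python
import ipaddress


def is_allowed(client_ip: str, allowlist: list, denylist: list) -> bool:
    """Assumed semantics: an allowlist match wins, otherwise a denylist match blocks."""
    ip = ipaddress.ip_address(client_ip)

    def matches(cidrs: list) -> bool:
        # Membership is False when address and network IP versions differ.
        return any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)

    if matches(allowlist):
        return True
    return not matches(denylist)
```

With the allowlist and denylist shown above, `is_allowed("10.1.2.3", ...)` passes while a public address such as `8.8.8.8` is rejected. Verify the real precedence in a staging environment before relying on it.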
6. TLS 1.3 Enforcement
Dvara can enforce TLS 1.3 on all outbound connections to LLM providers and optionally apply mTLS with client certificates.
Action: Ensure TLS 1.3 enforcement is enabled (it is by default).
gateway:
  tls:
    enforce-tls13: true
Environment variable: GATEWAY_TLS_ENFORCE_TLS13=true
Without TLS 1.3 enforcement, connections to providers may negotiate weaker TLS versions that are vulnerable to downgrade attacks.
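As an illustration of what enforcement means on the client side, the sketch below builds a Python `ssl` context that refuses anything older than TLS 1.3. It shows the concept only; it is not how the gateway implements the `enforce-tls13` property.

```python
import ssl


def tls13_only_context() -> ssl.SSLContext:
    """Client context that verifies certificates and rejects TLS 1.2 and older."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

A handshake with a server that only offers TLS 1.2 fails under this context instead of silently negotiating the older version.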
7. PII Detection and Enforcement
Configure PII scanning to prevent sensitive data (SSN, credit cards, emails, etc.) from reaching LLM providers.
Action: Set the default PII action to REDACT or BLOCK.
gateway:
  pii:
    enabled: true
    default-action: REDACT # BLOCK, REDACT, or LOG
    scan-responses: true # detect PII in LLM responses
    strip-before-cache: true # redact PII before semantic caching
    token-encryption-password: "<strong-password>"
Per-tenant overrides via tenant metadata:
pii.enabled=true
pii.action=REDACT
pii.custom-patterns (Map of label to regex)
Environment variable: GATEWAY_PII_DEFAULT_ACTION=REDACT
Without PII enforcement, sensitive data reaches LLM providers and may be stored in their training data or logs.
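The three actions and the label-to-regex shape of `pii.custom-patterns` can be sketched as below. The patterns, the `[LABEL]` placeholder format, and the function name are illustrative assumptions. The gateway's real detectors are not documented here and are likely more thorough (checksum validation, for example).

```python
import re

# Illustrative patterns only; production detectors need stricter validation.
DEFAULT_PATTERNS = {
    "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
    "EMAIL": r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b",
}


def apply_pii_action(text, action="REDACT", custom_patterns=None):
    """Return (text, labels_found). REDACT masks matches, LOG leaves text intact,
    BLOCK raises ValueError when any pattern matches."""
    patterns = {**DEFAULT_PATTERNS, **(custom_patterns or {})}
    found = []
    for label, regex in patterns.items():
        if re.search(regex, text):
            found.append(label)
            if action == "REDACT":
                text = re.sub(regex, f"[{label}]", text)
    if found and action == "BLOCK":
        raise ValueError(f"request blocked, PII detected: {sorted(found)}")
    return text, sorted(found)
```

A tenant-specific pattern is passed the same way a metadata override would supply it, for example `apply_pii_action(prompt, custom_patterns={"EMP_ID": r"\bEMP-\d{6}\b"})`.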
8. Budget Caps
Prevent runaway costs by configuring per-tenant and per-API-key budget caps.
Action: Create budget caps via the admin API or UI.
# Create a monthly budget cap for a tenant
curl -X POST http://localhost:8080/admin/v1/budgets \
  -H "Content-Type: application/json" \
  -d '{
    "tenantId": "tenant-prod",
    "name": "Monthly limit",
    "period": "MONTHLY",
    "limitUsd": 5000.00,
    "softLimitPct": 80,
    "enabled": true
  }'
Configure automatic model downgrade on soft limit breach via tenant metadata:
cost.downgrade-threshold-pct=80
cost.downgrade-rules=gpt-4o:gpt-4o-mini,claude-3-opus:claude-3-sonnet
Without budget caps, a misconfigured client or prompt injection attack can generate unbounded LLM costs.
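To show how the soft limit and the downgrade rules interact, here is a small sketch that parses the `from:to` rule string and picks a model based on spend. The decision logic (downgrade at or above the threshold, reject at or above the hard limit) is an assumption about the intended behaviour, and both function names are hypothetical.

```python
def parse_downgrade_rules(raw: str) -> dict:
    """Parse 'from:to,from:to' into a {from: to} mapping, skipping malformed pairs."""
    rules = {}
    for pair in raw.split(","):
        source, _, target = pair.strip().partition(":")
        if source and target:
            rules[source] = target
    return rules


def choose_model(requested, spent_usd, limit_usd, threshold_pct, rules):
    """Return the model to use, or None when the hard cap is reached (assumed behaviour)."""
    if spent_usd >= limit_usd:
        return None
    if spent_usd >= limit_usd * threshold_pct / 100:
        return rules.get(requested, requested)  # no rule: keep the requested model
    return requested
```

With the 5000 USD monthly cap and the 80 percent threshold from the examples above, downgrades would start once spend reaches 4000 USD.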
9. Webhook Alerting
Configure webhooks to receive real-time notifications for security and operational events.
Action: Create webhooks for critical event types.
curl -X POST http://localhost:8080/admin/v1/webhooks \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Security alerts",
    "url": "https://your-siem.example.com/webhooks/dvara",
    "secret": "<webhook-signing-secret>",
    "eventTypes": [
      "POLICY_DENIAL",
      "PII_DETECTED",
      "IP_ACCESS_DENIED",
      "INJECTION_DETECTED",
      "BUDGET_CAP_HARD",
      "AGENT_LOOP_DETECTED",
      "GUARDRAIL_BLOCKED",
      "COST_ANOMALY"
    ],
    "status": "ACTIVE"
  }'
Properties:
gateway.webhooks.enabled=true
gateway.webhooks.max-retries=3
gateway.webhooks.delivery-timeout-ms=5000
Without webhook alerting, security events go unnoticed until the next manual audit review.
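The webhook `secret` suggests that deliveries are signed so the receiver can authenticate them. This guide does not state the signing scheme or the header that carries the signature, so the sketch below assumes a hex-encoded HMAC-SHA256 over the raw request body. Treat both the scheme and the function names as assumptions and check them against the actual delivery format.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed signing scheme)."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_webhook(secret: str, body: bytes, received_signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_signature)
```

The receiver should verify against the exact bytes it received, before any JSON parsing or re-serialization, because re-encoding can change the byte sequence and break the comparison.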
10. RBAC Role Assignments
Review and restrict role assignments. Follow the principle of least privilege.
Recommended role mapping:
| Team | Role | Permissions |
|---|---|---|
| Platform team | org-admin | Full access (use sparingly) |
| Security/compliance | policy-admin | Policy, PII, guardrail, audit management |
| Finance | billing-admin | Pricing, costs, budgets, chargeback reports |
| Engineering | developer | Routes, API keys, read-only for most resources |
| Stakeholders | viewer | Read-only access to all resources |
Action: Audit current user roles and remove unnecessary org-admin assignments.
# List all users
curl http://localhost:8080/admin/v1/users
# Update roles for a user
curl -X PUT http://localhost:8080/admin/v1/users/{id}/roles \
  -H "Content-Type: application/json" \
  -d '{"roles": ["developer"]}'
Over-privileged users can modify policies, delete audit logs, or change budget caps.
11. Rate Limiting
Enable rate limiting to protect against abuse and ensure fair usage across tenants.
Action: The enterprise rate limiter uses Bucket4j with Redis. Ensure Redis is provisioned and rate limiting is configured per tenant or API key.
Rate limits are enforced by RateLimitServletFilter on all /v1/* data-plane paths. The MCP proxy applies its own McpRateLimitFilter at filter order 200.
Without rate limiting, a single client can monopolize gateway capacity, causing denial of service for other tenants.
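Bucket4j is a token-bucket library, so the underlying idea can be sketched in a few lines. This is a single-process illustration with an explicit clock for determinism. The class name and parameters are hypothetical, and the Redis-backed sharing of bucket state across gateway instances is not modelled.

```python
class TokenBucket:
    """Bucket holding up to `capacity` tokens, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.last = now

    def try_consume(self, now: float, cost: int = 1) -> bool:
        """Refill for the elapsed time, then take `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

One bucket per tenant or API key gives each client its own burst allowance and sustained rate, so one noisy client cannot drain capacity reserved for others.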
12. API Key Rotation
Regularly rotate API keys to limit the blast radius of a compromised key.
Action: Use the key rotation endpoint and update clients.
# Rotate an API key (returns new key, old key is revoked)
curl -X POST http://localhost:8080/admin/v1/tenants/{tid}/keys/{kid}/rotate
Recommended cadence: Every 90 days, or immediately if a compromise is suspected.
For MCP server credentials:
# Rotate MCP server credentials
curl -X POST http://localhost:8080/admin/v1/mcp/servers/{id}/credentials/rotate \
  -H "Content-Type: application/json" \
  -d '{"newCredentialRef": "secret/data/dvara/mcp/new-cred"}'
Without regular rotation, a leaked API key remains valid until someone notices and manually revokes it.
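A scheduled job can flag keys that have passed the 90-day cadence. The sketch below assumes key records carrying an `id` and a timezone-aware `createdAt` timestamp. Those field names are hypothetical and should be mapped to whatever your key inventory actually returns.

```python
from datetime import datetime, timedelta, timezone

ROTATION_INTERVAL = timedelta(days=90)


def keys_due_for_rotation(keys, now=None):
    """Return the ids of keys whose age is at least ROTATION_INTERVAL."""
    now = now or datetime.now(timezone.utc)
    return [key["id"] for key in keys if now - key["createdAt"] >= ROTATION_INTERVAL]
```

Each returned id can then be fed to the rotate endpoint shown above, followed by a rollout of the new key to the affected clients.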
13. SIEM Export
Forward audit events to your Security Information and Event Management (SIEM) system for centralized monitoring and alerting.
Action: Configure SIEM export. The built-in LoggingSiemExporter writes JSON to the siem.export logger, which can be routed to Splunk, Elasticsearch, or CloudWatch via log shipping.
For direct integration, configure Splunk HEC or CloudWatch exporters (requires custom bean registration).
Ensure the following audit event types are monitored:
POLICY_DENIED: blocked requests
PII_DETECTED / PII_REDACTED: sensitive data handling
IP_ACCESS_DENIED: unauthorized access attempts
AUTHORIZATION_DENIED: RBAC violations
INJECTION_DETECTED: prompt injection attempts
GUARDRAIL_BLOCKED: content policy violations
BUDGET_CAP_HARD: budget overruns
AGENT_LOOP_DETECTED: runaway agent sessions
CONFIG_SYNC_FAILURE: control plane connectivity issues
Without SIEM export, security incidents are only visible in local logs, which may be lost or tampered with.
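If the JSON lines from the siem.export logger are shipped through a custom pipeline, a filter for the event types listed above might look like the sketch below. The `eventType` field name is an assumption, since the exported JSON schema is not shown in this guide.

```python
import json

MONITORED_EVENT_TYPES = {
    "POLICY_DENIED", "PII_DETECTED", "PII_REDACTED", "IP_ACCESS_DENIED",
    "AUTHORIZATION_DENIED", "INJECTION_DETECTED", "GUARDRAIL_BLOCKED",
    "BUDGET_CAP_HARD", "AGENT_LOOP_DETECTED", "CONFIG_SYNC_FAILURE",
}


def monitored_events(log_lines):
    """Yield parsed events whose type is on the watch list; skip non-JSON lines."""
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        # "eventType" is a hypothetical field name; adjust to the real schema.
        if isinstance(event, dict) and event.get("eventType") in MONITORED_EVENT_TYPES:
            yield event
```

In most deployments the same selection is expressed as a saved search or alert rule in the SIEM itself; the code form is mainly useful for testing log routing end to end.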
14. mTLS for Provider Connections
Configure mutual TLS (client certificates) for outbound connections to LLM providers, especially in regulated environments.
Action: Configure per-provider mTLS settings.
gateway:
  tls:
    enforce-tls13: true
    providers:
      openai:
        mtls-enabled: true
        client-cert-path: /etc/dvara/certs/openai-client.pem
        client-key-path: /etc/dvara/certs/openai-client-key.pem
        trust-store-path: /etc/dvara/certs/openai-truststore.p12
        trust-store-password: "${OPENAI_TRUSTSTORE_PASSWORD}"
      anthropic:
        mtls-enabled: true
        client-cert-path: /etc/dvara/certs/anthropic-client.pem
        client-key-path: /etc/dvara/certs/anthropic-client-key.pem
Without mTLS, provider connections use one-way TLS only, so the provider cannot verify the gateway's identity.
15. Monitoring with Prometheus and Grafana
Set up comprehensive monitoring using the built-in Prometheus metrics endpoint.
Action: Configure Prometheus to scrape the gateway and set up Grafana dashboards.
Prometheus scrape config
scrape_configs:
  - job_name: dvara-gateway
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ['gateway:8080']
  - job_name: dvara-mcp-proxy
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ['mcp-proxy:8070']
Key metrics to alert on
| Metric | Condition | Severity |
|---|---|---|
| gateway_provider_errors_total | Rate > 10/min | Warning |
| gateway_latency_seconds (P99) | > 5s sustained | Warning |
| gateway_budget_blocked_total | Any increment | Info |
| gateway_guardrail_blocked_total | Rate > 5/min | Critical |
| gateway_policy_shadow_divergence_total | Any increment | Info |
| mcp_agent_loop_detected_total | Any increment | Warning |
| gateway_config_sync_failures_total | Any increment | Critical |
| gateway_cost_anomaly_total | Any increment | Warning |
OpenTelemetry distributed tracing
export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318/v1/traces
export TRACING_SAMPLING_PROBABILITY=0.1 # 10% sampling in production
Without monitoring, degraded performance, provider outages, and security events go undetected until users report issues.
Quick Reference: Minimum Production Configuration
The following environment variables represent the minimum set for a secure production deployment:
# Enterprise license
GATEWAY_ENTERPRISE_LICENSE_KEY=<jwt-token>
# Audit integrity
GATEWAY_AUDIT_HMAC_SECRET=<random-32-byte-base64>
# Authentication
GATEWAY_SECURITY_ENABLED=true
GATEWAY_OIDC_ISSUER_URI=https://your-idp.example.com/realms/dvara
GATEWAY_OIDC_AUDIENCE=dvara-gateway
# Secrets management (hashicorp, aws-secrets-manager, or azure-key-vault)
GATEWAY_VAULT_BACKEND=hashicorp
# IP access control
GATEWAY_IP_ACCESS_ENABLED=true
# PII protection
GATEWAY_PII_DEFAULT_ACTION=REDACT
# TLS
GATEWAY_TLS_ENFORCE_TLS13=true
# Observability
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318/v1/traces
TRACING_SAMPLING_PROBABILITY=0.1
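A deployment pipeline can check this minimum set before the gateway starts. The sketch below is a hypothetical preflight script, not part of Dvara. The variable names and the default secret come from this guide, while the accepted values and the 32-character length heuristic are assumptions.

```python
DEFAULT_HMAC_SECRET = "default-dev-secret-change-in-production"

REQUIRED_VARS = [
    "GATEWAY_ENTERPRISE_LICENSE_KEY", "GATEWAY_AUDIT_HMAC_SECRET",
    "GATEWAY_SECURITY_ENABLED", "GATEWAY_OIDC_ISSUER_URI", "GATEWAY_OIDC_AUDIENCE",
    "GATEWAY_VAULT_BACKEND", "GATEWAY_IP_ACCESS_ENABLED",
    "GATEWAY_PII_DEFAULT_ACTION", "GATEWAY_TLS_ENFORCE_TLS13",
    "OTEL_EXPORTER_OTLP_ENDPOINT", "TRACING_SAMPLING_PROBABILITY",
]

# Assumed "safe" values for the switches in the quick reference.
EXPECTED_VALUES = {
    "GATEWAY_SECURITY_ENABLED": {"true"},
    "GATEWAY_IP_ACCESS_ENABLED": {"true"},
    "GATEWAY_TLS_ENFORCE_TLS13": {"true"},
    "GATEWAY_PII_DEFAULT_ACTION": {"REDACT", "BLOCK"},
}


def preflight(env) -> list:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    for name, allowed in EXPECTED_VALUES.items():
        value = env.get(name)
        if value and value not in allowed:
            problems.append(f"{name}={value} is not one of {sorted(allowed)}")
    secret = env.get("GATEWAY_AUDIT_HMAC_SECRET", "")
    if secret == DEFAULT_HMAC_SECRET:
        problems.append("GATEWAY_AUDIT_HMAC_SECRET still uses the development default")
    elif secret and len(secret) < 32:
        problems.append("GATEWAY_AUDIT_HMAC_SECRET is shorter than 32 characters")
    return problems
```

Calling `preflight(os.environ)` from an entrypoint script and exiting non-zero when the list is non-empty keeps a half-configured gateway from reaching production.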