Configuration Reference
DVARA is configured through application.yml with environment variable overrides. Configuration is partitioned across three namespace families:
- Cross-cutting (dvara.*) — license, audit, encryption, region, credentials, vault, actuator. Every DVARA app reads them.
- LLM Gateway (dvara.llm-gateway.*) — providers, routes, resilience, rate-limit, cache, PII, guardrails, webhooks, IP access, FinOps, routing strategies.
- Flightdeck / MCP-only (dvara.flightdeck.* and dvara.mcp-gateway.*) — admin UI, OIDC/SAML, compliance, agentic governance, MCP-proxy timeouts. Bound only by the apps that need them.
Spring relaxed binding maps every property to an env var by uppercasing and replacing dots/hyphens with underscores — dvara.llm-gateway.providers.openai.api-key is settable as DVARA_LLM_GATEWAY_PROVIDERS_OPENAI_API_KEY or, more commonly, via the explicit env-var aliases like OPENAI_API_KEY declared in application.yml (see below).
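As a quick illustration of the naming rule described above, the following shell sketch derives an env var name from a property name. The helper `to_env_var` is hypothetical and only mirrors the rule as stated on this page; it is not part of DVARA or Spring.

```shell
# Hypothetical helper: derive an env var name from a property name using the
# rule stated above (uppercase, dots and hyphens become underscores).
to_env_var() {
  printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | tr '.-' '__'
}

to_env_var dvara.llm-gateway.providers.openai.api-key
# prints DVARA_LLM_GATEWAY_PROVIDERS_OPENAI_API_KEY
```

Where application.yml declares an explicit alias such as OPENAI_API_KEY, the alias is usually the more convenient name to set.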
The dvara.flightdeck.saas.* properties (Stripe price IDs, webhook signing secret, checkout URLs) plus the SaaS-funnel dvara.flightdeck.email.* configuration are for the dvarahq.com managed install only. They are intentionally not documented here. Self-hosted and managed-hosting customers do not need them — leave dvara.flightdeck.saas.enabled=false (the default) and the surface stays dark even if the JAR is on the classpath.
Configuration Hierarchy
Configuration is resolved in this order (later sources override earlier ones):
1. Defaults — hardcoded in application.yml
2. application.yml — file-based configuration
3. Environment variables — override any property (e.g., OPENAI_API_KEY)
4. System properties — JVM -D flags
5. bootstrap.yaml — when DVARA_BOOTSTRAP_FILE is set, seeds tenants, API keys, and routes into the database on first startup (idempotent; re-runs skip rows whose id already exists)
6. Database (Admin API / Flightdeck writes) — once seeded, ongoing changes to routes, policies, credentials, tenants, budgets, etc. are made through /v1/admin/* REST endpoints or the Flightdeck UI. These writes hot-reload across the fleet via PostgreSQL NOTIFY config_change; the data plane's DataPlaneConfigRefresher picks up the notification and either rebuilds the routing engine (routes), republishes the PolicyMutationEvent (policies), or evicts the relevant cache (credentials, plugins, semantic cache configs). No restart required for any persisted-config change.
Steps 1–4 govern boot-time configuration. Steps 5–6 govern persisted state and shadow the file-based config for the entities they own — once a route exists in the database, the YAML's dvara.llm-gateway.routes: entries no longer apply for that route id.
Full Application Configuration
dvara:
  # ── Cross-cutting (every DVARA app reads these) ──
  license:
    key: ${DVARA_LICENSE_KEY:} # signed DVARA- envelope; required at startup
  encryption:
    master-password: ${DVARA_ENCRYPTION_MASTER_PASSWORD:}
  actuator:
    api-key: ${DVARA_ACTUATOR_API_KEY:} # Bearer for /actuator/gateway-status + authenticated actuator paths
    metrics-api-key: ${DVARA_ACTUATOR_METRICS_API_KEY:} # Distinct Bearer for /actuator/prometheus
  audit:
    hmac-secret: ${DVARA_AUDIT_HMAC_SECRET:default-dev-secret-change-in-production}
    max-events: ${DVARA_AUDIT_MAX_EVENTS:100000}
    store-prompts-by-default: ${DVARA_AUDIT_STORE_PROMPTS:false}
    prompt-retention-days: ${DVARA_AUDIT_PROMPT_RETENTION_DAYS:90}
  region:
    id: ${DVARA_REGION_ID:} # region identity (blank = region-unaware)
    name: ${DVARA_REGION_NAME:} # human-readable region name
  credentials:
    require-tenant-credential: ${DVARA_CREDENTIALS_REQUIRE_TENANT_CREDENTIAL:false}
  # ── LLM Gateway ──
  llm-gateway:
    data-plane:
      require-api-key: ${DVARA_LLM_GATEWAY_REQUIRE_API_KEY:false} # set true in production
    # ── Provider Configuration ──
    providers:
      openai:
        api-key: ${OPENAI_API_KEY:} # blank = provider disabled
        # base-url: https://api.openai.com/v1 # override for Azure / proxy / long-tail OpenAI-compatible upstreams
      anthropic:
        api-key: ${ANTHROPIC_API_KEY:}
      gemini:
        api-key: ${GEMINI_API_KEY:}
      azure-openai:
        api-key: ${AZURE_OPENAI_API_KEY:}
        base-url: ${AZURE_OPENAI_BASE_URL:} # required, e.g. https://{resource}.openai.azure.com/openai
      mistral:
        api-key: ${MISTRAL_API_KEY:}
      cohere:
        api-key: ${COHERE_API_KEY:}
      groq:
        api-key: ${GROQ_API_KEY:}
      # ── First-class Chinese-region providers ──
      qwen:
        api-key: ${QWEN_API_KEY:} # Alibaba Dashscope
        # base-url defaults to https://dashscope.aliyuncs.com/compatible-mode/v1
      deepseek:
        api-key: ${DEEPSEEK_API_KEY:}
        # base-url defaults to https://api.deepseek.com/v1
      moonshot:
        api-key: ${MOONSHOT_API_KEY:} # Kimi
        # base-url defaults to https://api.moonshot.cn/v1
      chatglm:
        api-key: ${ZHIPU_API_KEY:} # note: env var is ZHIPU_*, not CHATGLM_*
        # base-url defaults to https://open.bigmodel.cn/api/paas/v4
      grok:
        api-key: ${XAI_API_KEY:} # note: env var is XAI_*, not GROK_*
        # base-url defaults to https://api.x.ai/v1
      ollama:
        enabled: ${OLLAMA_ENABLED:false}
        base-url: ${OLLAMA_BASE_URL:http://localhost:11434}
      bedrock:
        enabled: ${BEDROCK_ENABLED:false}
        access-key: ${AWS_ACCESS_KEY_ID:}
        secret-key: ${AWS_SECRET_ACCESS_KEY:}
        region: ${AWS_REGION:us-east-1}
      mock:
        enabled: ${MOCK_PROVIDER_ENABLED:false} # dev / CI only — bundles Groovy for scripted matchers
        response: "This is a mock response" # static text or `groovy:`-prefixed script
        latency-ms: 100 # simulated delay
        stream-token-delay-ms: 20 # inter-token SSE delay
        error-rate: 0.0 # 0.0–1.0 failure fraction
    # ── Route Configuration ──
    routes:
      - id: load-balance-gpt
        model-pattern: "gpt*"
        strategy: round-robin
        providers:
          - provider: openai
          - provider: bedrock
      - id: weighted-claude
        model-pattern: "claude*"
        strategy: weighted
        providers:
          - provider: anthropic
            weight: 70
          - provider: bedrock
            weight: 30
    # ── Resilience ──
    resilience:
      retry:
        max-attempts: 3
        initial-backoff-ms: 500
        backoff-multiplier: 2.0
        max-backoff-ms: 10000
      circuit-breaker:
        failure-rate-threshold: 50
        sliding-window-size: 10
        minimum-number-of-calls: 5
        wait-duration-in-open-state-ms: 30000
        permitted-calls-in-half-open: 3
      timeout:
        chat-timeout-ms: 30000
        streaming-timeout-ms: 120000
      fallback:
        enabled: true
    # ── Rate Limiting (always backed by embedded Hazelcast) ──
    rate-limit:
      enabled: false
      per-key:
        requests-per-minute: 100 # max requests per API key per 60-second sliding window
        tokens-per-minute: 100000 # max tokens per API key per 60-second sliding window
    # ── Response Caching (exact-match) ──
    cache:
      enabled: false
      ttl-seconds: 3600
      max-size: 10000
# ── Observability ──
management:
  endpoints:
    web:
      exposure:
        include: health,prometheus,gateway-status,info # exposed actuator endpoints
        exclude: env,heapdump,threaddump,beans,mappings,configprops,loggers,scheduledtasks,caches,sessions,quartz # dangerous endpoints — return 404 regardless of auth
  endpoint:
    health:
      show-details: when-authorized # `always` is rejected at boot; use `when-authorized` or `never`
  prometheus:
    metrics:
      export:
        enabled: true # enable Prometheus scrape endpoint
  tracing:
    sampling:
      probability: ${TRACING_SAMPLING_PROBABILITY:1.0} # 0.0–1.0
  otlp:
    tracing:
      endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT:http://localhost:4318/v1/traces}
# ── JVM and logging ──
spring:
  threads:
    virtual:
      enabled: true # Project Loom virtual threads
  # profiles:
  #   active: log-plain # switch from JSON to plain-text logging
server:
  port: 8080
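To make the resilience.retry values in the configuration above concrete, here is a minimal sketch of the delay schedule they imply under plain exponential backoff. The function `backoff_schedule` is hypothetical. It assumes no jitter and that max-attempts counts the initial call (so 3 attempts means at most 2 retries); the gateway's actual retry semantics may differ.

```shell
# Sketch: print the delay in ms before each retry under plain exponential
# backoff, capped at max-backoff-ms. Assumes no jitter and that max-attempts
# includes the first call; DVARA's exact behaviour is not confirmed here.
backoff_schedule() {
  # $1 max-attempts, $2 initial-backoff-ms, $3 backoff-multiplier, $4 max-backoff-ms
  awk -v n="$1" -v d="$2" -v m="$3" -v cap="$4" 'BEGIN {
    for (i = 1; i < n; i++) {
      printf "%d\n", (d > cap ? cap : d)
      d = d * m
    }
  }'
}

# Values from the example configuration: prints 500, then 1000
backoff_schedule 3 500 2.0 10000
```

With a larger max-attempts the delays keep doubling until they reach the 10000 ms cap.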
Environment Variable Reference
| Environment Variable | Property | Description |
|---|---|---|
| OPENAI_API_KEY | dvara.llm-gateway.providers.openai.api-key | OpenAI API key |
| ANTHROPIC_API_KEY | dvara.llm-gateway.providers.anthropic.api-key | Anthropic API key |
| GEMINI_API_KEY | dvara.llm-gateway.providers.gemini.api-key | Google Gemini API key |
| OLLAMA_ENABLED | dvara.llm-gateway.providers.ollama.enabled | Enable Ollama provider |
| OLLAMA_BASE_URL | dvara.llm-gateway.providers.ollama.base-url | Ollama server URL |
| BEDROCK_ENABLED | dvara.llm-gateway.providers.bedrock.enabled | Enable AWS Bedrock provider |
| AWS_ACCESS_KEY_ID | dvara.llm-gateway.providers.bedrock.access-key | AWS access key for Bedrock |
| AWS_SECRET_ACCESS_KEY | dvara.llm-gateway.providers.bedrock.secret-key | AWS secret key for Bedrock |
| AWS_REGION | dvara.llm-gateway.providers.bedrock.region | AWS region for Bedrock |
| AZURE_OPENAI_API_KEY | dvara.llm-gateway.providers.azure-openai.api-key | Azure OpenAI API key |
| AZURE_OPENAI_BASE_URL | dvara.llm-gateway.providers.azure-openai.base-url | Azure OpenAI deployment URL (required) |
| MISTRAL_API_KEY | dvara.llm-gateway.providers.mistral.api-key | Mistral API key |
| COHERE_API_KEY | dvara.llm-gateway.providers.cohere.api-key | Cohere API key |
| GROQ_API_KEY | dvara.llm-gateway.providers.groq.api-key | Groq API key |
| QWEN_API_KEY | dvara.llm-gateway.providers.qwen.api-key | Alibaba Qwen / Dashscope API key. Default base URL https://dashscope.aliyuncs.com/compatible-mode/v1. |
| DEEPSEEK_API_KEY | dvara.llm-gateway.providers.deepseek.api-key | DeepSeek API key. Default base URL https://api.deepseek.com/v1. |
| MOONSHOT_API_KEY | dvara.llm-gateway.providers.moonshot.api-key | Moonshot / Kimi API key. Default base URL https://api.moonshot.cn/v1. |
| ZHIPU_API_KEY | dvara.llm-gateway.providers.chatglm.api-key | ChatGLM / Zhipu API key — note the env-var name is ZHIPU_*, not CHATGLM_* (matches the brand's API auth header). Default base URL https://open.bigmodel.cn/api/paas/v4. |
| XAI_API_KEY | dvara.llm-gateway.providers.grok.api-key | xAI Grok API key — note the env-var name is XAI_*, not GROK_* (matches xAI's auth convention). Default base URL https://api.x.ai/v1. |
| MOCK_PROVIDER_ENABLED | dvara.llm-gateway.providers.mock.enabled | Enable mock provider for testing. Dev / CI only — the Mock provider bundles Groovy for scenario scripting, which means arbitrary code execution inside the gateway JVM. A startup WARN fires whenever Mock is enabled outside the dev / test / ci / local / default Spring profile allow-list. |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_RESPONSE | dvara.llm-gateway.providers.mock.response | Default mock response text (or groovy:-prefixed script) |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_LATENCY_MS | dvara.llm-gateway.providers.mock.latency-ms | Simulated latency per call in ms (default: 100) |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_STREAM_TOKEN_DELAY_MS | dvara.llm-gateway.providers.mock.stream-token-delay-ms | Inter-token delay for streaming (default: 20) |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_ERROR_RATE | dvara.llm-gateway.providers.mock.error-rate | Probability [0.0, 1.0] of injected PROVIDER_ERROR |
| — | dvara.llm-gateway.providers.mock.matchers | Wiremock-style conditional matchers (YAML list — see Provider Setup → Mock) |
| — | dvara.llm-gateway.providers.mock.scenarios-dir | Directory of *.groovy scenario files with hot reload |
| — | dvara.llm-gateway.providers.mock.console-authoring | Opt in to the DVARA Flightdeck authoring UI + REST API at /v1/admin/mock/scenarios/** (default: false) |
| — | dvara.llm-gateway.providers.mock.audit-sample-rate | Fraction of MOCK_MATCHER_FIRED events to audit (default: 1.0) |
| DVARA_REGION_ID | dvara.region.id | Region identity for this gateway instance |
| DVARA_REGION_NAME | dvara.region.name | Human-readable region name |
| TRACING_SAMPLING_PROBABILITY | management.tracing.sampling.probability | Trace sampling rate (0.0–1.0, default: 1.0) |
| OTEL_EXPORTER_OTLP_ENDPOINT | management.otlp.tracing.endpoint | OTLP HTTP endpoint for trace export |
| DVARA_FLIGHTDECK_COMPLIANCE_SOC2_SCHEDULE | dvara.flightdeck.compliance.soc2-schedule | Cron expression for scheduled SOC2 reports |
| DVARA_FLIGHTDECK_COMPLIANCE_HIPAA_SCHEDULE | dvara.flightdeck.compliance.hipaa-schedule | Cron expression for scheduled HIPAA reports |
| DVARA_FLIGHTDECK_COMPLIANCE_GDPR_SCHEDULE | dvara.flightdeck.compliance.gdpr-schedule | Cron expression for scheduled GDPR reports |
| DVARA_FLIGHTDECK_COMPLIANCE_DEFAULT_TENANT | dvara.flightdeck.compliance.default-tenant-id | Tenant for scheduled reports (blank = all) |
| DVARA_FLIGHTDECK_COMPLIANCE_RETENTION_DAYS | dvara.flightdeck.compliance.retention-days | Report retention period (default: 365) |
| DVARA_LLM_GATEWAY_PII_ENABLED | dvara.llm-gateway.pii.enabled | Enable PII detection (default: true) |
| DVARA_LLM_GATEWAY_PII_DEFAULT_ACTION | dvara.llm-gateway.pii.default-action | Default PII action: LOG, BLOCK, REDACT |
| DVARA_LLM_GATEWAY_PII_SCAN_RESPONSES | dvara.llm-gateway.pii.scan-responses | Scan LLM responses for PII (default: true) |
| DVARA_LLM_GATEWAY_PII_STRIP_BEFORE_CACHE | dvara.llm-gateway.pii.strip-before-cache | Redact PII before caching (default: true) |
| DVARA_LLM_GATEWAY_PII_TOKEN_ENCRYPTION_PASSWORD | dvara.llm-gateway.pii.token-encryption-password | AES-256-GCM key for PII token encryption |
| DVARA_LLM_GATEWAY_PII_MAX_TOKENS_PER_TENANT | dvara.llm-gateway.pii.max-tokens-per-tenant | Max PII tokens per tenant (default: 50000) |
| DVARA_LLM_GATEWAY_PII_TOKEN_RETENTION_DAYS | dvara.llm-gateway.pii.token-retention-days | PII token retention period (default: 30) |
| DVARA_LICENSE_KEY | dvara.license.key | Signed DVARA license key for enterprise features |
| DVARA_ACTUATOR_API_KEY | dvara.actuator.api-key | Shared-secret Bearer required on /actuator/gateway-status and every authenticated actuator path except /actuator/prometheus. Generate with openssl rand -base64 32. Set the same value on the DVARA Flightdeck so it can fetch the rich status payload that powers the License page. The legacy property name gateway.control-plane.api-key (env: GATEWAY_CONTROL_PLANE_API_KEY) is accepted as a one-release fallback with a deprecation WARN; the legacy property name is removed in 1.1.0-GA. The current DVARA_ACTUATOR_API_KEY env var is the canonical name and not subject to removal. |
| DVARA_ACTUATOR_METRICS_API_KEY | dvara.actuator.metrics-api-key | Distinct Bearer required only on /actuator/prometheus. Generate independently — a leaked metrics-scrape token must not unlock the license envelope. Configure your Prometheus scrape job's bearer_token_file to point at this value. |
| DVARA_LLM_GATEWAY_REQUIRE_API_KEY | dvara.llm-gateway.data-plane.require-api-key | Require a valid API key on every /v1/* request (default: false). When true, requests without a valid Authorization: Bearer header are rejected with 401. |
| DVARA_CREDENTIALS_REQUIRE_TENANT_CREDENTIAL | dvara.credentials.require-tenant-credential | Strict-BYOK enforcement (default: false). When true, tenants without their own active provider credential are rejected with TENANT_CREDENTIAL_REQUIRED (HTTP 403) instead of borrowing the platform-default credential. Per-tenant override via Tenant.metadata["credentials.require-tenant-credential"]. |
| — | dvara.llm-gateway.credentials.grace-expiry-initial-delay-ms | Initial delay before the first credential grace-expiry sweep runs after startup (default: 60000). |
| — | dvara.llm-gateway.credentials.grace-expiry-interval-ms | Interval between credential grace-expiry sweeps (default: 60000). Each sweep transitions expired GRACE credentials to SUPERSEDED and emits a CREDENTIAL_GRACE_EXPIRED audit event. |
| DVARA_FLIGHTDECK_PLAYGROUND_ENABLED | dvara.flightdeck.playground.enabled | Enable the Flightdeck Playground (Console /playground for owner + Portal /portal/playground for tenant admin / developer; default: true). Set false in production to prevent ad-hoc LLM calls from the Console + Portal. |
| DVARA_ENCRYPTION_MASTER_PASSWORD | dvara.encryption.master-password | AES-256-GCM master password for ENC: decryption and database credential storage. Required for credentials created in ENCRYPTED storage mode (the default). Optional for REFERENCE-only deployments — REFERENCE credentials store only a vault pointer and never hit the AES path, so a zero-trust deployment that creates all credentials with storageMode=REFERENCE can omit this property entirely. |
| DVARA_FLIGHTDECK_SECURITY_ENABLED | dvara.flightdeck.security.enabled | Enable authentication for admin endpoints (default: true). When true with no OIDC issuer-uri, built-in email/password auth activates. |
| DVARA_FLIGHTDECK_OIDC_ISSUER_URI | dvara.flightdeck.security.oidc.issuer-uri | OIDC issuer for JWT validation |
| DVARA_FLIGHTDECK_OIDC_AUDIENCE | dvara.flightdeck.security.oidc.audience | Expected JWT audience |
| DVARA_FLIGHTDECK_OIDC_ROLE_CLAIM | dvara.flightdeck.security.oidc.role-claim | JWT claim for roles (default: roles) |
| DVARA_FLIGHTDECK_RBAC_ENABLED | dvara.flightdeck.security.rbac.enabled | Enable URL-pattern RBAC (default: true) |
| DVARA_AUDIT_HMAC_SECRET | dvara.audit.hmac-secret | HMAC-SHA256 key for signing audit events |
| DVARA_AUDIT_MAX_EVENTS | dvara.audit.max-events | Audit store capacity (default: 100000) |
| DVARA_VAULT_BACKEND | dvara.vault.backend | Vault backend: hashicorp, aws-secrets-manager, azure-key-vault |
| DVARA_LLM_GATEWAY_IP_ACCESS_ENABLED | dvara.llm-gateway.ip-access.enabled | Enable IP allowlist/denylist (default: false) |
| DVARA_LLM_GATEWAY_WEBHOOKS_ENABLED | dvara.llm-gateway.webhooks.enabled | Enable webhook delivery (default: true) |
| DVARA_DB_POOL_SIZE | spring.datasource.hikari.maximum-pool-size | JDBC connection pool size (default: 2 for data plane) |
| DVARA_CONFIG_FILE | — | Path to gateway.yaml (default: ./gateway.yaml) |
| DVARA_API_KEY | — | Static API key for standalone mode |
| DVARA_BOOTSTRAP_FILE | — | Path to bootstrap.yaml for first-startup seeding |
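Because the two actuator tokens in the table above are meant to be generated independently, a small preflight check before deployment can catch an accidental reuse. The sketch below is only an illustration: `check_actuator_keys` is a hypothetical helper, and treating both tokens as mandatory is an assumption for deployments that expose both endpoint groups.

```shell
# Hypothetical preflight check: both actuator tokens must be non-empty and
# must differ, so a leaked metrics token cannot unlock the other endpoints.
check_actuator_keys() {
  api_key="$1"
  metrics_key="$2"
  [ -n "$api_key" ] || { echo "DVARA_ACTUATOR_API_KEY is empty" >&2; return 1; }
  [ -n "$metrics_key" ] || { echo "DVARA_ACTUATOR_METRICS_API_KEY is empty" >&2; return 1; }
  [ "$api_key" != "$metrics_key" ] || { echo "actuator tokens must differ" >&2; return 1; }
}

# Typical use (generation command taken from the table above):
#   export DVARA_ACTUATOR_API_KEY="$(openssl rand -base64 32)"
#   export DVARA_ACTUATOR_METRICS_API_KEY="$(openssl rand -base64 32)"
#   check_actuator_keys "$DVARA_ACTUATOR_API_KEY" "$DVARA_ACTUATOR_METRICS_API_KEY"
```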
Provider Credential Management
Provider API keys (OpenAI, Anthropic, Gemini, etc.) can be configured through three methods. The gateway checks them in priority order — the first match wins:
1. Database credential (DVARA Flightdeck / API) → highest priority
2. Vault (HashiCorp, AWS SM, Azure KV) → middle priority
3. Environment variable / property → lowest priority (fallback)
Method 1: Environment Variables (simplest)
Set the API key as an environment variable. This is the quickest way to get started:
export OPENAI_API_KEY=sk-your-key
export ANTHROPIC_API_KEY=sk-ant-your-key
The gateway reads these at startup. To change a key, you must restart the gateway.
Method 2: DVARA Flightdeck / API (recommended for teams)
Store credentials in the database through the DVARA Flightdeck or REST API. Keys are encrypted with AES-256-GCM before storage and decrypted at request time. No restart needed for rotation.
Prerequisites: Set DVARA_ENCRYPTION_MASTER_PASSWORD for AES encryption.
Via the DVARA Flightdeck:
- Navigate to Credentials → Add Credential
- Select the provider and enter the API key
- The key is encrypted and stored — it takes effect immediately
Via the API:
# Create a credential
curl -X POST http://localhost:8090/v1/admin/credentials \
-H "Content-Type: application/json" \
-d '{
"name": "openai-prod",
"provider": "openai",
"secretKey": "provider.openai.api-key",
"apiKey": "sk-your-actual-key"
}'
# Rotate (new key takes effect immediately)
curl -X POST http://localhost:8090/v1/admin/credentials/{id}/rotate \
-H "Content-Type: application/json" \
-d '{"apiKey": "sk-new-key"}'
# Revoke (falls back to vault or env var)
curl -X POST http://localhost:8090/v1/admin/credentials/{id}/revoke
Method 3: Vault (recommended for production)
Delegate credential storage to an external secrets manager. See Vault Configuration below.
How Priority Works — Examples
Example 1: Database overrides env var
# Env var set at startup
OPENAI_API_KEY=sk-old-key
# Later, admin adds a credential via UI for provider.openai.api-key
# → Gateway immediately uses the database credential, not the env var
Example 2: Revoke falls back to env var
# Database credential for OpenAI: ACTIVE
# Env var: OPENAI_API_KEY=sk-fallback-key
# Admin revokes the database credential
# → Gateway falls back to sk-fallback-key from the env var
Example 3: Database + vault coexistence
# Vault has provider.openai.api-key = sk-from-vault
# Database has ACTIVE credential for provider.openai.api-key = sk-from-db
# → Gateway uses sk-from-db (database wins over vault)
# If database credential is revoked → sk-from-vault is used
# If vault entry is also removed → falls back to OPENAI_API_KEY env var
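The lookup order in the examples above can be sketched as a first-non-empty selection. This is an illustration of the documented priority only; `resolve_credential` is a hypothetical function, and an empty string stands in for "this source has no active credential".

```shell
# Sketch of the documented lookup order: database, then vault, then env var.
# An empty argument means the source has no active credential.
resolve_credential() {
  db_value="$1"
  vault_value="$2"
  env_value="$3"
  if [ -n "$db_value" ]; then
    echo "database:$db_value"
  elif [ -n "$vault_value" ]; then
    echo "vault:$vault_value"
  elif [ -n "$env_value" ]; then
    echo "env:$env_value"
  else
    return 1   # no credential available from any source
  fi
}

resolve_credential "sk-from-db" "sk-from-vault" "sk-fallback-key"   # database:sk-from-db
resolve_credential "" "sk-from-vault" "sk-fallback-key"             # vault:sk-from-vault
resolve_credential "" "" "sk-fallback-key"                          # env:sk-fallback-key
```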
:::tip When to use which method
- Environment variables — local development, CI/CD, single-operator setups
- DVARA Flightdeck / API — team environments where multiple people manage credentials, or when you need rotation without restarts
- Vault — production environments with strict secret management policies
:::
Authentication Configuration
DVARA uses a two-domain authentication model: the data plane and the admin API are secured independently.
- Data plane (/v1/chat/completions, /v1/embeddings, etc.) is always protected by API keys. Every request must include a valid Authorization: Bearer <api-key> header. This is active regardless of the dvara.flightdeck.security.enabled setting.
- Admin API (/v1/admin/*) authentication is enabled by default (dvara.flightdeck.security.enabled=true). Three authentication modes are available:
  - Built-in email/password (default) — activates when security.enabled=true and no OIDC issuer-uri or SAML metadata-url is set. On first visit, /setup creates the initial admin account. No external IdP required.
  - OIDC/JWT — activates when dvara.flightdeck.security.oidc.issuer-uri is set. Requires an external IdP (Keycloak, Okta, Azure AD, Auth0).
  - SAML 2.0 SSO — activates when dvara.flightdeck.security.saml.metadata-url is set (mutually exclusive with OIDC).
dvara:
  flightdeck:
    security:
      enabled: true # Default: admin API requires authentication
      # Built-in auth activates automatically when no OIDC/SAML is configured.
      # To use OIDC instead, set the issuer-uri:
      oidc:
        issuer-uri: ${DVARA_FLIGHTDECK_OIDC_ISSUER_URI:}
        audience: ${DVARA_FLIGHTDECK_OIDC_AUDIENCE:}
        role-claim: ${DVARA_FLIGHTDECK_OIDC_ROLE_CLAIM:roles}
        name-claim: ${DVARA_FLIGHTDECK_OIDC_NAME_CLAIM:name}
        tenant-claim: ${DVARA_FLIGHTDECK_OIDC_TENANT_CLAIM:tenant_id}
      rbac:
        enabled: true # URL-pattern RBAC when security is enabled
      session:
        timeout-seconds: 3600
Setting dvara.flightdeck.security.enabled=false disables all admin authentication. Anyone with network access to port 8090 can modify gateway configuration. Only use this for local development.
When dvara.flightdeck.security.enabled=true:
- Without OIDC or SAML configured, built-in email/password authentication activates automatically. The first user is created via the /setup page. Additional users are invited by existing admins.
- With issuer-uri configured, JWT tokens are validated against the configured issuer via OIDC discovery.
- Roles are extracted from the JWT claim specified by role-claim and mapped to DVARA's six built-in roles: owner, policy-admin, billing-admin (platform), plus admin, developer, viewer (tenant).
- When rbac.enabled=true (the default when security is on), URL-pattern RBAC rules enforce fine-grained access to each admin endpoint.
- When rbac.enabled=false, any authenticated user can access all admin endpoints.
Tenant Configuration
Per-tenant configuration is not set via YAML files. Instead, tenant settings are managed through the Admin API (POST/PUT /v1/admin/tenants) as metadata key-value pairs on each tenant object.
This means tenant-level customization (PII policy, guardrail behavior, cost controls, IP access, approval gates, etc.) is dynamic and does not require a gateway restart.
# Per-tenant config is set via Tenant.metadata (not YAML)
# Managed through Admin API: POST/PUT /v1/admin/tenants
# Example metadata keys:
# priority-tier: premium|standard|bulk
# pii.enabled: true
# pii.action: BLOCK|REDACT|LOG
# pii.custom-patterns: {"label": "regex"}
# pii.scan-responses: true
# guardrail.enabled: true
# guardrail.action: BLOCK|FLAG|LOG
# guardrail.risk-score-threshold: 0.7
# cost.downgrade-threshold-pct: 80
# cost.downgrade-rules: "gpt-4o:gpt-4o-mini,claude-3-opus:claude-3-sonnet"
# cost.anomaly-threshold-pct: 200
# ip-access.allowlist: "10.0.0.0/8,172.16.0.0/12"
# ip-access.denylist: "192.168.1.100/32"
# approval.required-tools: "database_*,filesystem_*"
# approval.required-servers: "server-a,server-b"
# approval.timeout-seconds: 300
# audit.store-prompts: true
# agentic.loop-detection.enabled: true
# agentic.loop-detection.repetition-threshold: 5
# agentic.loop-detection.auto-kill: false
Example API call to set tenant metadata:
curl -X PUT http://localhost:8090/v1/admin/tenants/acme-corp \
-H "Content-Type: application/json" \
-d '{
"name": "Acme Corp",
"metadata": {
"priority-tier": "premium",
"pii.enabled": "true",
"pii.action": "REDACT",
"guardrail.enabled": "true",
"guardrail.action": "BLOCK",
"cost.downgrade-rules": "gpt-4o:gpt-4o-mini",
"ip-access.allowlist": "10.0.0.0/8",
"audit.store-prompts": "true"
}
}'
Global defaults for PII, guardrails, budgets, and other enterprise features are configured in application.yml under their respective namespaces (e.g., dvara.llm-gateway.pii.*, dvara.llm-gateway.guardrail.*, dvara.llm-gateway.finops.*). Tenant metadata overrides these global defaults on a per-tenant basis.
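The override relationship described above amounts to "tenant value if present, otherwise the global default". A minimal sketch, assuming an empty string represents a metadata key that is not set (`effective_setting` is a hypothetical helper, not a gateway function):

```shell
# Sketch: a tenant metadata value wins when present; otherwise the global
# default from application.yml applies.
effective_setting() {
  tenant_value="$1"    # value from Tenant.metadata, empty if the key is unset
  global_default="$2"  # value configured globally
  echo "${tenant_value:-$global_default}"
}

effective_setting "REDACT" "LOG"   # tenant override present: REDACT
effective_setting "" "LOG"         # no tenant override: LOG
```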
Per-Tenant Metadata Key Reference
Every per-tenant override key the gateway recognises. Values are strings unless noted; booleans accept true/false/yes/no/on/off/1/0 (case-insensitive) per Spring's relaxed binding.
| Key | Type | Subsystem | Behaviour |
|---|---|---|---|
| priority-tier | premium / standard / bulk | Routing | Tenant's priority class for concurrency-based admission control. See Priority routing. |
| credentials.require-tenant-credential | boolean | BYOK | Narrows or relaxes the global strict-BYOK policy (dvara.credentials.require-tenant-credential) for this tenant. Useful for tiered plans where free-tier tenants share the platform-default credential and paid tiers must BYOK. |
| pat.max-ttl-days | int (≤ 365) | Auth | Narrows the platform PAT TTL ceiling for this tenant. Hard cap of 365 always applies. |
| pii.enabled | boolean | PII | Override PII detection enable/disable for this tenant. |
| pii.action | BLOCK / REDACT / LOG | PII | Override default PII action. |
| pii.scan-responses | boolean | PII | Override response scanning. |
| pii.scan-streaming-responses | boolean | PII | Override streaming-response scanning. |
| pii.custom-patterns | JSON map | PII | Custom regex patterns: {"label": "regex", ...} |
| phileas.enabled-filters | CSV of FilterType names | PII | Restrict the embedded Phileas scanner to a subset of its 17 filter types for this tenant. |
| guardrail.enabled | boolean | Guardrails | Override guardrail detection. |
| guardrail.action | BLOCK / FLAG / LOG | Guardrails | Override default action. |
| guardrail.risk-score-threshold | double (0.0–1.0) | Guardrails | Override risk-score threshold. |
| guardrail.scan-streaming-responses | boolean | Guardrails | Override streaming-response scanning. |
| guardrail.max-input-tokens | int | Guardrails | Override OWASP LLM10 input-token cap. |
| guardrail.max-messages-per-request | int | Guardrails | Override message count cap. |
| guardrail.max-message-length | int | Guardrails | Override per-message char cap. |
| guardrail.default-max-response-tokens | int | Guardrails | Override applied when client omits max_tokens. |
| guardrail.injection.custom-patterns | JSON map | Guardrails | Custom injection patterns. |
| guardrail.content.profanity.action | BLOCK / FLAG / LOG | Guardrails | Per-category action override. |
| guardrail.content.violence.action | BLOCK / FLAG / LOG | Guardrails | Per-category action override. |
| guardrail.content.sexual.action | BLOCK / FLAG / LOG | Guardrails | Per-category action override. |
| guardrail.content.competitor.keywords | CSV | Guardrails | Competitor brand keywords. |
| guardrail.content.topic-restrictions | CSV | Guardrails | Restricted topic keywords. |
| guardrail.content.custom-denylist | JSON map | Guardrails | Custom deny-list patterns. |
| guardrail.mcp-injection.enabled | boolean | Guardrails | Enable MCP injection scanning. |
| guardrail.mcp-injection.action | BLOCK / FLAG / SANITIZE | Guardrails | MCP injection action. |
| guardrail.context.warning-threshold-pct | int (0–100) | Guardrails | Context-window warning threshold (default 70). |
| guardrail.context.hard-threshold-pct | int (0–100) | Guardrails | Context-window hard threshold (default 90). |
| guardrail.context.pruning-strategy | NONE / TRUNCATE_OLDEST / TRUNCATE_MIDDLE | Guardrails | Pruning strategy on context overflow. |
| guardrail.plugins | JSON map | Guardrails | Per-plugin config: {"pluginName": {"enabled": false, ...}} |
| grounding.enabled | boolean | Grounding | Override embedding-based hallucination detection. |
| grounding.action | BLOCK / FLAG / LOG | Grounding | Override action on ungrounded claims. |
| grounding.max-sources | int | Grounding | Override source-document cap. |
| grounding.max-source-length | int | Grounding | Override per-source char cap. |
| cost.downgrade-threshold-pct | int (0–100) | FinOps | Budget utilization % at which model downgrade triggers (default: 80). |
| cost.downgrade-rules | CSV from:to pairs | FinOps | Model downgrade rules, e.g. gpt-4o:gpt-4o-mini,claude-3-opus:claude-3-sonnet. |
| cost.anomaly-threshold-pct | int | FinOps | Per-tenant override of dvara.llm-gateway.finops.anomaly-threshold-pct (default: 200 = 2× baseline). |
| ip-access.allowlist | CSV CIDRs | Network | Per-tenant IP allowlist. |
| ip-access.denylist | CSV CIDRs | Network | Per-tenant IP denylist. |
| approval.required-tools | CSV glob patterns | Agentic | Tools that require human-in-the-loop approval (e.g. database_*,filesystem_*). |
| approval.required-servers | CSV server IDs | Agentic | MCP servers whose every tool call requires approval. |
| approval.timeout-seconds | int | Agentic | Per-tenant approval timeout override. |
| approval.default-action | approve / deny | Agentic | Action on timeout. |
| agentic.loop-detection.enabled | boolean | Agentic | Override loop detection. |
| agentic.loop-detection.repetition-threshold | int | Agentic | Override consecutive same-tool threshold. |
| agentic.loop-detection.max-calls-per-minute | int | Agentic | Override rate-limit threshold. |
| agentic.loop-detection.auto-kill | boolean | Agentic | Auto-kill session on loop detection. |
| audit.store-prompts | boolean | Audit | Opt-in prompt storage for compliance. |
| audit.archive.retention-days | int | Audit | Per-tenant override for the archive job's PG retention window. |
To set any of these keys, send PUT /v1/admin/tenants/{id} with a metadata JSON object in the request body. To unset a key, set its value to null; the global default then takes over.
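As an example of how one of these string-encoded values can be read, the sketch below looks up the downgrade target for a model in a cost.downgrade-rules value (comma-separated from:to pairs). `downgrade_target` is a hypothetical helper written for illustration; the gateway's own parsing, including how it handles whitespace or malformed pairs, is not documented here.

```shell
# Sketch: find the downgrade target for a model in a cost.downgrade-rules
# value. Prints the model unchanged when no rule matches.
downgrade_target() {
  model="$1"
  rules="$2"
  old_ifs="$IFS"
  IFS=','
  for rule in $rules; do
    if [ "${rule%%:*}" = "$model" ]; then   # text before the first colon
      IFS="$old_ifs"
      echo "${rule#*:}"                     # text after the first colon
      return 0
    fi
  done
  IFS="$old_ifs"
  echo "$model"
}

downgrade_target gpt-4o "gpt-4o:gpt-4o-mini,claude-3-opus:claude-3-sonnet"
# prints gpt-4o-mini
```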
Bootstrap File Seeding
Set DVARA_BOOTSTRAP_FILE to a YAML file path to seed tenants, API keys, and routes on first startup. This is designed for first-time tenant seeding before the Flightdeck UI is in use; entries are idempotently upserted on every boot, so re-running with the same file is safe.
docker run -d --name dvara-gateway \
-p 8080:8080 \
-v $(pwd)/bootstrap.yaml:/etc/dvara/bootstrap.yaml \
-e DVARA_BOOTSTRAP_FILE=/etc/dvara/bootstrap.yaml \
ghcr.io/dvarahq/dvara/dvara-llm-gateway:latest
The bootstrap loader is idempotent: it checks the config version and skips seeding if any configuration has already been applied (from a previous startup or from gateway.yaml).
tenants:
  - id: acme-corp
    name: Acme Corp
    status: active
api_keys:
  - tenant: acme-corp
    name: production-key
    key: ${ACME_API_KEY} # env var reference
  - tenant: acme-corp
    name: dev-key
    generate: true # auto-generated, printed at startup
routes:
  - id: gpt-route
    model: "gpt*"
    provider: openai
Notes on the bootstrap flow:
- The file performs idempotent first-startup seeding of tenants, API keys, and routes. Re-running with the same file is safe — entries are upserted, never duplicated.
- BootstrapLoader runs after the gateway has booted with a valid license envelope (DVARA_LICENSE_KEY is always required at startup; there is no "unlicensed mode").
- It supports ${ENV_VAR} and ${ENV_VAR:-default} substitution; resolution is performed by the gateway's YAML loader before parsing, so any Spring Environment source — env vars, JVM -D props, Kubernetes ConfigMaps — feeds back in.
- The legacy GATEWAY_BOOTSTRAP_FILE env var is accepted for one release with a deprecation WARN (removed in 1.1.0-GA). Switch to DVARA_BOOTSTRAP_FILE for new deployments.
- After first startup, ongoing config changes are made through the Flightdeck UI or /v1/admin/* REST API; subsequent boots skip rows whose id already exists in the database.
Provider Activation
A provider bean is only registered when its activation condition is met:
| Provider | Activation Condition |
|---|---|
| OpenAI | OPENAI_API_KEY is set and non-blank |
| Anthropic | ANTHROPIC_API_KEY is set and non-blank |
| Gemini | GEMINI_API_KEY is set and non-blank |
| Ollama | OLLAMA_ENABLED=true |
| Bedrock | BEDROCK_ENABLED=true |
| Azure OpenAI | AZURE_OPENAI_API_KEY + AZURE_OPENAI_BASE_URL both set |
| Mistral | MISTRAL_API_KEY is set and non-blank |
| Cohere | COHERE_API_KEY is set and non-blank |
| Groq | GROQ_API_KEY is set and non-blank |
| Qwen | QWEN_API_KEY is set and non-blank |
| DeepSeek | DEEPSEEK_API_KEY is set and non-blank |
| Moonshot | MOONSHOT_API_KEY is set and non-blank |
| ChatGLM | ZHIPU_API_KEY is set and non-blank |
| Grok | XAI_API_KEY is set and non-blank |
| Mock | MOCK_PROVIDER_ENABLED=true |
If no provider is configured for a requested model, the gateway returns HTTP 400 with error code no_provider.
Property Reference
dvara.llm-gateway.providers.*
See Provider Setup for per-provider configuration details.
dvara.llm-gateway.routes[]
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
id | string | yes | — | Unique route identifier |
model-pattern | string | yes | — | Glob pattern to match model names |
strategy | string | no | model-prefix | model-prefix, round-robin, weighted, latency-aware, cost-aware, canary, geo-aware, intelligent |
cost-tolerance-pct | int | no | 0 | Cost tolerance band (0–100) for cost-aware routing — providers within this % above the cheapest may be preferred by the latency tiebreak. Not read by other strategies. |
latency-sla-ms | long | no | 0 | Latency SLA in ms for cost-aware routing (0 = no SLA) |
model-tiers | map | no | — | Complexity→model mapping for intelligent routing (keys: SIMPLE, MODERATE, COMPLEX) |
providers[].provider | string | yes | — | Provider name |
providers[].weight | int | no | 1 | Weight for weighted strategy |
providers[].region | string | no | — | Region affinity for this provider entry |
pinned-model-version | string | no | — | Override model name sent to provider |
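As an illustration, a cost-aware route assembled from the properties above might look like the sketch below. The route id, the second provider name, and the numeric values are hypothetical, and the nesting under dvara.llm-gateway.routes is an assumption based on the namespace overview at the top of this page.

```yaml
# Illustrative sketch only: ids, provider names, and values are assumptions.
dvara:
  llm-gateway:
    routes:
      - id: gpt-cost-aware            # hypothetical route id
        model-pattern: "gpt-4o*"      # glob matched against the requested model name
        strategy: cost-aware
        cost-tolerance-pct: 10        # providers within 10% above the cheapest stay eligible
        latency-sla-ms: 2000          # 0 would mean no SLA
        providers:
          - provider: openai
          - provider: azure-openai    # assumed provider name; confirm in Provider Setup
```

cost-tolerance-pct and latency-sla-ms are only read by the cost-aware strategy, so they can be dropped when a route uses any other strategy.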
dvara.llm-gateway.resilience.*
| Property | Type | Default | Description |
|---|---|---|---|
resilience.retry.max-attempts | int | 3 | Max retry attempts on failure |
resilience.retry.wait-duration-ms | int | 500 | Wait between retries (ms) |
resilience.circuit-breaker.failure-rate-threshold | int | 50 | Failure % to open circuit |
resilience.circuit-breaker.sliding-window-size | int | 20 | Calls in sliding window |
resilience.circuit-breaker.wait-duration-in-open-state-ms | int | 30000 | Time before half-open (ms) |
resilience.timeout.chat-timeout-ms | int | 30000 | Non-streaming request timeout |
resilience.timeout.streaming-timeout-ms | int | 120000 | Streaming request timeout |
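For orientation, the table above corresponds to the following application.yml fragment when every default is written out explicitly. The nesting is derived from the property names under the dvara.llm-gateway.resilience.* heading; treat it as a sketch rather than a copy of the shipped file.

```yaml
# Defaults from the table above, spelled out explicitly.
dvara:
  llm-gateway:
    resilience:
      retry:
        max-attempts: 3
        wait-duration-ms: 500
      circuit-breaker:
        failure-rate-threshold: 50             # percent of failed calls that opens the circuit
        sliding-window-size: 20                # calls considered in the window
        wait-duration-in-open-state-ms: 30000  # time before half-open
      timeout:
        chat-timeout-ms: 30000
        streaming-timeout-ms: 120000
```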
dvara.llm-gateway.rate-limit.*
| Property | Type | Default | Description |
|---|---|---|---|
rate-limit.enabled | boolean | false | Enable rate limiting |
rate-limit.per-key.requests-per-minute | int | 100 | Max requests per API key per 60-second window |
rate-limit.per-key.tokens-per-minute | int | 100000 | Max tokens per API key per 60-second window |
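A minimal sketch of turning rate limiting on with tighter per-key limits than the defaults. The specific numbers are illustrative assumptions, not recommendations.

```yaml
# Illustrative values; only the property names come from the table above.
dvara:
  llm-gateway:
    rate-limit:
      enabled: true
      per-key:
        requests-per-minute: 60     # default is 100
        tokens-per-minute: 50000    # default is 100000
```

Following the relaxed-binding rule described at the top of this page, the same switch should also be settable as DVARA_LLM_GATEWAY_RATE_LIMIT_ENABLED=true.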
dvara.llm-gateway.cache.*
| Property | Type | Default | Description |
|---|---|---|---|
cache.enabled | boolean | false | Enable response caching |
cache.ttl-seconds | int | 3600 | Cache entry time-to-live |
cache.max-size | int | 10000 | Max entries (Caffeine only) |
dvara.llm-gateway.pii.* (Enterprise — PII Detection)
| Property | Type | Default | Description |
|---|---|---|---|
pii.enabled | boolean | true | Enable PII detection and enforcement |
pii.provider | string | regex | Detection provider: regex or presidio |
pii.default-action | string | LOG | Default action: LOG, BLOCK, REDACT |
pii.scan-responses | boolean | true | Scan LLM responses for PII output leaks |
pii.strip-before-cache | boolean | true | Always redact PII before writing to cache |
pii.token-encryption-password | string | — | AES-256-GCM password for PII token encryption |
pii.max-tokens-per-tenant | int | 50000 | Maximum PII tokens stored per tenant |
pii.token-retention-days | int | 30 | Days to retain PII tokens before expiry |
pii.presidio.endpoint | string | — | Presidio analyzer endpoint URL |
pii.presidio.language | string | en | Presidio analysis language |
pii.presidio.score-threshold | double | 0.5 | Minimum Presidio confidence score |
pii.presidio.timeout-seconds | int | 5 | Presidio HTTP timeout |
pii.presidio.cache-max-size | int | 1000 | LRU cache max entries (0 = disabled) |
pii.presidio.cache-ttl-seconds | int | 300 | Cache entry TTL |
pii.scan-streaming-responses | boolean | true | Scan SSE streaming responses for PII. Buffers a rolling window across chunk boundaries so spans split across two SSE events are still caught. |
pii.streaming-scan-window-size | int | 256 | Characters buffered before a PII scan triggers during streaming. |
pii.streaming-overlap-margin | int | 64 | Characters retained between scan windows so PII spanning chunk boundaries is not missed. |
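The following sketch shows how the PII properties above might be combined to use Presidio with redaction as the default action. The endpoint host and the PII_TOKEN_PASSWORD variable name are hypothetical, and whether the endpoint expects a base URL or a full analyzer path is not stated here, so verify that against the PII Detection page.

```yaml
# Illustrative sketch: endpoint and env var name are assumptions.
dvara:
  llm-gateway:
    pii:
      enabled: true
      provider: presidio
      default-action: REDACT
      scan-responses: true
      token-encryption-password: ${PII_TOKEN_PASSWORD}   # hypothetical env var
      presidio:
        endpoint: http://presidio-analyzer:3000          # hypothetical host and port
        language: en
        score-threshold: 0.5
        timeout-seconds: 5
      scan-streaming-responses: true
      streaming-scan-window-size: 256
      streaming-overlap-margin: 64
```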
Per-tenant PII configuration, along with the per-tenant cost keys, is set via tenant metadata keys:
| Metadata Key | Values | Description |
|---|---|---|
pii.enabled | true / false | Override PII detection for this tenant |
pii.action | BLOCK / REDACT / LOG | Override default PII action for this tenant |
pii.scan-responses | true / false | Override response scanning for this tenant |
pii.scan-streaming-responses | true / false | Override streaming-response scanning for this tenant |
pii.custom-patterns | JSON string | Custom regex patterns ({"label": "regex", ...}) |
cost.downgrade-threshold-pct | int (0–100) | Budget utilization % at which model downgrade triggers (default: 80) |
cost.downgrade-rules | string | Comma-separated model downgrade rules (format: "from:to,from:to", e.g. "gpt-4o:gpt-4o-mini,claude-3-opus:claude-3-sonnet") |
cost.anomaly-threshold-pct | int | Per-tenant anomaly detection threshold (default: 200 = 2x baseline) |
See PII Detection for full configuration details.
dvara.llm-gateway.guardrail.* (Enterprise — Guardrails & Safety)
| Property | Type | Default | Description |
|---|---|---|---|
guardrail.enabled | boolean | true | Enable guardrail detection and enforcement |
guardrail.default-action | string | LOG | Default action: LOG, BLOCK, FLAG |
guardrail.scan-responses | boolean | true | Scan LLM responses for violations |
guardrail.risk-score-threshold | double | 0.7 | Detections below this score are ignored |
guardrail.max-input-tokens | int | 32000 | Max estimated input tokens (OWASP LLM10) |
guardrail.max-messages-per-request | int | 100 | Max messages per request (OWASP LLM10) |
guardrail.max-message-length | int | 50000 | Max character length per message (OWASP LLM10) |
guardrail.default-max-response-tokens | int | 4096 | Applied when client doesn't specify max_tokens |
guardrail.scan-streaming-responses | boolean | true | Enable guardrail scanning on streaming responses |
guardrail.streaming-scan-window-size | int | 256 | Chars buffered before scan trigger during streaming |
guardrail.streaming-overlap-margin | int | 64 | Chars retained between windows for boundary detection |
guardrail.ml-classifier.enabled | boolean | false | Enable ML classifier for injection detection |
guardrail.ml-classifier.provider | string | generic | Provider: generic, lakera, shield-gemini |
guardrail.ml-classifier.endpoint | string | — | URL (auto-defaulted per provider if blank) |
guardrail.ml-classifier.api-key | string | — | API key for the ML provider |
guardrail.ml-classifier.project-id | string | — | Google Cloud project ID (shield-gemini only) |
guardrail.ml-classifier.location | string | us-central1 | Google Cloud region (shield-gemini only) |
guardrail.ml-classifier.confidence-threshold | double | 0.8 | Detections below this are ignored |
guardrail.ml-classifier.timeout-seconds | int | 5 | HTTP call timeout |
guardrail.ml-classifier.cache-max-size | int | 1000 | LRU cache max entries. 0 = disabled |
guardrail.ml-classifier.cache-ttl-seconds | int | 300 | Cache entry TTL |
guardrail.plugins.enabled | boolean | false | Enable external guardrail plugin system |
guardrail.plugins.definitions[].name | string | — | Unique plugin identifier |
guardrail.plugins.definitions[].url | string | — | HTTP endpoint URL |
guardrail.plugins.definitions[].secret | string | — | HMAC-SHA256 signing secret |
guardrail.plugins.definitions[].timeout-ms | int | 5000 | HTTP call timeout |
guardrail.plugins.definitions[].fail-mode | string | OPEN | OPEN or CLOSED |
guardrail.grounding.enabled | boolean | false | Enable hallucination/grounding detection |
guardrail.grounding.similarity-threshold | double | 0.7 | Cosine similarity threshold for claims |
guardrail.grounding.action | string | LOG | Action: LOG, FLAG, BLOCK |
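As a hedged example, the guardrail properties above could be combined as follows to register one external plugin. The plugin name, URL, and secret variable are hypothetical placeholders.

```yaml
# Illustrative sketch: plugin name, URL, and env var are assumptions.
dvara:
  llm-gateway:
    guardrail:
      enabled: true
      default-action: LOG
      risk-score-threshold: 0.7
      plugins:
        enabled: true
        definitions:
          - name: toxicity-check                       # hypothetical plugin identifier
            url: https://guardrails.example.com/scan   # hypothetical endpoint
            secret: ${GUARDRAIL_PLUGIN_SECRET}         # hypothetical env var for the HMAC-SHA256 secret
            timeout-ms: 5000
            fail-mode: OPEN                            # OPEN or CLOSED, per the table
```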
Per-tenant guardrail configuration is set via tenant metadata keys:
| Metadata Key | Values | Description |
|---|---|---|
guardrail.enabled | true / false | Override guardrail detection for this tenant |
guardrail.action | BLOCK / FLAG / LOG | Override default action for this tenant |
guardrail.risk-score-threshold | double (0.0–1.0) | Override risk score threshold |
guardrail.max-input-tokens | int | Override max estimated input tokens (OWASP LLM10) |
guardrail.max-messages-per-request | int | Override max messages per request (OWASP LLM10) |
guardrail.max-message-length | int | Override max message character length (OWASP LLM10) |
guardrail.default-max-response-tokens | int | Override default response token cap, applied when the client omits max_tokens |
guardrail.content.profanity.action | BLOCK / FLAG / LOG | Per-category action override |
guardrail.content.violence.action | BLOCK / FLAG / LOG | Per-category action override |
guardrail.content.sexual.action | BLOCK / FLAG / LOG | Per-category action override |
guardrail.content.competitor.keywords | comma-separated | Competitor brand keywords |
guardrail.content.topic-restrictions | comma-separated | Restricted topic keywords |
guardrail.content.custom-denylist | JSON string | Custom deny-list patterns |
guardrail.injection.custom-patterns | JSON string | Custom injection patterns |
guardrail.mcp-injection.enabled | true / false | Enable MCP injection scanning |
guardrail.mcp-injection.action | BLOCK / FLAG / SANITIZE | MCP injection action |
guardrail.context.warning-threshold-pct | int (0–100) | Context window warning threshold (default: 70) |
guardrail.context.hard-threshold-pct | int (0–100) | Context window hard threshold (default: 90) |
guardrail.context.pruning-strategy | NONE / TRUNCATE_OLDEST / TRUNCATE_MIDDLE | Context pruning strategy |
guardrail.scan-streaming-responses | true / false | Override streaming-response scanning for this tenant |
guardrail.plugins | JSON map | Per-plugin config: {"pluginName": {"enabled": false}} |
See Guardrails & Safety for full configuration details.
dvara.llm-gateway.guardrail.phileas.* (Enterprise — Embedded PII Scanner)
In-process PII detection via the embedded Phileas library: no external service, no network calls, no vendor API keys, and sub-millisecond per-call latency. It pairs with the always-on RegexPiiDetector; for production credit-card detection, rely on the regex path's Luhn validation, since Phileas matching is best-effort.
| Property | Type | Default | Description |
|---|---|---|---|
guardrail.phileas.enabled | boolean | false | Enable the Phileas PII scanner. The default policy enables 17 filter types: SSN, credit card, phone, email, IP, passport, drivers license, IBAN, MAC, URL, ZIP, bank routing number, VIN, bitcoin address, tracking number, currency, age. |
Restrict the active filters per tenant via Tenant.metadata["phileas.enabled-filters"] (comma-separated FilterType names). The key is editable from the DVARA Console tenant form (Phileas Filters tab) or via the Automation API (PUT /v1/admin/tenants/{id}).
dvara.llm-gateway.guardrail.grounding.* (Enterprise — Hallucination / Grounding Detection)
Embedding-based grounding detection. When source documents are provided on the request via request.metadata["grounding.sources"] (List<String>), the detector splits the response into sentence claims, embeds each claim and source via EmbeddingService, and flags claims with max cosine similarity below the threshold.
| Property | Type | Default | Description |
|---|---|---|---|
guardrail.grounding.enabled | boolean | false | Enable embedding-based grounding detection |
guardrail.grounding.similarity-threshold | double | 0.7 | Cosine similarity threshold — claims below this are flagged ungrounded |
guardrail.grounding.action | string | LOG | LOG, FLAG, or BLOCK when a hallucination is detected |
guardrail.grounding.max-sources | int | 50 | Maximum source documents per request; excess silently dropped |
guardrail.grounding.max-source-length | int | 10000 | Maximum chars per source document; oversized sources silently dropped |
Per-tenant overrides are set via Tenant.metadata: grounding.enabled, grounding.action, grounding.max-sources, grounding.max-source-length. Streaming requests run the same detector at stream-end against the accumulated response text.
dvara.llm-gateway.finops.* (Enterprise — Cost Calculation)
| Property | Type | Default | Description |
|---|---|---|---|
finops.enabled | boolean | true | Enable enterprise cost calculation |
finops.pricing-cache-ttl-seconds | int | 60 | TTL for pricing lookup cache (0 = disabled) |
finops.spend-cache-ttl-seconds | int | 30 | TTL for budget spend cache (avoids per-request DB queries) |
finops.soft-alert-cooldown-minutes | int | 60 | Per-budget-cap dedup window for soft-limit alerts. The first soft breach on a cap writes BUDGET_CAP_SOFT (which fans out to webhook subscribers and the audit log); subsequent soft breaches on the same cap within this window are skipped, with no audit event and no webhook delivery. The Prometheus counter gateway_budget_soft_alert_total is incremented on every breach regardless. Operational caveat: a tenant whose spend straddles the soft threshold over many requests will not generate a per-call alert, so operators wiring on-call paging that expects per-call notifications should lower this to 1 (the value is in whole minutes, so 1 is the smallest non-zero window) or 0. The cooldown is per budget id, not per tenant, so a tenant with three caps (global / tenant / API-key) can still see up to three soft alerts within the window. Model downgrade (ModelDowngradeFilter) is not subject to the cooldown: it runs on every request whose BudgetCheckResult.softLimitBreached() is true, so downgrade continues to apply silently between alert fires. |
finops.downgrade-cache-ttl-seconds | int | 5 | TTL for per-tenant downgrade policy cache (0 = disabled) |
finops.chargeback-schedule | string | "" | Cron expression for monthly chargeback report auto-generation (empty = disabled) |
finops.anomaly-threshold-pct | int | 200 | Default anomaly threshold: current daily rate vs 30d baseline (200 = 2x) |
finops.budget-warning-threshold-pct | int | 75 | Default utilization % threshold at which WARN_AGENT policy rules fire. When a tenant's spend on a budget cap crosses this fraction of the cap's limit, any policy with a WARN_AGENT action sees the warning state and can inject a soft-cap notice into the response without blocking the call. Per-budget override via the budget cap's softLimitPct field; per-tenant override via the policy YAML's threshold expression. |
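A sketch of the FinOps block with a shorter alert cooldown and monthly chargeback generation enabled. The cron value is an assumption: it uses the six-field Spring cron format that the audit archive schedule on this page also uses, and the exact format accepted by chargeback-schedule should be confirmed before relying on it.

```yaml
# Illustrative sketch: cooldown and cron values are assumptions.
dvara:
  llm-gateway:
    finops:
      enabled: true
      pricing-cache-ttl-seconds: 60
      spend-cache-ttl-seconds: 30
      soft-alert-cooldown-minutes: 15       # default is 60
      downgrade-cache-ttl-seconds: 5
      chargeback-schedule: "0 0 3 1 * ?"    # assumed Spring cron: 03:00 on the 1st of each month
      anomaly-threshold-pct: 200            # 200 = 2x the 30-day baseline
      budget-warning-threshold-pct: 75
```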
Cost calculation uses ModelPricing entries managed via the /v1/admin/pricing API. Pricing supports glob patterns (e.g. gpt-4o* matches gpt-4o, gpt-4o-mini). Costs are calculated per-request using BigDecimal precision and persisted as CostRecord entries, queryable via /v1/admin/costs.
Budget caps enforce soft and hard spending limits at three scopes — global, tenant, and API-key. The data plane evaluates all applicable caps for each request in most-specific-first order (API-key → tenant → global) and short-circuits on the first hard or soft breach; for non-breach cases the tightest-remaining-percent cap is surfaced via BudgetCheckResult.allowedWithBudget and rendered into response headers. Configure budgets via /v1/admin/budgets. Budget periods (DAILY, WEEKLY, MONTHLY) reset automatically at UTC boundaries.
Hard breach → BudgetEnforcementFilter rejects the request with HTTP 402 Payment Required and writes a BUDGET_CAP_HARD audit event. No retries, no downgrade — the request never reaches the provider.
Soft breach → the request is allowed through with the breach state stamped onto the FilterContext. Two side-effects fire from the soft state:
- A BUDGET_CAP_SOFT audit event is written (subject to the soft-alert-cooldown-minutes dedup window above). Webhook subscribers with the BUDGET_CAP_SOFT event type get the payload via the audit→webhook bridge.
- ModelDowngradeFilter reads the soft-breach signal and, if Tenant.metadata["cost.downgrade-rules"] defines a rule for the requested model, swaps the model in-flight (e.g. gpt-4o → gpt-4o-mini) before dispatch. Downgrade fires on every soft-breach request; it is not subject to the soft-alert cooldown, so spend continues to bend toward the cheaper variant even when alerts are paused.
The downgrade threshold (default 80% of the cap) and the downgrade rules are configured per tenant via the cost.downgrade-threshold-pct and cost.downgrade-rules metadata keys.
Chargeback reports aggregate costs by tenant, API key, model, provider, and time period. Reports are exportable as CSV and PDF. Monthly auto-generation is available via chargeback-schedule cron. Cost forecasting uses trailing 7-day and 30-day spend trends with linear projection. Cost anomaly detection compares current daily spend rate against the 30-day baseline; alerts fire when the deviation exceeds the configured threshold (per-tenant override via cost.anomaly-threshold-pct metadata key).
Enterprise License
A valid DVARA license key is required for all apps to start. Set DVARA_LICENSE_KEY as an environment variable.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
dvara.license.key | DVARA_LICENSE_KEY | — | Signed DVARA license key (trial or production) |
dvara.license-monitor.interval-ms | DVARA_LICENSE_MONITOR_INTERVAL_MS | 3600000 (1 hour) | Interval for runtime license re-validation. Each sweep re-checks the envelope's signature and expiry, updates LicenseStatusHolder, and writes a LICENSE_* audit event on status transitions. |
Runtime license lifecycle:
- Within 30 days of expiry (EXPIRING_SOON): X-License-Warning response header added, LICENSE_EXPIRY_WARNING audit event
- Expired, within the 14-day grace (GRACE_PERIOD): X-License-Warning header, LICENSE_EXPIRED audit event, service fully operational
- Expired, beyond the 14-day grace (DEGRADED): data-plane /v1/* returns 402 Payment Required, admin/internal endpoints remain accessible, LICENSE_DEGRADED audit event
dvara.flightdeck.license-alert.* (License-Expiry Email Alerts)
Operator-friendly email notifications on license status transitions. The alerting itself is driven by Spring events published by LicenseExpiryMonitor in enterprise-core — no schedule config needed. Off by default so existing deploys don't start sending alerts on a license-key rotation just because the rc23+ images landed.
| Property | Default | Description |
|---|---|---|
dvara.flightdeck.license-alert.enabled | false | Master switch. Opt-in. |
dvara.flightdeck.license-alert.recipients | [] | Email addresses to notify on LICENSE_EXPIRY_WARNING / LICENSE_EXPIRED / LICENSE_DEGRADED transitions. Comma-separated in env vars, list in YAML. At least one address is required when enabled=true — misconfigured state logs a WARN at boot but doesn't refuse to start (alerting is non-critical to data-plane uptime). |
dvara.flightdeck.license-alert.subject-prefix | [DVARA license alert] | Override for the email subject line prefix. Useful when a single ops mailbox receives alerts from multiple DVARA installs (e.g. [DVARA license alert — prod], [DVARA license alert — staging]). |
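A minimal sketch of opting in to license alerts for a production install. The email addresses and the prefix text are placeholders.

```yaml
# Illustrative sketch: addresses and prefix are placeholders.
dvara:
  flightdeck:
    license-alert:
      enabled: true
      recipients:                    # list form in YAML
        - ops@example.com
        - platform-oncall@example.com
      subject-prefix: "[DVARA license alert - prod]"   # quoted because it starts with '['
```

When set through an environment variable instead, the recipients are written as a single comma-separated value, as noted in the table.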
Enterprise Latency-Aware Routing Configuration
Tunes the latency-aware routing strategy (see Routing → Latency-aware) — EWMA per-(provider, model) latency tracking with a configurable freshness window and decay penalty for stale samples. Requires a valid DVARA license key.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
dvara.llm-gateway.routing.latency.alpha | — | 0.2 | EWMA smoothing factor (0.0–1.0). Higher values give more weight to recent samples |
dvara.llm-gateway.routing.latency.decay-threshold-ms | — | 60000 | Staleness threshold in milliseconds. Entries older than this receive a decay penalty |
dvara.llm-gateway.routing.latency.decay-multiplier | — | 0.5 | Stale EWMA is divided by this value (lower = harsher penalty) |
dvara.llm-gateway.routing.latency.min-samples | — | 5 | Minimum latency samples before EWMA is used for routing decisions |
dvara.llm-gateway.routing.latency.snapshot-interval | — | 100 | Persist latency snapshot to repository every N samples |
Enterprise Priority Routing Configuration
Requires a valid DVARA license key.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
dvara.llm-gateway.routing.priority.enabled | — | false | Enable concurrency-based priority admission control |
dvara.llm-gateway.routing.priority.max-concurrent-requests | — | 1000 | Maximum concurrent requests across all tiers |
dvara.llm-gateway.routing.priority.tiers.premium.throttle-threshold-pct | — | 100 | Load % at which premium requests are throttled |
dvara.llm-gateway.routing.priority.tiers.standard.throttle-threshold-pct | — | 80 | Load % at which standard requests are throttled |
dvara.llm-gateway.routing.priority.tiers.bulk.throttle-threshold-pct | — | 50 | Load % at which bulk requests are throttled |
dvara.llm-gateway.routing.priority.resolver-cache-ttl-seconds | — | 5 | TTL for tenant → priority tier cache |
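The tier thresholds above nest as shown in this sketch, which enables priority admission with the documented defaults. The comments assume that load is measured as in-flight requests relative to max-concurrent-requests, which the table implies but does not state outright.

```yaml
# Defaults from the table above, with priority admission switched on.
dvara:
  llm-gateway:
    routing:
      priority:
        enabled: true
        max-concurrent-requests: 1000
        tiers:
          premium:
            throttle-threshold-pct: 100   # assumed: throttled only at full load
          standard:
            throttle-threshold-pct: 80    # assumed: throttled from about 800 in-flight requests
          bulk:
            throttle-threshold-pct: 50    # assumed: throttled from about 500 in-flight requests
        resolver-cache-ttl-seconds: 5
```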
Enterprise Webhook Configuration
Requires a valid DVARA license key.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
dvara.llm-gateway.webhooks.enabled | DVARA_LLM_GATEWAY_WEBHOOKS_ENABLED | true | Enable webhook delivery of governance events |
dvara.llm-gateway.webhooks.max-retries | DVARA_LLM_GATEWAY_WEBHOOKS_MAX_RETRIES | 3 | Maximum retry attempts for failed deliveries |
dvara.llm-gateway.webhooks.base-retry-delay-ms | DVARA_LLM_GATEWAY_WEBHOOKS_BASE_RETRY_DELAY_MS | 1000 | Base delay in milliseconds for exponential backoff |
dvara.llm-gateway.webhooks.retry-multiplier | DVARA_LLM_GATEWAY_WEBHOOKS_RETRY_MULTIPLIER | 4 | Multiplier for exponential backoff (delay = base x multiplier^attempt) |
dvara.llm-gateway.webhooks.delivery-timeout-ms | DVARA_LLM_GATEWAY_WEBHOOKS_DELIVERY_TIMEOUT_MS | 5000 | HTTP connect+read timeout for webhook delivery |
dvara.llm-gateway.webhooks.approval-base-url | DVARA_LLM_GATEWAY_WEBHOOKS_APPROVAL_BASE_URL | "" (empty) | Base URL for approve/deny links in MCP approval webhooks |
dvara.llm-gateway.webhooks.approval-ttl-minutes | DVARA_LLM_GATEWAY_WEBHOOKS_APPROVAL_TTL | 15 | TTL in minutes for approval token validity |
dvara.llm-gateway.webhooks.max-delivery-log-entries | DVARA_LLM_GATEWAY_WEBHOOKS_MAX_DELIVERY_LOG_ENTRIES | 50000 | Capacity of the delivery log (rolling). Older entries are dropped when capacity is reached. |
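To make the backoff formula concrete, the sketch below writes out the defaults and works through delay = base x multiplier^attempt in comments. Whether the attempt index starts at 0 or 1 is not stated above, so both readings are shown. The approval-base-url value is a hypothetical placeholder.

```yaml
# Defaults from the table above; approval-base-url is a placeholder.
dvara:
  llm-gateway:
    webhooks:
      enabled: true
      max-retries: 3
      base-retry-delay-ms: 1000
      retry-multiplier: 4
      # delay = base x multiplier^attempt
      #   if attempt counts from 0: 1000 ms, 4000 ms, 16000 ms
      #   if attempt counts from 1: 4000 ms, 16000 ms, 64000 ms
      delivery-timeout-ms: 5000
      approval-base-url: https://flightdeck.example.com   # hypothetical
      approval-ttl-minutes: 15
      max-delivery-log-entries: 50000
```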
dvara.persistence.* (Enterprise — Database Persistence)
Requires a valid DVARA license key.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
spring.datasource.url | SPRING_DATASOURCE_URL | — | JDBC connection URL (e.g. jdbc:postgresql://localhost:5432/dvara). When set with a valid license, PostgreSQL persistence activates automatically. |
dvara.persistence.batch-size | DVARA_PERSISTENCE_BATCH_SIZE | 100 | Batch size for bulk write operations |
spring.datasource.username | SPRING_DATASOURCE_USERNAME | — | Database username |
spring.datasource.password | SPRING_DATASOURCE_PASSWORD | — | Database password |
spring.flyway.enabled | SPRING_FLYWAY_ENABLED | true | Auto-run database migrations on startup |
spring.datasource.hikari.maximum-pool-size | DVARA_DB_POOL_SIZE | 2 | JDBC connection pool size. Data plane (gateway-server, mcp-proxy-server) defaults to 2 (one connection reserved for config reads via the PG NOTIFY listener, one for audit writes) — the data plane is read-mostly and persists asynchronously, so a small pool is correct. Flightdeck (admin server, flightdeck) should be sized to its real concurrency — set DVARA_DB_POOL_SIZE=10 or higher in production for the UI's read-heavy dashboard endpoints + the Admin API. Do not raise the data-plane pool unless you have evidence of contention; raising it just to "be safe" wastes connections that are scarce behind PgBouncer / DigitalOcean Managed Connection Pool. |
```yaml
# Enterprise persistence (PostgreSQL-backed state)
dvara:
  persistence:
    batch-size: 100   # default: 100
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/dvara
    username: dvara
    password: ${DB_PASSWORD}
  flyway:
    enabled: true     # auto-runs migrations on startup
```
Distributed Caching and Rate Limiting
DVARA ships with an embedded distributed cache for API key lookups and rate-limit counters. No external cache infrastructure is required — the cache runs in-process and auto-clusters across pods.
- Local / Docker Compose: pods discover each other via multicast (works out of the box).
- Kubernetes: pods discover each other via DNS lookup against a headless Service (not a regular ClusterIP Service; the Hazelcast Kubernetes discovery SPI needs EndpointSlice records returned by the Service). Create the headless Service with clusterIP: None and a label selector matching your data-plane pods, then set the CACHE_SERVICE_NAME env var on every pod to the Service name. The Service must live in the same namespace as the pods; cross-namespace discovery requires CACHE_SERVICE_NAMESPACE as well. Without CACHE_SERVICE_NAME, pods boot in single-node mode and rate-limit / cache state is per-pod, not fleet-wide, which silently breaks horizontal correctness. The Hazelcast autoconfig logs a clear WARN at startup when this happens.
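The Kubernetes setup described above can be sketched with a manifest like the one below. The Service name, namespace, and pod label are hypothetical, and the port assumes DVARA keeps Hazelcast's default member port of 5701.

```yaml
# Illustrative headless Service: name, namespace, and label are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: dvara-cache            # hypothetical; use this value for CACHE_SERVICE_NAME
  namespace: dvara             # hypothetical; should match the pods' namespace
spec:
  clusterIP: None              # headless: DNS returns the pod addresses directly
  selector:
    app: dvara-llm-gateway     # hypothetical label on the data-plane pods
  ports:
    - name: hazelcast
      port: 5701               # Hazelcast's default member port (assumed unchanged)
```

Each data-plane pod would then carry CACHE_SERVICE_NAME=dvara-cache in its environment, plus CACHE_SERVICE_NAMESPACE if the Service lives in a different namespace.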
The distributed cache currently wraps the API key repository with a 30-second TTL for sub-millisecond auth lookups across the fleet. Write-through eviction ensures revoked keys are never served from cache.
Rate limiting uses the same in-process distributed maps for per-key request and token counts shared across all pods. See Rate Limiting for configuration.
MCP Proxy Configuration
Configuration for the standalone DVARA MCP Proxy (port 8070). Requires an enterprise license.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
dvara.mcp-gateway.timeout-default | — | 30 | Default timeout in seconds for upstream MCP server requests |
dvara.mcp-gateway.timeout-max | — | 120 | Maximum allowed timeout in seconds (caps any per-server override) |
dvara.mcp-gateway.registry-cache-ttl-seconds | — | 30 | TTL for the MCP server registry cache |
dvara.mcp-gateway.pii.* (MCP Proxy PII Detection)
The MCP Proxy runs its own PII scanner independent of the LLM gateway's dvara.llm-gateway.pii.* settings — MCP tool inputs and outputs are not LLM completions and need their own enable/action knobs.
| Property | Default | Description |
|---|---|---|
dvara.mcp-gateway.pii.enabled | true | Enable PII scanning on MCP tool arguments and responses. |
dvara.mcp-gateway.pii.default-action | LOG | Default PiiAction on detection: LOG, BLOCK, or REDACT. |
dvara.mcp-gateway.pii.scan-responses | true | Scan MCP tool responses (in addition to request arguments) for PII output leaks. |
Per-tenant overrides are managed identically to the LLM-gateway side via tenant metadata.
management.* (Observability)
| Property | Type | Default | Description |
|---|---|---|---|
management.endpoints.web.exposure.include | string | health,prometheus,gateway-status,info | Actuator endpoints to expose. The probe paths (/actuator/health, /actuator/health/liveness, /actuator/health/readiness, /actuator/info) stay anonymous. /actuator/gateway-status requires DVARA_ACTUATOR_API_KEY; /actuator/prometheus requires DVARA_ACTUATOR_METRICS_API_KEY. |
management.endpoints.web.exposure.exclude | string | env,heapdump,threaddump,beans,mappings,configprops,loggers,scheduledtasks,caches,sessions,quartz | Dangerous endpoints — return 404 regardless of authentication. Do not remove from this list without a security review. |
management.endpoint.health.show-details | string | when-authorized | Health detail visibility. Setting this to always is rejected at boot — the readiness probe stays DOWN and a rolling deploy halts on the first replica. Use when-authorized (default) or never. |
management.prometheus.metrics.export.enabled | boolean | true | Enable Prometheus scrape endpoint |
Logging
| Configuration | Value | Description |
|---|---|---|
| Default format | Structured JSON | Every log line is valid JSON |
| Plain-text mode | spring.profiles.active=log-plain | Human-readable output for local dev |
| Log config file | logback-spring.xml | Logging configuration, supports Spring profile overrides |
dvara.flightdeck.security.* (Enterprise — OIDC/JWT Authentication)
Requires a valid DVARA license key.
| Property | Default | Description |
|---|---|---|
dvara.flightdeck.security.enabled | true | Enable authentication for admin endpoints. When true with no OIDC issuer-uri or SAML metadata-url, built-in email/password auth activates. Set to false for local development only. |
dvara.flightdeck.security.oidc.issuer-uri | — | OIDC issuer URL for JWT validation (e.g. Keycloak, Auth0) |
dvara.flightdeck.security.oidc.audience | "" | Expected JWT audience claim (blank = skip audience validation) |
dvara.flightdeck.security.oidc.role-claim | roles | JWT claim containing roles (supports dot-notation, e.g. realm_access.roles) |
dvara.flightdeck.security.oidc.name-claim | name | JWT claim for user display name |
dvara.flightdeck.security.oidc.tenant-claim | tenant_id | JWT claim for tenant association |
dvara.flightdeck.security.rbac.enabled | true | Enable URL-pattern RBAC enforcement (false = authenticated() only) |
dvara.flightdeck.security.session.timeout-seconds | 3600 | Session timeout in seconds |
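A sketch of wiring Flightdeck to a Keycloak realm using the properties above. The issuer URL and audience are hypothetical; the role-claim value reuses the dot-notation example from the table.

```yaml
# Illustrative sketch: issuer URL and audience are assumptions.
dvara:
  flightdeck:
    security:
      enabled: true
      oidc:
        issuer-uri: https://keycloak.example.com/realms/dvara   # hypothetical issuer
        audience: dvara-flightdeck                              # hypothetical; blank skips audience validation
        role-claim: realm_access.roles                          # dot-notation into a nested claim
        name-claim: name
        tenant-claim: tenant_id
      rbac:
        enabled: true
      session:
        timeout-seconds: 3600
```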
dvara.vault.* (Enterprise — Secrets Management)
Requires a valid DVARA license key.
| Property | Default | Description |
|---|---|---|
dvara.vault.backend | "" (disabled) | hashicorp, aws-secrets-manager, or azure-key-vault |
dvara.vault.cache-ttl-seconds | 300 | Secret cache TTL |
dvara.vault.hashicorp.address | — | HashiCorp Vault address |
dvara.vault.hashicorp.token | — | Static Vault token |
dvara.vault.hashicorp.auth-method | token | token or approle |
dvara.vault.hashicorp.secret-path | secret/data/meridian | KV v2 secret path |
dvara.vault.aws.region | us-east-1 | AWS region |
dvara.vault.aws.secret-name | meridian/provider-credentials | AWS Secrets Manager secret name |
dvara.vault.azure.vault-url | — | Azure Key Vault URL |
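For example, selecting the HashiCorp backend with static token auth might look like the sketch below. The Vault address and the VAULT_TOKEN variable name are assumptions, and the secret path is simply the documented default.

```yaml
# Illustrative sketch: address and env var name are assumptions.
dvara:
  vault:
    backend: hashicorp
    cache-ttl-seconds: 300
    hashicorp:
      address: https://vault.example.com:8200   # hypothetical Vault address
      auth-method: token                        # token or approle
      token: ${VAULT_TOKEN}                     # hypothetical env var holding the static token
      secret-path: secret/data/meridian         # documented default (KV v2)
```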
dvara.llm-gateway.tls.* (Enterprise — mTLS Per Provider)
Requires a valid DVARA license key.
| Property | Default | Description |
|---|---|---|
dvara.llm-gateway.tls.enforce-tls13 | true | Enforce TLS 1.3 on all outbound provider connections |
dvara.llm-gateway.tls.providers.<name>.mtls-enabled | false | Enable mTLS client cert for this provider |
dvara.llm-gateway.tls.providers.<name>.client-cert-path | — | Path to client certificate (PEM or PKCS12) |
dvara.llm-gateway.tls.providers.<name>.client-key-path | — | Path to client private key (PEM) |
dvara.llm-gateway.tls.providers.<name>.trust-store-path | — | Path to trust store |
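A sketch of enabling mTLS for a single provider, with openai standing in for the name placeholder. The file paths are hypothetical.

```yaml
# Illustrative sketch: certificate paths are placeholders.
dvara:
  llm-gateway:
    tls:
      enforce-tls13: true
      providers:
        openai:                                              # replaces <name> in the table
          mtls-enabled: true
          client-cert-path: /etc/dvara/tls/openai-client.pem   # hypothetical path
          client-key-path: /etc/dvara/tls/openai-client.key    # hypothetical path
          trust-store-path: /etc/dvara/tls/truststore.p12      # hypothetical path
```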
dvara.llm-gateway.ip-access.* (Enterprise — IP Access Control)
Requires a valid DVARA license key.
| Property | Default | Description |
|---|---|---|
dvara.llm-gateway.ip-access.enabled | false | Enable IP allowlist/denylist |
dvara.llm-gateway.ip-access.scope | all | all or data-plane (only /v1/*) |
dvara.llm-gateway.ip-access.global-allowlist | [] | CIDR ranges to allow (e.g. 10.0.0.0/8) |
dvara.llm-gateway.ip-access.global-denylist | [] | CIDR ranges to deny |
Per-tenant lists are set through the ip-access.allowlist and ip-access.denylist keys in tenant metadata, each a comma-separated list of CIDRs.
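As a minimal sketch, an application.yml fragment that filters only the data plane might look like the following. It assumes the list properties bind from ordinary YAML sequences; the denylist range is a hypothetical placeholder taken from a documentation-reserved block.

```yaml
dvara:
  llm-gateway:
    ip-access:
      enabled: true
      scope: data-plane          # only /v1/* is filtered
      global-allowlist:
        - 10.0.0.0/8             # example range from the table above
      global-denylist:
        - 203.0.113.0/24         # hypothetical placeholder range
```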
dvara.audit.* (Enterprise — Audit Trail)
Requires a valid DVARA license key.
| Property | Default | Description |
|---|---|---|
dvara.audit.hmac-secret | default-dev-secret-change-in-production | HMAC-SHA256 key for signing audit events |
dvara.audit.max-events | 100000 | Capacity of append-only audit store |
dvara.audit.store-prompts-by-default | false | Store prompt/response text in audit events |
dvara.audit.prompt-retention-days | 90 | Retention period for stored prompts |
dvara.flightdeck.audit-archive.* (Audit Archive Job)
Off-by-default scheduled job that archives older audit events out of PostgreSQL into object storage. Per-tenant overrides live in Tenant.metadata.
| Property | Default | Description |
|---|---|---|
dvara.flightdeck.audit-archive.enabled | false | Master switch. When false, the scheduled job is registered but no-ops. |
dvara.flightdeck.audit-archive.schedule | 0 0 2 * * ? | Spring cron — daily 02:00 UTC by default. |
dvara.flightdeck.audit-archive.retention-days | 180 | Events older than this in PostgreSQL are eligible for archive. Per-tenant override via Tenant.metadata["audit.archive.retention-days"]. |
dvara.flightdeck.audit-archive.grace-period-days | 7 | Extra days after a partition is VERIFIED in object storage before the corresponding PG rows are deleted. Safety window for restoring from PG if an object-storage issue surfaces post-upload. |
dvara.flightdeck.audit-archive.bucket | "" | Object-storage bucket name. Read from SPACES_BUCKET env var by the install scripts. |
dvara.flightdeck.audit-archive.endpoint | "" | S3-compatible endpoint (DigitalOcean Spaces, MinIO, AWS S3). Env: SPACES_ENDPOINT. |
dvara.flightdeck.audit-archive.region | "" | Object-storage region. Env: SPACES_REGION. |
dvara.flightdeck.audit-archive.access-key-id | "" | Object-storage access key. Env: SPACES_ACCESS_KEY_ID. |
dvara.flightdeck.audit-archive.secret-access-key | "" | Object-storage secret. Env: SPACES_SECRET_ACCESS_KEY. |
dvara.flightdeck.audit-archive.max-rows-per-archive | 1000000 | Safety cap on rows per single (tenant, date) partition. If a partition exceeds this, the job logs a warning and skips it — operator must split it manually before the next run. Raise if a tenant generates more than 1M audit events/day. |
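A sketch of turning the archive job on, assuming the SPACES_* variables named in the table are present in the environment. Whether the shipped application.yml already declares these placeholders is not confirmed here, so treat the explicit wiring as illustrative.

```yaml
dvara:
  flightdeck:
    audit-archive:
      enabled: true
      schedule: "0 0 2 * * ?"        # the default: daily at 02:00 UTC
      retention-days: 180            # events older than this become eligible
      grace-period-days: 7           # wait after VERIFIED before deleting PG rows
      bucket: ${SPACES_BUCKET}
      endpoint: ${SPACES_ENDPOINT}
      region: ${SPACES_REGION}
      access-key-id: ${SPACES_ACCESS_KEY_ID}
      secret-access-key: ${SPACES_SECRET_ACCESS_KEY}
```

With these values, PostgreSQL rows are removed no sooner than 7 days after their partition has been verified in object storage.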
dvara.llm-gateway.siem.* (Enterprise — SIEM Export)
| Property | Default | Description |
|---|---|---|
dvara.llm-gateway.siem.splunk.enabled | false | Enable Splunk HEC SIEM export |
dvara.llm-gateway.siem.splunk.hec-url | — | Splunk HEC endpoint URL |
dvara.llm-gateway.siem.splunk.token | — | HEC authentication token |
dvara.llm-gateway.siem.splunk.index | "" | Splunk index (blank = default) |
dvara.llm-gateway.siem.splunk.source | dvara-gateway | Splunk event source |
dvara.llm-gateway.siem.splunk.source-type | _json | Splunk source type |
dvara.llm-gateway.siem.splunk.timeout-seconds | 10 | HTTP timeout for the HEC POST. Failed posts are retried via the Kafka DLQ if Kafka is also configured; otherwise the audit event still lands in PostgreSQL (HEC export is best-effort). |
dvara.llm-gateway.siem.cloudwatch.enabled | false | Enable CloudWatch Logs SIEM export |
dvara.llm-gateway.siem.cloudwatch.log-group | — | CloudWatch log group name |
dvara.llm-gateway.siem.cloudwatch.log-stream | dvara-audit | CloudWatch log stream name |
dvara.llm-gateway.siem.cloudwatch.region | us-east-1 | AWS region |
dvara.llm-gateway.siem.cloudwatch.endpoint | — | Custom endpoint (blank = regional default https://logs.{region}.amazonaws.com) |
dvara.llm-gateway.siem.cloudwatch.batch-size | 25 | Flush events to CloudWatch in batches of this size — bounded by the CloudWatch PutLogEvents API limit of 10,000 events per call. |
dvara.llm-gateway.siem.cloudwatch.timeout-seconds | 10 | HTTP timeout for PutLogEvents calls. |
dvara.llm-gateway.siem.kafka.enabled | false | Enable Kafka SIEM export |
dvara.llm-gateway.siem.kafka.bootstrap-servers | — | Kafka broker addresses (e.g. localhost:9092) |
dvara.llm-gateway.siem.kafka.topic | dvara-audit | Kafka topic for audit events |
dvara.llm-gateway.siem.kafka.dead-letter-topic | "" | Dead-letter topic for failed publishes (blank = disabled) |
dvara.llm-gateway.siem.kafka.acks | all | Producer acknowledgment (all = survives broker failures) |
dvara.llm-gateway.siem.kafka.security-protocol | — | Security protocol (e.g. SASL_SSL) |
dvara.llm-gateway.siem.kafka.sasl-mechanism | — | SASL mechanism (e.g. PLAIN, SCRAM-SHA-256) |
dvara.llm-gateway.siem.kafka.sasl-jaas-config | — | JAAS config string for SASL auth |
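For illustration, a Kafka export over SASL_SSL with a dead-letter topic could be configured roughly as below. The broker address, the dead-letter topic name, and the KAFKA_SASL_JAAS_CONFIG variable are hypothetical; only the property names come from the table above.

```yaml
dvara:
  llm-gateway:
    siem:
      kafka:
        enabled: true
        bootstrap-servers: kafka.example.internal:9093   # hypothetical broker
        topic: dvara-audit
        dead-letter-topic: dvara-audit-dlq               # hypothetical name; blank disables the DLQ
        acks: all
        security-protocol: SASL_SSL
        sasl-mechanism: SCRAM-SHA-256
        sasl-jaas-config: ${KAFKA_SASL_JAAS_CONFIG}      # hypothetical env var holding the JAAS string
```

For SCRAM, the JAAS string conventionally takes the form `org.apache.kafka.common.security.scram.ScramLoginModule required username="<user>" password="<password>";`. Keeping it in an environment variable keeps the password out of the file.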
dvara.mcp-gateway.agentic.* (Enterprise — Agentic Governance)
Requires a valid DVARA license key.
| Property | Default | Description |
|---|---|---|
dvara.mcp-gateway.agentic.enabled | true | Enable agentic governance features |
dvara.mcp-gateway.agentic.session-ttl-minutes | 60 | Active session TTL |
dvara.mcp-gateway.agentic.session-max-capacity | 10000 | Max tracked sessions |
dvara.mcp-gateway.agentic.loop-detection.enabled | true | Enable loop detection |
dvara.mcp-gateway.agentic.loop-detection.repetition-threshold | 5 | Consecutive same-tool calls to trigger |
dvara.mcp-gateway.agentic.loop-detection.max-calls-per-minute | 60 | Rate limit threshold |
dvara.mcp-gateway.agentic.loop-detection.auto-kill | false | Auto-kill session on loop detection |
dvara.mcp-gateway.agentic.approval.enabled | true | Enable human-in-the-loop approval gates |
dvara.mcp-gateway.agentic.approval.timeout-seconds | 300 | Approval wait timeout |
dvara.mcp-gateway.agentic.approval.default-action | deny | Action on approval timeout: deny or approve |
dvara.mcp-gateway.agentic.approval.max-pending-approvals | 1000 | Max concurrent pending approvals |
dvara.mcp-gateway.agentic.loop-detection.cycle-max-length | 4 | Max cycle pattern length |
dvara.mcp-gateway.agentic.loop-detection.cycle-repetitions | 3 | Required cycle repetitions to trigger |
dvara.mcp-gateway.agentic.loop-detection.history-size | 100 | Per-session tool-call history buffer size |
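A sketch of a stricter-than-default governance profile, using only the properties listed above. The specific numbers are illustrative, not recommendations.

```yaml
dvara:
  mcp-gateway:
    agentic:
      enabled: true
      loop-detection:
        enabled: true
        repetition-threshold: 3      # default 5
        max-calls-per-minute: 30     # default 60
        auto-kill: true              # default false
      approval:
        enabled: true
        timeout-seconds: 120         # default 300
        default-action: deny         # deny when the approval wait times out
```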
dvara.flightdeck.security.builtin.* (Built-in Email/Password Auth)
Active when dvara.flightdeck.security.enabled=true and no OIDC issuer-uri or SAML metadata-url is set. Provides form login, invite-only user creation, password reset, and personal access tokens.
| Property | Default | Description |
|---|---|---|
dvara.flightdeck.security.builtin.session-timeout-minutes | 60 | Form-login session timeout |
dvara.flightdeck.security.builtin.max-failed-attempts | 5 | Consecutive failed logins before lockout |
dvara.flightdeck.security.builtin.lockout-duration-minutes | 15 | Lockout duration after max failed attempts |
dvara.flightdeck.security.builtin.base-url | http://localhost:8090 | Base URL for invitation and password-reset email links |
dvara.flightdeck.security.builtin.pat.default-max-ttl-days | 365 | Platform-wide ceiling on PAT (personal access token) TTL. A per-tenant override lives in Tenant.metadata["pat.max-ttl-days"] and narrows below this ceiling. The absolute hard cap is 365 regardless of configuration — higher values are clamped at PAT creation time. PAT expiry is required (non-expiring PATs are not allowed). |
Email delivery configuration has been consolidated under the new dvara.flightdeck.email.* namespace — see Email Configuration below.
dvara.flightdeck.email.* (Consolidated Email Configuration)
Only the Flightdeck pod actually sends email. Producers in any module publish an EmailRequestedEvent; the EmailDeliveryListener in Flightdeck renders the template and dispatches it via the configured transport. URL fields live in the cross-cutting EmailProperties, while transport and durability fields live in FlightdeckEmailProperties; both classes bind to the same dvara.flightdeck.email.* prefix.
URL + transport fields (consumed by producers and the transport tier):
| Property | Env Var | Default | Description |
|---|---|---|---|
dvara.flightdeck.email.from | DVARA_FLIGHTDECK_EMAIL_FROM | noreply@dvarahq.com | Sender address — From: header on every outbound email |
dvara.flightdeck.email.transport | DVARA_FLIGHTDECK_EMAIL_TRANSPORT | log | log prints to console (dev / CI); smtp uses Spring's JavaMailSender; resend POSTs to the Resend transactional API |
dvara.flightdeck.email.public-endpoint-url | DVARA_FLIGHTDECK_EMAIL_PUBLIC_ENDPOINT_URL | https://api.dvarahq.com/v1 | Data-plane URL shown in the welcome email and /signup/check-email |
dvara.flightdeck.email.flightdeck-url | DVARA_FLIGHTDECK_EMAIL_FLIGHTDECK_URL | https://flightdeck.dvarahq.com | Console base URL for welcome / password-reset CTAs |
dvara.flightdeck.email.docs-url | DVARA_FLIGHTDECK_EMAIL_DOCS_URL | https://dvarahq.com/docs | Docs link in the welcome email |
dvara.flightdeck.email.resend-api-key | DVARA_FLIGHTDECK_EMAIL_RESEND_API_KEY | — | Resend API key (re_…); only used when transport=resend. The sender domain in email.from must be verified in Resend (or use the sandbox sender onboarding@resend.dev). |
dvara.flightdeck.email.resend.verify-domain-at-startup | — | true | When transport=resend, the boot probe calls GET /domains and refuses to start on a production-class profile if the sender domain isn't verified. Skipped automatically for the sandbox sender. Disable on air-gapped / no-egress environments. |
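A minimal sketch of switching the transport to Resend. The sender address is a hypothetical placeholder, and its domain would need to be verified in Resend as described above.

```yaml
dvara:
  flightdeck:
    email:
      transport: resend
      from: noreply@example.com          # hypothetical sender address
      resend-api-key: ${DVARA_FLIGHTDECK_EMAIL_RESEND_API_KEY}
      resend:
        verify-domain-at-startup: true   # set to false on air-gapped / no-egress environments
```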
Durability + retry + DLQ + idempotency knobs (dvara.flightdeck.email.delivery.*):
| Property | Default | Description |
|---|---|---|
delivery.enabled | true | Master switch for the durability layer. false → the listener falls back to fire-and-forget: no email_delivery_log row, no idempotency dedup, no retry, no DLQ. Useful for tests that should not depend on Postgres. |
delivery.max-attempts | 5 | Sync attempt 1 + 4 async retries before the row is marked DEAD_LETTERED |
delivery.initial-backoff-ms | 30000 | Backoff before attempt 2 |
delivery.max-backoff-ms | 120000 | Ceiling on any single retry's backoff |
delivery.backoff-multiplier | 2.0 | Exponential factor — delay(n) = min(initial × multiplier^(n-2), max) for n ≥ 2. With defaults: 0 / +30s / +60s / +120s / +120s → cumulative ~5m 30s before DLQ. |
delivery.retry-sweep-interval-ms | 30000 | How often the EmailRetrySweeper polls the DB for due retries |
delivery.retry-sweep-batch-size | 100 | Max rows the sweeper processes per tick |
delivery.idempotency-ttl-minutes | 60 | Dedup window. A second publish of the same EmailRequestedEvent.idempotencyKey within this window is a no-op. Producers that reuse a deterministic UUID while the earlier row is still retained (see dlq-retention-days) should generate a fresh key instead; otherwise the new row collides on the primary key and its audit row is lost. |
delivery.dlq-retention-days | 30 | How long SENT + DEAD_LETTERED rows are kept for operator review |
delivery.cleanup-cron | 0 0 3 * * * | Daily 03:00 UTC sweep that drops terminal rows past dlq-retention-days. PENDING_RETRY rows are never touched. |
The legacy dvara.flightdeck.email.resend.retry-max-attempts / .retry-initial-backoff-ms / .retry-max-backoff-ms properties on the Resend transport no longer have any effect (retry has lived at the listener level since PR 4 of #835). They are marked @Deprecated and kept for one release; operators should migrate to the dvara.flightdeck.email.delivery.* namespace.
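To make the backoff formula in the delivery table concrete, here is a worked sketch with non-default values. The numbers are illustrative; the comments apply delay(n) = min(initial × multiplier^(n-2), max) exactly as stated in the table.

```yaml
dvara:
  flightdeck:
    email:
      delivery:
        enabled: true
        max-attempts: 4              # sync attempt 1 + 3 async retries
        initial-backoff-ms: 10000    # attempt 2: 10s after attempt 1
        backoff-multiplier: 3.0      # attempt 3: 10s x 3 = 30s later
        max-backoff-ms: 60000        # attempt 4: min(10s x 9, 60s) = 60s later
```

Under these settings a message that keeps failing would be marked DEAD_LETTERED after roughly 10s + 30s + 60s = 1m 40s of cumulative backoff. Because due retries are picked up by the EmailRetrySweeper on its polling interval, actual send times may lag the computed delays.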
dvara.flightdeck.security.saml.* (SAML 2.0 SSO)
Active when dvara.flightdeck.security.saml.metadata-url is set. Mutually exclusive with OIDC — configuring both fails at startup.
| Property | Env Var | Default | Description |
|---|---|---|---|
dvara.flightdeck.security.saml.metadata-url | DVARA_FLIGHTDECK_SAML_METADATA_URL | — | IdP metadata XML URL (activates SAML mode) |
dvara.flightdeck.security.saml.entity-id | DVARA_FLIGHTDECK_SAML_ENTITY_ID | dvara | SP entity ID |
dvara.flightdeck.security.saml.registration-id | DVARA_FLIGHTDECK_SAML_REGISTRATION_ID | dvara | SAML registration ID used in the ACS and metadata URLs (/login/saml2/sso/{registrationId}) |
dvara.flightdeck.security.saml.role-attribute | DVARA_FLIGHTDECK_SAML_ROLE_ATTRIBUTE | roles | SAML attribute containing roles |
dvara.flightdeck.security.saml.email-attribute | DVARA_FLIGHTDECK_SAML_EMAIL_ATTRIBUTE | NameID | SAML attribute for email (NameID uses the SAML NameID element) |
dvara.flightdeck.security.saml.tenant-attribute | DVARA_FLIGHTDECK_SAML_TENANT_ATTRIBUTE | tenant_id | SAML attribute for tenant ID |
dvara.flightdeck.security.saml.name-attribute | DVARA_FLIGHTDECK_SAML_NAME_ATTRIBUTE | displayName | SAML attribute for display name |
dvara.flightdeck.security.saml.auto-provision | DVARA_FLIGHTDECK_SAML_AUTO_PROVISION | false | Auto-create users on first SAML login |
dvara.flightdeck.security.saml.default-roles | DVARA_FLIGHTDECK_SAML_DEFAULT_ROLES | viewer | Comma-separated roles for auto-provisioned users |
dvara.flightdeck.security.saml.default-tenant-id | DVARA_FLIGHTDECK_SAML_DEFAULT_TENANT_ID | — | Tenant ID for auto-provisioned users (required when auto-provisioning is on) |
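A sketch of SAML mode with auto-provisioning, assuming no OIDC issuer-uri is configured alongside it. The tenant ID is a hypothetical placeholder.

```yaml
dvara:
  flightdeck:
    security:
      enabled: true
      saml:
        metadata-url: ${DVARA_FLIGHTDECK_SAML_METADATA_URL}   # setting this activates SAML mode
        entity-id: dvara
        registration-id: dvara        # ACS path becomes /login/saml2/sso/dvara
        role-attribute: roles
        auto-provision: true
        default-roles: viewer
        default-tenant-id: acme       # hypothetical tenant ID; required when auto-provision is on
```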
dvara.vault.hashicorp.* / .aws.* / .azure.* (Full Vault Config)
The basic dvara.vault.backend selector and cache TTL are in the earlier Vault section. Backend-specific sub-fields:
| Property | Env Var | Default | Description |
|---|---|---|---|
dvara.vault.hashicorp.namespace | VAULT_NAMESPACE | — | Vault Enterprise namespace (optional) |
dvara.vault.hashicorp.role-id | VAULT_ROLE_ID | — | AppRole role ID |
dvara.vault.hashicorp.secret-id | VAULT_SECRET_ID | — | AppRole secret ID |
dvara.vault.aws.access-key | AWS_VAULT_ACCESS_KEY | — | Optional. When blank, the AWS SDK's default credential provider chain resolves credentials from its standard sources: env vars (AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY) and Java system props first, then the EKS IRSA web-identity token (AWS_WEB_IDENTITY_TOKEN_FILE), the ~/.aws/credentials file, the ECS task role, and finally the EC2 instance profile. For Kubernetes deployments, prefer IRSA over static keys: leave both fields blank and attach the IAM role to the pod's ServiceAccount. |
dvara.vault.aws.secret-key | AWS_VAULT_SECRET_KEY | — | Optional. Same fallback chain as access-key — leave blank to use instance / IRSA / SDK-default credentials. |
dvara.vault.azure.client-id | AZURE_CLIENT_ID | — | Azure AD application client ID |
dvara.vault.azure.client-secret | AZURE_CLIENT_SECRET | — | Azure AD application client secret |
dvara.vault.azure.tenant-id | AZURE_TENANT_ID | — | Azure AD tenant ID |
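Putting the two Vault tables together, a HashiCorp AppRole setup might be sketched as follows. The Vault address is a hypothetical placeholder; the environment variable names are the ones listed in the table above.

```yaml
dvara:
  vault:
    backend: hashicorp
    cache-ttl-seconds: 300
    hashicorp:
      address: https://vault.example.internal:8200   # hypothetical address
      auth-method: approle
      role-id: ${VAULT_ROLE_ID}
      secret-id: ${VAULT_SECRET_ID}
      namespace: "${VAULT_NAMESPACE:}"               # optional; resolves to empty when unset
```

With auth-method set to approle, the static token property is presumably not needed, though that is an assumption rather than something this reference states.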
dvara.llm-gateway.tls.providers.<name>.* (Full mTLS Config)
| Property | Default | Description |
|---|---|---|
client-key-password | — | Password for client key or PKCS12 bundle |
trust-store-password | — | Trust store password (if encrypted) |
store-type | auto | Auto-detected from extension: .p12→PKCS12, .jks→JKS, .pem→PEM |
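As an illustration, mTLS for a single provider using PEM files could look like the fragment below. The provider name openai is only an example, and the file paths and the password variable are hypothetical.

```yaml
dvara:
  llm-gateway:
    tls:
      enforce-tls13: true
      providers:
        openai:                                              # illustrative provider name
          mtls-enabled: true
          client-cert-path: /etc/dvara/tls/client.pem        # hypothetical path
          client-key-path: /etc/dvara/tls/client-key.pem     # hypothetical path
          client-key-password: ${OPENAI_MTLS_KEY_PASSWORD}   # hypothetical env var
          trust-store-path: /etc/dvara/tls/ca-bundle.pem     # hypothetical path
```

store-type is left at its auto default here, so the format would be inferred from the .pem extensions.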
dvara.flightdeck.compliance.* (Scheduled Compliance Reports)
Requires a valid DVARA license key.
| Property | Env Var | Default | Description |
|---|---|---|---|
dvara.flightdeck.compliance.soc2-schedule | DVARA_FLIGHTDECK_COMPLIANCE_SOC2_SCHEDULE | "" | Cron expression for SOC2 auto-generation (blank = disabled) |
dvara.flightdeck.compliance.hipaa-schedule | DVARA_FLIGHTDECK_COMPLIANCE_HIPAA_SCHEDULE | "" | Cron expression for HIPAA auto-generation |
dvara.flightdeck.compliance.gdpr-schedule | DVARA_FLIGHTDECK_COMPLIANCE_GDPR_SCHEDULE | "" | Cron expression for GDPR auto-generation |
dvara.flightdeck.compliance.default-tenant-id | DVARA_FLIGHTDECK_COMPLIANCE_DEFAULT_TENANT | — | Tenant ID for scheduled reports (blank = all tenants) |
dvara.flightdeck.compliance.retention-days | DVARA_FLIGHTDECK_COMPLIANCE_RETENTION_DAYS | 365 | Retention for generated reports |
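A sketch of scheduling only the SOC2 report, leaving the other schedules at their blank defaults. The cron value is illustrative and uses Spring's six-field format (second, minute, hour, day of month, month, day of week).

```yaml
dvara:
  flightdeck:
    compliance:
      soc2-schedule: "0 0 6 1 * *"   # 06:00 on the 1st of every month
      retention-days: 365
```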
dvara.flightdeck.portal.* (Tenant Self-Service Portal)
| Property | Default | Description |
|---|---|---|
dvara.flightdeck.portal.enabled | true | Enable /portal/* routes for tenant users |
dvara.flightdeck.portal.self-service | true | Allow tenants to create API keys via the portal |
Flightdeck GitOps Config Import — Upload Cap
The Flightdeck Config Import surface (POST /config/import/preview) is protected by two layered byte caps. They are sized for fleet snapshots (a 10K-tenant export is well under 5 MB) yet small enough that a single tampered upload can't OOM the JVM.
| Property | Default | Description |
|---|---|---|
spring.servlet.multipart.max-file-size | 25MB | Tomcat-level cap on the uploaded .json snapshot. A MaxUploadSizeExceededException handler turns Tomcat's bare 413 into a friendly redirect-with-flash on /config/import. |
spring.servlet.multipart.max-request-size | 25MB | Mirrors max-file-size so a multipart envelope with extra form fields can't sneak past. |
ConfigUiController re-checks the same byte cap in-process for both file uploads and pasted JSON, so the cap applies uniformly regardless of input method.
dvara.region.* (Multi-Region Identity)
| Property | Env Var | Description |
|---|---|---|
dvara.region.id | DVARA_REGION_ID | Region identifier used in data-residency routing |
dvara.region.name | DVARA_REGION_NAME | Human-readable region name |
dvara.encryption.* (AES-256-GCM Master Password)
| Property | Env Var | Description |
|---|---|---|
dvara.encryption.master-password | DVARA_ENCRYPTION_MASTER_PASSWORD | Master password used to derive AES-256-GCM keys for provider credentials and the PII token store. Required when storing credentials through /v1/admin/credentials. |
See Credentials & BYOK for rotation guidance.
MCP Proxy Observability Config
The standalone DVARA MCP Proxy (port 8070) supports the following properties in addition to the existing MCP timeout properties:
| Property | Default | Description |
|---|---|---|
management.endpoints.web.exposure.include | health,prometheus | Actuator endpoints to expose |
management.prometheus.metrics.export.enabled | true | Enable Prometheus scrape endpoint |
management.tracing.sampling.probability | 1.0 | OTel sampling probability (env: TRACING_SAMPLING_PROBABILITY) |
management.otlp.tracing.endpoint | http://localhost:4318/v1/traces | OTLP trace exporter endpoint (env: OTEL_EXPORTER_OTLP_ENDPOINT) |
dvara.mcp-gateway.agentic.approval.default-action | deny | Action on timeout: deny or approve |
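A minimal sketch of the observability settings above with trace sampling reduced for a busier environment. The collector hostname is a hypothetical placeholder.

```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,prometheus
  prometheus:
    metrics:
      export:
        enabled: true
  tracing:
    sampling:
      probability: 0.1               # sample 10% of traces instead of the default 1.0
  otlp:
    tracing:
      endpoint: http://otel-collector:4318/v1/traces   # hypothetical collector host
```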