Kubernetes Deployment
Deploy DVARA on Kubernetes with the official Helm chart.
Prerequisites
- Kubernetes 1.28+
- Helm 3.x
- kubectl configured for your cluster
- A reachable PostgreSQL 14+ instance (the chart does NOT bundle PostgreSQL; see Common Deployment Patterns below for how to wire it)
- A DVARA license envelope (DVARA-… prefix). The gateway refuses to start without one; there is no operator-flippable bypass.
Quick Start
Mint the required secrets first — every DVARA install needs four chart-managed secrets plus the audit HMAC (passed via extraEnv because the chart doesn't wire it yet):
# 4 chart-managed secrets
export DVARA_LICENSE_KEY="DVARA-…" # from your trial / contract email
export ACTUATOR_API_KEY=$(openssl rand -base64 32) # operator Bearer for /actuator/gateway-status
export METRICS_API_KEY=$(openssl rand -base64 32) # DISTINCT Bearer for /actuator/prometheus
export ENCRYPTION_PASSWORD=$(openssl rand -base64 32) # AES-256-GCM key for ENCRYPTED-mode credentials
# Audit HMAC (not chart-managed; passed via extraEnv)
export AUDIT_HMAC=$(openssl rand -base64 32) # signs audit-chain envelopes
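Before running helm install, it can help to confirm that every variable is set and that the two Bearer keys really are distinct. The function below is a minimal sketch of such a pre-flight check; `check_dvara_secrets` is a hypothetical helper name, not something the chart ships.

```shell
# Hypothetical helper (not part of the chart): fail fast if a required
# secret variable is empty or if the two actuator Bearer keys collide.
check_dvara_secrets() {
  rc=0
  for name in DVARA_LICENSE_KEY ACTUATOR_API_KEY METRICS_API_KEY \
              ENCRYPTION_PASSWORD AUDIT_HMAC; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      rc=1
    fi
  done
  # The metrics key must differ from the operator key.
  if [ -n "${ACTUATOR_API_KEY:-}" ] && \
     [ "${ACTUATOR_API_KEY:-}" = "${METRICS_API_KEY:-}" ]; then
    echo "METRICS_API_KEY must differ from ACTUATOR_API_KEY" >&2
    rc=1
  fi
  return "$rc"
}
```

Calling `check_dvara_secrets && helm install ...` stops the install before any pod is scheduled with an incomplete secret set.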
From OCI Registry (recommended)
Pre-built images and the Helm chart are published to GitHub Container Registry:
helm install dvara oci://ghcr.io/dvarahq/dvara/charts/meridian \
--set secrets.enterpriseLicenseKey="$DVARA_LICENSE_KEY" \
--set secrets.gatewayServerApiKey="$ACTUATOR_API_KEY" \
--set secrets.gatewayMetricsApiKey="$METRICS_API_KEY" \
--set secrets.gatewayEncryptionMasterPassword="$ENCRYPTION_PASSWORD" \
--set-string "gatewayServer.extraEnv[0].name=DVARA_AUDIT_HMAC_SECRET" \
--set-string "gatewayServer.extraEnv[0].value=$AUDIT_HMAC" \
--set secrets.mockProviderEnabled=true
# Wait for pods to be ready
kubectl rollout status deployment/dvara-server
kubectl rollout status deployment/dvara-ui
# Verify
kubectl port-forward svc/dvara-server 8080:8080 &
curl http://localhost:8080/actuator/health
From Local Chart
Same secret arguments, pointing at a local chart directory:
helm install dvara charts/meridian/ \
--set secrets.enterpriseLicenseKey="$DVARA_LICENSE_KEY" \
--set secrets.gatewayServerApiKey="$ACTUATOR_API_KEY" \
--set secrets.gatewayMetricsApiKey="$METRICS_API_KEY" \
--set secrets.gatewayEncryptionMasterPassword="$ENCRYPTION_PASSWORD" \
--set-string "gatewayServer.extraEnv[0].name=DVARA_AUDIT_HMAC_SECRET" \
--set-string "gatewayServer.extraEnv[0].value=$AUDIT_HMAC" \
--set secrets.mockProviderEnabled=true
For any deployment past initial smoke-testing, move these into a values file (or an externally-managed Secret via secrets.existingSecret) rather than passing on the command line.
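One way to move the secrets off the command line is to render them into a values file once and pass that file with -f. The sketch below reuses the environment variables from the Quick Start; the function name and the default file name are assumptions, and it assumes the values contain no double quotes or backslashes (true for `openssl rand -base64` output).

```shell
# Hypothetical helper: write the Quick Start secrets to a values file
# instead of passing them as --set flags. The file name is an assumption.
write_dvara_values() {
  out="${1:-dvara-secrets-values.yaml}"
  (
    umask 077   # a newly created file is readable by the current user only
    cat > "$out" <<EOF
secrets:
  enterpriseLicenseKey: "${DVARA_LICENSE_KEY}"
  gatewayServerApiKey: "${ACTUATOR_API_KEY}"
  gatewayMetricsApiKey: "${METRICS_API_KEY}"
  gatewayEncryptionMasterPassword: "${ENCRYPTION_PASSWORD}"
gatewayServer:
  extraEnv:
    - name: DVARA_AUDIT_HMAC_SECRET
      value: "${AUDIT_HMAC}"
EOF
  )
}

# Usage:
#   write_dvara_values dvara-secrets-values.yaml
#   helm install dvara charts/meridian/ -f dvara-secrets-values.yaml
```

The file still holds plaintext secrets, so keep it out of version control; secrets.existingSecret avoids the file entirely.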
Each of these values fails in its own way when it is missing:
- enterpriseLicenseKey unset: the gateway pod fails startup (LicenseValidationException: No license key configured).
- API keys unset: the actuator chain returns 401 on every authenticated endpoint.
- Encryption master password unset: ENCRYPTED-mode credential persistence fails.
- HMAC secret unset or left at the default placeholder: the audit chain refuses to write on a production-class profile.
Installing with Provider Keys
helm install dvara charts/meridian/ \
--set secrets.providerKeys.openai=sk-... \
--set secrets.providerKeys.anthropic=sk-ant-...
Or use a values file:
# my-values.yaml
secrets:
providerKeys:
openai: sk-...
anthropic: sk-ant-...
gemini: AIza...
helm install dvara charts/meridian/ -f my-values.yaml
Configuration Reference
Gateway Server
| Parameter | Description | Default |
|---|---|---|
| gatewayServer.enabled | Enable gateway-server | true |
| gatewayServer.replicaCount | Replicas (ignored when HPA enabled) | 1 |
| gatewayServer.image.repository | Image repository | ghcr.io/dvarahq/dvara/dvara-llm-gateway |
| gatewayServer.image.tag | Image tag (defaults to chart appVersion) | "" |
| gatewayServer.gatewayMode | Operating mode: standalone or full | full |
| gatewayServer.region.id | Region identity for multi-region deployments | "" |
| gatewayServer.region.name | Human-readable region name | "" |
| gatewayServer.javaOpts | JVM options | "" |
| gatewayServer.resources.requests.cpu | CPU request | 250m |
| gatewayServer.resources.requests.memory | Memory request | 512Mi |
| gatewayServer.resources.limits.cpu | CPU limit | 2 |
| gatewayServer.resources.limits.memory | Memory limit | 1Gi |
| gatewayServer.service.type | Service type | ClusterIP |
| gatewayServer.service.port | Service port | 8080 |
Gateway UI (DVARA Flightdeck)
The Helm chart's gatewayUi.* parameter family configures the DVARA Flightdeck pod (Console + tenant Portal + Automation API). The parameter prefix is a holdover from when the product was simply called "Gateway UI"; the deployed image is ghcr.io/dvarahq/dvara/dvara-flightdeck.
| Parameter | Description | Default |
|---|---|---|
| gatewayUi.enabled | Enable gateway-ui | true |
| gatewayUi.replicaCount | Number of replicas | 1 |
| gatewayUi.image.repository | Image repository | ghcr.io/dvarahq/dvara/dvara-flightdeck |
| gatewayUi.gatewayServerUrl | Override auto-discovered server URL | "" (auto) |
| gatewayUi.resources.requests.cpu | CPU request | 100m |
| gatewayUi.resources.requests.memory | Memory request | 256Mi |
| gatewayUi.service.type | Service type | ClusterIP |
| gatewayUi.service.port | Service port | 8090 |
Gateway UI Health Probes:
| Probe | Path | Purpose |
|---|---|---|
| Liveness | /actuator/health/liveness | Basic liveness check |
| Readiness | /actuator/health/readiness | Includes controlPlane check (gateway-server connectivity) |
| Startup | /actuator/health/liveness | Allows JVM warmup before liveness kicks in |
All four probe paths (/actuator/health, /actuator/health/liveness, /actuator/health/readiness, /actuator/info) are anonymous by design so k8s probes work without secrets. Do not set management.endpoint.health.show-details=always — the gateway refuses to start in that mode so anonymous callers cannot read per-indicator JSON like cache-cluster state or database pool internals. Leave the default when-authorized.
Secrets
| Parameter | Description | Default |
|---|---|---|
| secrets.create | Create the Secret resource | true |
| secrets.existingSecret | Use an existing Secret instead | "" |
| secrets.providerKeys.openai | OpenAI API key | "" |
| secrets.providerKeys.anthropic | Anthropic API key | "" |
| secrets.providerKeys.gemini | Gemini API key | "" |
| secrets.providerKeys.awsAccessKeyId | AWS access key (Bedrock) | "" |
| secrets.providerKeys.awsSecretAccessKey | AWS secret key (Bedrock) | "" |
| secrets.ollamaEnabled | Enable Ollama provider | "" |
| secrets.ollamaBaseUrl | Ollama base URL | "" |
| secrets.bedrockEnabled | Enable Bedrock provider | "" |
| secrets.mockProviderEnabled | Enable mock provider | "" |
| secrets.gatewayInternalSecret | Shared secret for /internal/* | "" |
| secrets.gatewayEncryptionMasterPassword | AES-256-GCM master password for ENC: values + ENCRYPTED-mode provider credentials | "" |
| secrets.gatewayServerApiKey | Operator Bearer for /actuator/gateway-status + every authenticated /actuator/* path EXCEPT prometheus. Generate with openssl rand -base64 32. Required: every authenticated actuator probe 401s without it. | "" |
| secrets.gatewayMetricsApiKey | Distinct Bearer for /actuator/prometheus only. Must differ from gatewayServerApiKey (principle of least privilege: a leaked scrape token must not unlock the license envelope). Required. | "" |
| secrets.enterpriseLicenseKey | DVARA license envelope (DVARA-… prefix, Ed25519-signed). Required at startup for every DVARA process (gateway-server, flightdeck, mcp-proxy-server); LicenseEnvironmentPostProcessor refuses to boot without it, and there is no operator-flippable bypass. | "" |
DVARA_AUDIT_HMAC_SECRET signs audit-chain envelopes and is required on any production-class Spring profile, but the chart does not wire it through secrets.* yet. Pass it via gatewayServer.extraEnv (see the Quick Start example above) or through secrets.existingSecret with the right key.
MCP Proxy Server
| Parameter | Description | Default |
|---|---|---|
| mcpProxyServer.enabled | Enable MCP Proxy (requires enterprise license) | false |
| mcpProxyServer.replicaCount | Replicas (ignored when HPA enabled) | 2 |
| mcpProxyServer.image.repository | Image repository | ghcr.io/dvarahq/dvara/dvara-mcp-gateway |
| mcpProxyServer.image.tag | Image tag (defaults to chart appVersion) | "" |
| mcpProxyServer.javaOpts | JVM options | "" |
| mcpProxyServer.gatewayServerUrl | Override auto-discovered server URL | "" (auto) |
| mcpProxyServer.resources.requests.cpu | CPU request | 250m |
| mcpProxyServer.resources.requests.memory | Memory request | 512Mi |
| mcpProxyServer.resources.limits.cpu | CPU limit | 2 |
| mcpProxyServer.resources.limits.memory | Memory limit | 1Gi |
| mcpProxyServer.service.type | Service type (internal-only) | ClusterIP |
| mcpProxyServer.service.port | Service port | 8070 |
| mcpProxyServer.autoscaling.enabled | Enable HPA | false |
| mcpProxyServer.autoscaling.minReplicas | Minimum replicas | 2 |
| mcpProxyServer.autoscaling.maxReplicas | Maximum replicas | 20 |
| mcpProxyServer.pdb.enabled | Enable PDB | false |
| mcpProxyServer.pdb.minAvailable | Min available pods | 1 |
| mcpProxyServer.networkPolicy.enabled | Enable NetworkPolicy (deny external, allow gateway-server) | false |
| mcpProxyServer.serviceMonitor.enabled | Enable Prometheus ServiceMonitor | false |
The MCP Proxy is deployed as an internal-only ClusterIP service. It should be accessed by the gateway-server, not exposed externally. When networkPolicy.enabled is true, only pods matching the gateway-server selector labels can reach the MCP Proxy.
Ingress
| Parameter | Description | Default |
|---|---|---|
| ingress.enabled | Enable Ingress | false |
| ingress.className | Ingress class (nginx, traefik, alb) | "" |
| ingress.annotations | Ingress annotations | {} |
| ingress.gatewayServer.hosts | Server host/path rules | [{host: gateway.example.com}] |
| ingress.gatewayServer.tls | Server TLS config | [] |
| ingress.gatewayUi.hosts | UI host/path rules | [{host: admin.example.com}] |
| ingress.gatewayUi.tls | UI TLS config | [] |
Graceful Shutdown & Rolling Updates
| Parameter | Description | Default |
|---|---|---|
| gatewayServer.terminationGracePeriodSeconds | Pod termination grace period (must exceed preStop + drain) | 45 |
| gatewayServer.preStopSleepSeconds | Sleep before SIGTERM (endpoint de-registration propagation) | 5 |
| gatewayServer.rollingUpdate.maxSurge | Max extra pods during rolling update | 1 |
| gatewayServer.rollingUpdate.maxUnavailable | Max unavailable pods during rolling update (0 = zero-downtime) | 0 |
| gatewayServer.topologySpreadConstraints | Topology spread for cross-zone scheduling | [] |
The default configuration ensures zero-downtime rolling updates: maxSurge: 1 creates one new pod before terminating old ones, and maxUnavailable: 0 keeps at least the desired replica count (N) ready throughout the rollout. The preStopSleepSeconds delay allows Kubernetes endpoint propagation to complete before the application receives SIGTERM and begins its 30-second graceful drain. With the defaults, the 5-second sleep plus the 30-second drain (35 seconds) fits inside the 45-second terminationGracePeriodSeconds.
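If you change any of these timings, the constraint from the table (grace period must exceed preStop sleep plus drain) is easy to check by hand. The following sketch does the arithmetic; `grace_period_ok` is a hypothetical helper name, and the 30-second drain is the figure quoted above.

```shell
# Succeeds when the termination grace period leaves room for the
# preStop sleep plus the application's drain window (all in seconds).
grace_period_ok() {
  grace="$1"; pre_stop="$2"; drain="$3"
  [ "$grace" -gt $((pre_stop + drain)) ]
}

# Chart defaults with the 30-second drain:
grace_period_ok 45 5 30 && echo "ok: 45 > 5 + 30"
```

A grace period equal to the sum is treated as a failure here, since the kubelet sends SIGKILL once the grace period elapses.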
Autoscaling (HPA)
| Parameter | Description | Default |
|---|---|---|
| gatewayServer.autoscaling.enabled | Enable HPA | false |
| gatewayServer.autoscaling.minReplicas | Minimum replicas | 2 |
| gatewayServer.autoscaling.maxReplicas | Maximum replicas | 10 |
| gatewayServer.autoscaling.targetCPUUtilizationPercentage | CPU target | 70 |
| gatewayServer.autoscaling.targetMemoryUtilizationPercentage | Memory target | 80 |
| gatewayServer.autoscaling.behavior.scaleUp.stabilizationWindowSeconds | Wait before scaling up | 30 |
| gatewayServer.autoscaling.behavior.scaleDown.stabilizationWindowSeconds | Wait before scaling down | 300 |
The default HPA behavior scales up quickly (50% per minute after 30s stabilization) but scales down conservatively (25% per 2 minutes after 5-minute stabilization) to prevent flapping.
Pod Disruption Budget
| Parameter | Description | Default |
|---|---|---|
| gatewayServer.pdb.enabled | Enable PDB | false |
| gatewayServer.pdb.minAvailable | Min available pods | 1 |
| gatewayServer.pdb.maxUnavailable | Max unavailable pods | "" |
Prometheus ServiceMonitor
| Parameter | Description | Default |
|---|---|---|
| gatewayServer.serviceMonitor.enabled | Enable (requires Prometheus Operator) | false |
| gatewayServer.serviceMonitor.interval | Scrape interval | 30s |
| gatewayServer.serviceMonitor.path | Metrics path | /actuator/prometheus |
| gatewayServer.serviceMonitor.additionalLabels | Labels for monitor selection | {} |
When gatewayServer.serviceMonitor.enabled=true, configure the ServiceMonitor's bearerTokenFile (or the Helm equivalent) to point at a file containing the DVARA_ACTUATOR_METRICS_API_KEY value. /actuator/prometheus is authenticated — without the token, every scrape returns 401 and the time series goes dark. The metrics secret is intentionally distinct from DVARA_ACTUATOR_API_KEY so a leaked scrape token can't unlock the rich gateway status surface.
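To rule out a token problem before debugging Prometheus itself, you can reproduce the scrape by hand. The sketch below assumes the endpoint accepts the key as a standard `Authorization: Bearer` header and that the server is reachable through a port-forward; `check_scrape` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the metrics endpoint answers 200
# when presented with the metrics key as a Bearer token.
check_scrape() {
  url="$1"; token="$2"
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${token}" "$url")"
  [ "$code" = "200" ]
}

# Usage, assuming a port-forward to svc/dvara-server on 8080:
#   check_scrape http://localhost:8080/actuator/prometheus "$METRICS_API_KEY" \
#     || echo "scrape rejected: check the metrics key"
```

If this succeeds with the metrics key while Prometheus still reports 401, the ServiceMonitor is most likely reading a different or empty token file.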
Clustering on Kubernetes
DVARA LLM Gateway instances share rate-limit counters and API key lookups across the fleet. On Kubernetes, pods must discover each other via a headless Service — multicast is unavailable in most clusters.
When KUBERNETES_NAMESPACE is set (typically injected from metadata.namespace through the downward API), the gateway requires CACHE_SERVICE_NAME to point at a headless Service fronting the gateway pods. Without it, startup fails with:
Kubernetes clustering requires CACHE_SERVICE_NAME when
KUBERNETES_NAMESPACE is set. Without it, pods cannot form a cluster and
rate limit state will not be shared.
The official Helm chart wires both variables automatically when you deploy multiple gateway replicas — you don't need the manual YAML below unless you're bypassing the chart. Outside Kubernetes (local docker compose, bare metal), gateway instances auto-discover each other via multicast and CACHE_SERVICE_NAME is not required.
Headless Service pattern (reference)
apiVersion: v1
kind: Service
metadata:
name: dvara-server-cluster
labels:
app.kubernetes.io/name: dvara-server
spec:
clusterIP: None # headless — each pod gets a DNS A record
publishNotReadyAddresses: true
selector:
app.kubernetes.io/name: dvara-server
ports:
- name: cluster
port: 5701
targetPort: 5701
Deployment environment variables (reference)
apiVersion: apps/v1
kind: Deployment
metadata:
name: dvara-server
spec:
template:
spec:
containers:
- name: gateway-server
env:
- name: KUBERNETES_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: CACHE_SERVICE_NAME
value: dvara-server-cluster
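When the cluster does not form, a useful first check is whether the headless Service resolves to one address per pod. The sketch below only builds the DNS name to query; it assumes the common default cluster domain `cluster.local`, and `headless_service_fqdn` is a hypothetical helper name.

```shell
# Build the in-cluster DNS name of a Service.
# "cluster.local" is the usual default cluster domain; pass a third
# argument if your cluster uses a different one.
headless_service_fqdn() {
  service="$1"; namespace="$2"; domain="${3:-cluster.local}"
  printf '%s.%s.svc.%s\n' "$service" "$namespace" "$domain"
}

headless_service_fqdn dvara-server-cluster default
# prints dvara-server-cluster.default.svc.cluster.local
```

Resolving that name from any pod with DNS tooling (for example `getent hosts` or `nslookup`, if the image provides them) should return one record per gateway pod, because the Service is headless.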
Common Deployment Patterns
Minimal Mode (Testing)
PostgreSQL is required even for minimal deployments — there is no in-memory fallback. For quick testing, point the chart at an external Postgres or deploy a small Postgres StatefulSet alongside the gateway:
helm install dvara charts/meridian/ \
--set secrets.mockProviderEnabled=true \
--set gatewayServer.env.SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/dvara \
--set gatewayServer.env.SPRING_DATASOURCE_USERNAME=dvara \
--set gatewayServer.env.SPRING_DATASOURCE_PASSWORD=dvara
Production with OpenAI
# production-values.yaml
gatewayServer:
replicaCount: 3
gatewayMode: standalone
# Match heap to container limits below — leave ~25% of the limit for
# off-heap (Metaspace, native code, JIT, kernel buffers). With memory
# limit = 2Gi, -Xmx1500m is a safe upper bound; reserve more if you
# see Metaspace pressure in JFR.
javaOpts: "-Xms1g -Xmx1500m"
resources:
requests:
cpu: 500m
memory: 1Gi
limits:
cpu: "2"
memory: 2Gi
terminationGracePeriodSeconds: 45
preStopSleepSeconds: 5
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
pdb:
enabled: true
minAvailable: 2
secrets:
providerKeys:
openai: sk-...
ingress:
enabled: true
className: nginx
gatewayServer:
hosts:
- host: gateway.mycompany.com
paths:
- path: /
pathType: Prefix
tls:
- secretName: gateway-tls
hosts:
- gateway.mycompany.com
helm install dvara charts/meridian/ -f production-values.yaml
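The heap-sizing comment in the values file reduces to simple arithmetic. The sketch below computes the largest heap that leaves a given share of the container limit for off-heap use; `max_heap_mib` is a hypothetical helper, and the 25% default mirrors the guidance above rather than a measured requirement.

```shell
# Largest heap (MiB) that leaves a given percentage of the container
# limit for off-heap memory. Integer arithmetic; the result rounds down.
max_heap_mib() {
  limit_mib="$1"; off_heap_percent="${2:-25}"
  echo $(( limit_mib * (100 - off_heap_percent) / 100 ))
}

max_heap_mib 2048   # 2Gi limit, 25% reserved: prints 1536
```

The -Xmx1500m in the example sits just under that 1536 MiB ceiling; pass a larger second argument if you need to reserve more for Metaspace.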
Enterprise with MCP Proxy
Deploy the gateway with the MCP Proxy for agent tool governance:
# enterprise-mcp-values.yaml
gatewayServer:
replicaCount: 3
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
mcpProxyServer:
enabled: true
replicaCount: 2
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 20
networkPolicy:
enabled: true
pdb:
enabled: true
secrets:
providerKeys:
openai: sk-...
enterpriseLicenseKey: DVARA-... # signed DVARA license key
helm install dvara charts/meridian/ -f enterprise-mcp-values.yaml
Access the MCP Proxy via port-forward (internal-only, not exposed via Ingress):
kubectl port-forward svc/dvara-mcp-gateway 8070:8070
curl http://localhost:8070/actuator/health
Inline Gateway Configuration
Pass a gateway.yaml configuration directly via Helm values:
gatewayServer:
gatewayConfig:
routing:
default-strategy: round-robin
rate-limit:
enabled: true
per-key:
requests-per-minute: 100
This creates a ConfigMap mounted into the gateway pod and applied as Spring Boot externalized configuration.
Using External Secrets
If you manage secrets with External Secrets Operator, Sealed Secrets, or a vault:
secrets:
create: false
existingSecret: my-external-secret
The existing Secret must contain the same keys: openai-api-key, anthropic-api-key, gemini-api-key, aws-access-key-id, aws-secret-access-key, ollama-enabled, ollama-base-url, bedrock-enabled, mock-provider-enabled, gateway-internal-secret, gateway-encryption-master-password, enterprise-license-key (signed).
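A quick way to compare an externally managed Secret against that list is to diff its key names. The function below is a sketch: `missing_secret_keys` is a hypothetical helper that reads key names from stdin and prints the ones from the list above that are absent. Since secret key references are optional, a missing key is something to review, not necessarily an error.

```shell
# Hypothetical helper: read the key names present in a Secret (one per
# line on stdin) and print any key from the documented list that is absent.
missing_secret_keys() {
  present="$(cat)"
  for key in openai-api-key anthropic-api-key gemini-api-key \
             aws-access-key-id aws-secret-access-key ollama-enabled \
             ollama-base-url bedrock-enabled mock-provider-enabled \
             gateway-internal-secret gateway-encryption-master-password \
             enterprise-license-key; do
    printf '%s\n' "$present" | grep -qxF -- "$key" || echo "$key"
  done
}

# Usage, assuming kubectl access to the namespace:
#   kubectl get secret my-external-secret \
#     -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}' \
#     | missing_secret_keys
```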
AWS Bedrock with IRSA
gatewayServer:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/dvara-bedrock
secrets:
bedrockEnabled: "true"
Multi-Region Deployment
Deploy an instance in a specific region:
# us-east-values.yaml
gatewayServer:
gatewayMode: full
region:
id: us-east-1
name: US East
secrets:
providerKeys:
openai: sk-...
helm install dvara-us-east charts/meridian/ -f us-east-values.yaml
When region.id is set, the chart adds a meridian.ai/region pod label for topology-aware scheduling and is meant to set the DVARA_REGION_ID and DVARA_REGION_NAME environment variables (which Spring relaxed-binds to the dvara.region.id / dvara.region.name properties on the RegionContext bean).
Known issue: the chart's gateway-server/deployment.yaml currently emits the variables as DVARA_LLM_GATEWAY_REGION_ID / DVARA_LLM_GATEWAY_REGION_NAME instead. Spring relaxed binding does not match those names against the dvara.region.* property prefix, so the values are set but never read. This is tracked as part of #868. If you set region.id via the chart and see unexpected region-aware routing behavior, pass DVARA_REGION_ID / DVARA_REGION_NAME through gatewayServer.extraEnv as a workaround until the chart is fixed.
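The extraEnv workaround can be expressed as a small values fragment. The sketch below prints one; `region_extra_env_yaml` is a hypothetical helper name. Helm replaces lists instead of merging them, so any other extraEnv entries you rely on (for example DVARA_AUDIT_HMAC_SECRET) need to be repeated in the same list.

```shell
# Hypothetical helper: print a values fragment that passes the region
# variables through gatewayServer.extraEnv.
region_extra_env_yaml() {
  region_id="$1"; region_name="$2"
  cat <<EOF
gatewayServer:
  extraEnv:
    - name: DVARA_REGION_ID
      value: "${region_id}"
    - name: DVARA_REGION_NAME
      value: "${region_name}"
EOF
}

region_extra_env_yaml us-east-1 "US East"
```

Redirect the output to a file and pass it with a second -f after us-east-values.yaml so that it takes precedence.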
Prometheus Monitoring
gatewayServer:
serviceMonitor:
enabled: true
interval: 15s
additionalLabels:
release: prometheus-stack
The ServiceMonitor scrapes /actuator/prometheus. Requires the Prometheus Operator CRDs to be installed and a bearerTokenFile (or Helm-managed equivalent) pointing at the DVARA_ACTUATOR_METRICS_API_KEY value — see the ServiceMonitor parameter table above.
Security
The chart applies security hardening by default:
- Non-root execution: pods run as UID 1001 (runAsNonRoot: true)
- Read-only filesystem: readOnlyRootFilesystem: true with a /tmp emptyDir for JVM temp files
- No privilege escalation: allowPrivilegeEscalation: false
- Capabilities dropped: all Linux capabilities dropped
- No service account token: automountServiceAccountToken: false (no K8s API access needed)
- Secret key refs optional: pods start even if only some provider keys are configured
Upgrading
# From OCI registry
helm upgrade dvara oci://ghcr.io/dvarahq/dvara/charts/meridian -f my-values.yaml
# From local chart
helm upgrade dvara charts/meridian/ -f my-values.yaml
Pods automatically restart when secrets or ConfigMap content changes (via checksum annotations on the pod template).
Running Helm Tests
helm test dvara
This runs test pods that verify gateway-server, gateway-ui, and (if enabled) mcp-proxy-server services are reachable.
Uninstalling
helm uninstall dvara
Troubleshooting
Pods stuck in CrashLoopBackOff
Check logs for JVM startup errors:
kubectl logs deployment/dvara-server
Common causes:
- Insufficient memory: increase resources.limits.memory
- Missing secret keys: verify the Secret exists:
kubectl get secret dvara -o yaml
Startup probe fails
The startup probe allows roughly 65 seconds (5s initial delay + 12 retries × 5s) for JVM warmup. If your image is large or the node is slow, raise the probe's failureThreshold:
gatewayServer:
startupProbe:
failureThreshold: 20
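To pick a threshold, it helps to see how much time each setting buys. The sketch below applies the usual approximation (initial delay plus failureThreshold attempts spaced one period apart) to the figures quoted above; `startup_budget_seconds` is a hypothetical helper name.

```shell
# Approximate time a startup probe allows before the container is
# restarted: initial delay + failureThreshold attempts, one per period.
startup_budget_seconds() {
  initial_delay="$1"; failure_threshold="$2"; period="$3"
  echo $(( initial_delay + failure_threshold * period ))
}

startup_budget_seconds 5 12 5   # values quoted above: prints 65
startup_budget_seconds 5 20 5   # failureThreshold raised to 20: prints 105
```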
Services not reachable
# Check pod status
kubectl get pods -l app.kubernetes.io/component=gateway-server
# Check service endpoints
kubectl get endpoints dvara-server
# Port-forward to test directly
kubectl port-forward svc/dvara-server 8080:8080
curl http://localhost:8080/actuator/health
Providers not registering
Provider keys must be non-empty strings. Check the Secret:
kubectl get secret dvara -o jsonpath='{.data.openai-api-key}' | base64 -d
Empty string = provider disabled (this is expected for unused providers).