Kubernetes Deployment

Deploy DVARA on Kubernetes with the official Helm chart.

Prerequisites

  • Kubernetes 1.28+
  • Helm 3.x
  • kubectl configured for your cluster
  • A reachable PostgreSQL 14+ instance (the chart does NOT bundle PostgreSQL — see Common Deployment Patterns below for how to wire it)
  • A DVARA license envelope (DVARA-… prefix). The gateway refuses to start without one; there is no operator-flippable bypass.

Quick Start

Mint the required secrets first — every DVARA install needs four chart-managed secrets plus the audit HMAC (passed via extraEnv because the chart doesn't wire it yet):

# 4 chart-managed secrets
export DVARA_LICENSE_KEY="DVARA-…" # from your trial / contract email
export ACTUATOR_API_KEY="$(openssl rand -base64 32)" # operator Bearer for /actuator/gateway-status
export METRICS_API_KEY="$(openssl rand -base64 32)" # DISTINCT Bearer for /actuator/prometheus
export ENCRYPTION_PASSWORD="$(openssl rand -base64 32)" # AES-256-GCM key for ENCRYPTED-mode credentials

# Audit HMAC (not chart-managed yet; passed via extraEnv)
export AUDIT_HMAC="$(openssl rand -base64 32)" # signs audit-chain envelopes

Pre-built images and the Helm chart are published to GitHub Container Registry:

helm install dvara oci://ghcr.io/dvarahq/dvara/charts/meridian \
--set secrets.enterpriseLicenseKey="$DVARA_LICENSE_KEY" \
--set secrets.gatewayServerApiKey="$ACTUATOR_API_KEY" \
--set secrets.gatewayMetricsApiKey="$METRICS_API_KEY" \
--set secrets.gatewayEncryptionMasterPassword="$ENCRYPTION_PASSWORD" \
--set-string "gatewayServer.extraEnv[0].name=DVARA_AUDIT_HMAC_SECRET" \
--set-string "gatewayServer.extraEnv[0].value=$AUDIT_HMAC" \
--set secrets.mockProviderEnabled=true

# Wait for pods to be ready
kubectl rollout status deployment/dvara-server
kubectl rollout status deployment/dvara-ui

# Verify
kubectl port-forward svc/dvara-server 8080:8080 &
curl http://localhost:8080/actuator/health

From Local Chart

Same secret arguments, pointing at a local chart directory:

helm install dvara charts/meridian/ \
--set secrets.enterpriseLicenseKey="$DVARA_LICENSE_KEY" \
--set secrets.gatewayServerApiKey="$ACTUATOR_API_KEY" \
--set secrets.gatewayMetricsApiKey="$METRICS_API_KEY" \
--set secrets.gatewayEncryptionMasterPassword="$ENCRYPTION_PASSWORD" \
--set-string "gatewayServer.extraEnv[0].name=DVARA_AUDIT_HMAC_SECRET" \
--set-string "gatewayServer.extraEnv[0].value=$AUDIT_HMAC" \
--set secrets.mockProviderEnabled=true

For any deployment past initial smoke-testing, move these into a values file (or an externally-managed Secret via secrets.existingSecret) rather than passing on the command line.
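As a minimal sketch of that advice, the Quick Start secrets could be written to a values file along these lines. It assumes the five environment variables from the Quick Start are exported. The function name write_dvara_values and the file name dvara-secrets.yaml are hypothetical, and the YAML keys simply mirror the --set paths shown above.

```shell
# Sketch: write the Quick Start secrets to a Helm values file.
# Assumes DVARA_LICENSE_KEY, ACTUATOR_API_KEY, METRICS_API_KEY,
# ENCRYPTION_PASSWORD and AUDIT_HMAC are already exported.
# write_dvara_values is a hypothetical helper, not part of the chart.
write_dvara_values() {
  out="$1"
  (
    umask 077   # a newly created file is readable by the current user only
    cat > "$out" <<EOF
secrets:
  enterpriseLicenseKey: "${DVARA_LICENSE_KEY}"
  gatewayServerApiKey: "${ACTUATOR_API_KEY}"
  gatewayMetricsApiKey: "${METRICS_API_KEY}"
  gatewayEncryptionMasterPassword: "${ENCRYPTION_PASSWORD}"
gatewayServer:
  extraEnv:
    - name: DVARA_AUDIT_HMAC_SECRET
      value: "${AUDIT_HMAC}"
EOF
  )
}

# Usage (not run here):
#   write_dvara_values dvara-secrets.yaml
#   helm install dvara charts/meridian/ -f dvara-secrets.yaml
```

The values are double-quoted so base64 characters such as + / = stay literal strings. A value containing a double quote or backslash would need escaping first. Keep the resulting file out of version control.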

All four chart-managed secrets are required at boot

The gateway pod fails startup if enterpriseLicenseKey is unset (LicenseValidationException: No license key configured). The actuator chain returns 401 on every authenticated endpoint if the two API keys are unset. ENCRYPTED-mode credential persistence fails if the encryption master password is unset. The audit chain refuses to write on a production-class profile if the HMAC secret is unset or carries the default placeholder.
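These failure modes can be caught before running helm install with a small preflight check. The sketch below assumes the variable names from the Quick Start; check_dvara_secrets is a hypothetical helper, and the checks only mirror rules stated in this page (all five values set, the DVARA- prefix, and distinct actuator and metrics keys). It cannot validate the license signature.

```shell
# Sketch: fail fast if a required secret is missing or misconfigured.
# check_dvara_secrets is a hypothetical helper, not part of the chart.
check_dvara_secrets() {
  rc=0
  for name in DVARA_LICENSE_KEY ACTUATOR_API_KEY METRICS_API_KEY \
              ENCRYPTION_PASSWORD AUDIT_HMAC; do
    eval "value=\${$name:-}"          # indirect lookup, POSIX sh compatible
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      rc=1
    fi
  done
  # The license envelope is documented as carrying a DVARA- prefix.
  case "${DVARA_LICENSE_KEY:-}" in
    DVARA-*) ;;
    *) echo "DVARA_LICENSE_KEY does not start with DVARA-" >&2; rc=1 ;;
  esac
  # The metrics key must differ from the operator key.
  if [ -n "${ACTUATOR_API_KEY:-}" ] && \
     [ "${ACTUATOR_API_KEY:-}" = "${METRICS_API_KEY:-}" ]; then
    echo "ACTUATOR_API_KEY and METRICS_API_KEY must differ" >&2
    rc=1
  fi
  return "$rc"
}

# Usage (not run here):
#   check_dvara_secrets && helm install dvara charts/meridian/ ...
```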

Installing with Provider Keys

helm install dvara charts/meridian/ \
--set secrets.providerKeys.openai=sk-... \
--set secrets.providerKeys.anthropic=sk-ant-...

Or use a values file:

# my-values.yaml
secrets:
  providerKeys:
    openai: sk-...
    anthropic: sk-ant-...
    gemini: AIza...

helm install dvara charts/meridian/ -f my-values.yaml

Configuration Reference

Gateway Server

Parameter | Description | Default
gatewayServer.enabled | Enable gateway-server | true
gatewayServer.replicaCount | Replicas (ignored when HPA enabled) | 1
gatewayServer.image.repository | Image repository | ghcr.io/dvarahq/dvara/dvara-llm-gateway
gatewayServer.image.tag | Image tag (defaults to chart appVersion) | ""
gatewayServer.gatewayMode | Operating mode: standalone or full | full
gatewayServer.region.id | Region identity for multi-region deployments | ""
gatewayServer.region.name | Human-readable region name | ""
gatewayServer.javaOpts | JVM options | ""
gatewayServer.resources.requests.cpu | CPU request | 250m
gatewayServer.resources.requests.memory | Memory request | 512Mi
gatewayServer.resources.limits.cpu | CPU limit | 2
gatewayServer.resources.limits.memory | Memory limit | 1Gi
gatewayServer.service.type | Service type | ClusterIP
gatewayServer.service.port | Service port | 8080

Gateway UI (DVARA Flightdeck)

The Helm chart's gatewayUi.* parameter family configures the DVARA Flightdeck pod (Console + tenant Portal + Automation API). The parameter prefix is a holdover from when the product was simply called "Gateway UI"; the deployed image is ghcr.io/dvarahq/dvara/dvara-flightdeck.

Parameter | Description | Default
gatewayUi.enabled | Enable gateway-ui | true
gatewayUi.replicaCount | Number of replicas | 1
gatewayUi.image.repository | Image repository | ghcr.io/dvarahq/dvara/dvara-flightdeck
gatewayUi.gatewayServerUrl | Override auto-discovered server URL | "" (auto)
gatewayUi.resources.requests.cpu | CPU request | 100m
gatewayUi.resources.requests.memory | Memory request | 256Mi
gatewayUi.service.type | Service type | ClusterIP
gatewayUi.service.port | Service port | 8090

Gateway UI Health Probes:

Probe | Path | Purpose
Liveness | /actuator/health/liveness | Basic liveness check
Readiness | /actuator/health/readiness | Includes controlPlane check (gateway-server connectivity)
Startup | /actuator/health/liveness | Allows JVM warmup before liveness kicks in

All four probe paths (/actuator/health, /actuator/health/liveness, /actuator/health/readiness, /actuator/info) are anonymous by design so k8s probes work without secrets. Do not set management.endpoint.health.show-details=always — the gateway refuses to start in that mode so anonymous callers cannot read per-indicator JSON like cache-cluster state or database pool internals. Leave the default when-authorized.

Secrets

Parameter | Description | Default
secrets.create | Create the Secret resource | true
secrets.existingSecret | Use an existing Secret instead | ""
secrets.providerKeys.openai | OpenAI API key | ""
secrets.providerKeys.anthropic | Anthropic API key | ""
secrets.providerKeys.gemini | Gemini API key | ""
secrets.providerKeys.awsAccessKeyId | AWS access key (Bedrock) | ""
secrets.providerKeys.awsSecretAccessKey | AWS secret key (Bedrock) | ""
secrets.ollamaEnabled | Enable Ollama provider | ""
secrets.ollamaBaseUrl | Ollama base URL | ""
secrets.bedrockEnabled | Enable Bedrock provider | ""
secrets.mockProviderEnabled | Enable mock provider | ""
secrets.gatewayInternalSecret | Shared secret for /internal/* | ""
secrets.gatewayEncryptionMasterPassword | AES-256-GCM master password for ENC: values + ENCRYPTED-mode provider credentials | ""
secrets.gatewayServerApiKey | Operator Bearer for /actuator/gateway-status + every authenticated /actuator/* path EXCEPT prometheus. Generate with openssl rand -base64 32. Required: every authenticated actuator probe 401s without it. | ""
secrets.gatewayMetricsApiKey | Distinct Bearer for /actuator/prometheus only. Must differ from gatewayServerApiKey (principle of least privilege: a leaked scrape token must not unlock the license envelope). Required. | ""
secrets.enterpriseLicenseKey | DVARA license envelope (DVARA-… prefix, Ed25519-signed). Required at startup for every DVARA process (gateway-server, flightdeck, mcp-proxy-server); LicenseEnvironmentPostProcessor refuses to boot without it, and there is no operator-flippable bypass. | ""

Audit HMAC is not yet a chart-managed secret

DVARA_AUDIT_HMAC_SECRET signs audit-chain envelopes and is required on any production-class Spring profile, but the chart does not wire it through secrets.* yet. Pass it via gatewayServer.extraEnv (see the Quick Start example above) or through secrets.existingSecret with the right key.

MCP Proxy Server

Parameter | Description | Default
mcpProxyServer.enabled | Enable MCP Proxy (requires enterprise license) | false
mcpProxyServer.replicaCount | Replicas (ignored when HPA enabled) | 2
mcpProxyServer.image.repository | Image repository | ghcr.io/dvarahq/dvara/dvara-mcp-gateway
mcpProxyServer.image.tag | Image tag (defaults to chart appVersion) | ""
mcpProxyServer.javaOpts | JVM options | ""
mcpProxyServer.gatewayServerUrl | Override auto-discovered server URL | "" (auto)
mcpProxyServer.resources.requests.cpu | CPU request | 250m
mcpProxyServer.resources.requests.memory | Memory request | 512Mi
mcpProxyServer.resources.limits.cpu | CPU limit | 2
mcpProxyServer.resources.limits.memory | Memory limit | 1Gi
mcpProxyServer.service.type | Service type (internal-only) | ClusterIP
mcpProxyServer.service.port | Service port | 8070
mcpProxyServer.autoscaling.enabled | Enable HPA | false
mcpProxyServer.autoscaling.minReplicas | Minimum replicas | 2
mcpProxyServer.autoscaling.maxReplicas | Maximum replicas | 20
mcpProxyServer.pdb.enabled | Enable PDB | false
mcpProxyServer.pdb.minAvailable | Min available pods | 1
mcpProxyServer.networkPolicy.enabled | Enable NetworkPolicy (deny external, allow gateway-server) | false
mcpProxyServer.serviceMonitor.enabled | Enable Prometheus ServiceMonitor | false

The MCP Proxy is deployed as an internal-only ClusterIP service. It should be accessed by the gateway-server, not exposed externally. When networkPolicy.enabled is true, only pods matching the gateway-server selector labels can reach the MCP Proxy.

Ingress

Parameter | Description | Default
ingress.enabled | Enable Ingress | false
ingress.className | Ingress class (nginx, traefik, alb) | ""
ingress.annotations | Ingress annotations | {}
ingress.gatewayServer.hosts | Server host/path rules | [{host: gateway.example.com}]
ingress.gatewayServer.tls | Server TLS config | []
ingress.gatewayUi.hosts | UI host/path rules | [{host: admin.example.com}]
ingress.gatewayUi.tls | UI TLS config | []

Graceful Shutdown & Rolling Updates

Parameter | Description | Default
gatewayServer.terminationGracePeriodSeconds | Pod termination grace period (must exceed preStop + drain) | 45
gatewayServer.preStopSleepSeconds | Sleep before SIGTERM (endpoint de-registration propagation) | 5
gatewayServer.rollingUpdate.maxSurge | Max extra pods during rolling update | 1
gatewayServer.rollingUpdate.maxUnavailable | Max unavailable pods during rolling update (0 = zero-downtime) | 0
gatewayServer.topologySpreadConstraints | Topology spread for cross-zone scheduling | []

The default configuration ensures zero-downtime rolling updates: maxSurge: 1 creates one new pod before terminating old ones, and maxUnavailable: 0 keeps the full desired replica count ready throughout the rollout. The preStopSleepSeconds delay allows Kubernetes endpoint propagation to complete before the application receives SIGTERM and begins its 30-second graceful drain.
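The "must exceed preStop + drain" rule is simple arithmetic, and it is easy to break when tuning one of the three values in isolation. The sketch below checks it with the defaults from this section (45s grace, 5s preStop, 30s drain). grace_period_ok is a hypothetical helper name.

```shell
# Sketch: verify terminationGracePeriodSeconds > preStopSleepSeconds + drain.
# grace_period_ok is a hypothetical helper, not part of the chart.
grace_period_ok() {
  grace="$1"; pre_stop="$2"; drain="$3"
  needed=$((pre_stop + drain))
  if [ "$grace" -gt "$needed" ]; then
    echo "ok: ${grace}s grace > ${needed}s (preStop + drain)"
  else
    echo "too short: ${grace}s grace <= ${needed}s (preStop + drain)"
    return 1
  fi
}

# Chart defaults: 45s grace, 5s preStop sleep, 30s graceful drain.
grace_period_ok 45 5 30   # prints "ok: 45s grace > 35s (preStop + drain)"
```

With the defaults there are 10 seconds of headroom. If you raise preStopSleepSeconds, raise terminationGracePeriodSeconds by at least the same amount.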

Autoscaling (HPA)

Parameter | Description | Default
gatewayServer.autoscaling.enabled | Enable HPA | false
gatewayServer.autoscaling.minReplicas | Minimum replicas | 2
gatewayServer.autoscaling.maxReplicas | Maximum replicas | 10
gatewayServer.autoscaling.targetCPUUtilizationPercentage | CPU target | 70
gatewayServer.autoscaling.targetMemoryUtilizationPercentage | Memory target | 80
gatewayServer.autoscaling.behavior.scaleUp.stabilizationWindowSeconds | Wait before scaling up | 30
gatewayServer.autoscaling.behavior.scaleDown.stabilizationWindowSeconds | Wait before scaling down | 300

The default HPA behavior scales up quickly (50% per minute after 30s stabilization) but scales down conservatively (25% per 2 minutes after 5-minute stabilization) to prevent flapping.
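To get a feel for what "50% per minute" means in practice, the sketch below walks the fastest possible scale-up path from the default minReplicas (2) to maxReplicas (10). It assumes a percent-based scale-up policy whose per-period limit rounds up, and it ignores the stabilization window and the actual metric values, so treat the result as an upper bound on speed rather than a prediction. scale_up_steps is a hypothetical helper name.

```shell
# Sketch: replica counts after each policy period under a percent-based
# scale-up limit. Assumes the limit rounds up (ceil) each period.
# scale_up_steps is a hypothetical helper, not part of the chart.
scale_up_steps() {
  cur="$1"; max="$2"; pct="$3"
  if [ "$pct" -le 0 ]; then
    echo "pct must be positive" >&2
    return 1
  fi
  steps="$cur"
  while [ "$cur" -lt "$max" ]; do
    # Integer ceil of cur * (100 + pct) / 100
    cur=$(( (cur * (100 + pct) + 99) / 100 ))
    if [ "$cur" -gt "$max" ]; then
      cur="$max"
    fi
    steps="$steps $cur"
  done
  echo "$steps"
}

# Defaults: minReplicas 2, maxReplicas 10, 50% per period.
scale_up_steps 2 10 50   # prints "2 3 5 8 10"
```

Under these assumptions, a sustained spike takes four one-minute periods to go from 2 to 10 replicas. If that is too slow for your traffic profile, consider a higher minReplicas.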

Pod Disruption Budget

Parameter | Description | Default
gatewayServer.pdb.enabled | Enable PDB | false
gatewayServer.pdb.minAvailable | Min available pods | 1
gatewayServer.pdb.maxUnavailable | Max unavailable pods | ""

Prometheus ServiceMonitor

Parameter | Description | Default
gatewayServer.serviceMonitor.enabled | Enable (requires Prometheus Operator) | false
gatewayServer.serviceMonitor.interval | Scrape interval | 30s
gatewayServer.serviceMonitor.path | Metrics path | /actuator/prometheus
gatewayServer.serviceMonitor.additionalLabels | Labels for monitor selection | {}

When gatewayServer.serviceMonitor.enabled=true, configure the ServiceMonitor's bearerTokenFile (or the Helm equivalent) to point at a file containing the DVARA_ACTUATOR_METRICS_API_KEY value. /actuator/prometheus is authenticated — without the token, every scrape returns 401 and the time series goes dark. The metrics secret is intentionally distinct from DVARA_ACTUATOR_API_KEY so a leaked scrape token can't unlock the rich gateway status surface.
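Before wiring the token into Prometheus, it can help to confirm by hand that the metrics key is accepted. The sketch below interprets the HTTP status of a manual scrape. It assumes the token is sent as a standard Authorization: Bearer header and that the gateway is port-forwarded to localhost:8080 as in the Quick Start. explain_scrape_status is a hypothetical helper name.

```shell
# Sketch: interpret the HTTP status returned by /actuator/prometheus.
# explain_scrape_status is a hypothetical helper, not part of the chart.
explain_scrape_status() {
  case "$1" in
    200) echo "ok: token accepted" ;;
    401) echo "unauthorized: missing or wrong metrics token" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Usage against a port-forwarded gateway (not run here):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "Authorization: Bearer $METRICS_API_KEY" \
#     http://localhost:8080/actuator/prometheus)
#   explain_scrape_status "$code"
```

A 401 with the operator key and a 200 with the metrics key would confirm that the two keys are scoped as described.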

Clustering on Kubernetes

DVARA LLM Gateway instances share rate-limit counters and API key lookups across the fleet. On Kubernetes, pods must discover each other via a headless Service — multicast is unavailable in most clusters.

When KUBERNETES_NAMESPACE is set (the downward API auto-injects this), the gateway requires CACHE_SERVICE_NAME to point at a headless Service fronting the gateway pods. Without it, startup fails with:

Kubernetes clustering requires CACHE_SERVICE_NAME when
KUBERNETES_NAMESPACE is set. Without it, pods cannot form a cluster and
rate limit state will not be shared.

The official Helm chart wires both variables automatically when you deploy multiple gateway replicas — you don't need the manual YAML below unless you're bypassing the chart. Outside Kubernetes (local docker compose, bare metal), gateway instances auto-discover each other via multicast and CACHE_SERVICE_NAME is not required.

Headless Service pattern (reference)

apiVersion: v1
kind: Service
metadata:
  name: dvara-server-cluster
  labels:
    app.kubernetes.io/name: dvara-server
spec:
  clusterIP: None # headless: each pod gets a DNS A record
  publishNotReadyAddresses: true
  selector:
    app.kubernetes.io/name: dvara-server
  ports:
    - name: cluster
      port: 5701
      targetPort: 5701

Deployment environment variables (reference)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dvara-server
spec:
  template:
    spec:
      containers:
        - name: gateway-server
          env:
            - name: KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CACHE_SERVICE_NAME
              value: dvara-server-cluster
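The startup rule described in this section can be mirrored in a few lines of shell, which is handy when debugging a hand-written Deployment. The sketch below reproduces only the documented condition (namespace set, service name missing) and prints the DNS name pods would resolve. It assumes the default cluster.local cluster domain. check_cluster_env is a hypothetical helper name and is not the gateway's own code.

```shell
# Sketch: mirror the documented clustering precondition.
# check_cluster_env is a hypothetical helper, not part of the gateway.
check_cluster_env() {
  if [ -n "${KUBERNETES_NAMESPACE:-}" ] && [ -z "${CACHE_SERVICE_NAME:-}" ]; then
    echo "CACHE_SERVICE_NAME is required when KUBERNETES_NAMESPACE is set" >&2
    return 1
  fi
  if [ -n "${KUBERNETES_NAMESPACE:-}" ]; then
    # Headless Service DNS name, assuming the default cluster domain.
    echo "${CACHE_SERVICE_NAME}.${KUBERNETES_NAMESPACE}.svc.cluster.local"
  else
    # Outside Kubernetes the gateway falls back to multicast discovery.
    echo "multicast"
  fi
}

# Usage inside a gateway pod (not run here):
#   check_cluster_env
```

Resolving the printed name from inside a pod should return one A record per gateway pod. A single record or none suggests the selector on the headless Service does not match the pods.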

Common Deployment Patterns

Minimal Mode (Testing)

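PostgreSQL is required even for minimal deployments; there is no in-memory fallback. For quick testing, point the chart at an external Postgres or deploy a small Postgres StatefulSet alongside the gateway. The command below shows only the mock-provider and database settings; a real install also needs the required secret flags from the Quick Start: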

helm install dvara charts/meridian/ \
--set secrets.mockProviderEnabled=true \
--set gatewayServer.env.SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/dvara \
--set gatewayServer.env.SPRING_DATASOURCE_USERNAME=dvara \
--set gatewayServer.env.SPRING_DATASOURCE_PASSWORD=dvara

Production with OpenAI

# production-values.yaml
gatewayServer:
  replicaCount: 3
  gatewayMode: standalone
  # Match heap to container limits below. Leave ~25% of the limit for
  # off-heap (Metaspace, native code, JIT, kernel buffers). With memory
  # limit = 2Gi, -Xmx1500m is a safe upper bound; reserve more if you
  # see Metaspace pressure in JFR.
  javaOpts: "-Xms1g -Xmx1500m"
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 2Gi
  terminationGracePeriodSeconds: 45
  preStopSleepSeconds: 5
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
  pdb:
    enabled: true
    minAvailable: 2

secrets:
  providerKeys:
    openai: sk-...

ingress:
  enabled: true
  className: nginx
  gatewayServer:
    hosts:
      - host: gateway.mycompany.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: gateway-tls
        hosts:
          - gateway.mycompany.com

helm install dvara charts/meridian/ -f production-values.yaml

Enterprise with MCP Proxy

Deploy the gateway with the MCP Proxy for agent tool governance:

# enterprise-mcp-values.yaml
gatewayServer:
  replicaCount: 3
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10

mcpProxyServer:
  enabled: true
  replicaCount: 2
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
  networkPolicy:
    enabled: true
  pdb:
    enabled: true

secrets:
  providerKeys:
    openai: sk-...
  enterpriseLicenseKey: DVARA-... # signed DVARA license key

helm install dvara charts/meridian/ -f enterprise-mcp-values.yaml

Access the MCP Proxy via port-forward (internal-only, not exposed via Ingress):

kubectl port-forward svc/dvara-mcp-gateway 8070:8070
curl http://localhost:8070/actuator/health

Inline Gateway Configuration

Pass a gateway.yaml configuration directly via Helm values:

gatewayServer:
  gatewayConfig:
    routing:
      default-strategy: round-robin
    rate-limit:
      enabled: true
      per-key:
        requests-per-minute: 100

This creates a ConfigMap mounted into the gateway pod and applied as Spring Boot externalized configuration.

Using External Secrets

If you manage secrets with External Secrets Operator, Sealed Secrets, or a vault:

secrets:
  create: false
  existingSecret: my-external-secret

The existing Secret must contain the same keys: openai-api-key, anthropic-api-key, gemini-api-key, aws-access-key-id, aws-secret-access-key, ollama-enabled, ollama-base-url, bedrock-enabled, mock-provider-enabled, gateway-internal-secret, gateway-encryption-master-password, enterprise-license-key (signed).
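A quick way to check an externally managed Secret against that list is to compare key names. The sketch below reads key names on stdin, one per line, and prints any that are absent. The key list is copied from the paragraph above; missing_secret_keys is a hypothetical helper name. Because the chart marks secret key refs as optional, a missing provider key is not necessarily fatal, so treat the output as a checklist rather than a hard failure.

```shell
# Sketch: report expected keys that are absent from an existing Secret.
# Key names come from the list above; missing_secret_keys is hypothetical.
REQUIRED_KEYS="openai-api-key anthropic-api-key gemini-api-key
aws-access-key-id aws-secret-access-key ollama-enabled ollama-base-url
bedrock-enabled mock-provider-enabled gateway-internal-secret
gateway-encryption-master-password enterprise-license-key"

missing_secret_keys() {
  present=$(cat)   # key names, one per line, from stdin
  for key in $REQUIRED_KEYS; do
    # -x matches the whole line, -F treats the key as a fixed string
    printf '%s\n' "$present" | grep -qxF -- "$key" || echo "$key"
  done
}

# Usage (not run here; requires kubectl and jq):
#   kubectl get secret my-external-secret -o json \
#     | jq -r '.data | keys[]' | missing_secret_keys
```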

AWS Bedrock with IRSA

gatewayServer:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/dvara-bedrock

secrets:
  bedrockEnabled: "true"

Multi-Region Deployment

Deploy an instance in a specific region:

# us-east-values.yaml
gatewayServer:
  gatewayMode: full
  region:
    id: us-east-1
    name: US East

secrets:
  providerKeys:
    openai: sk-...

helm install dvara-us-east charts/meridian/ -f us-east-values.yaml

When region.id is set, the chart adds a meridian.ai/region pod label for topology-aware scheduling and is meant to add DVARA_REGION_ID and DVARA_REGION_NAME environment variables (which Spring relaxed-binds to the dvara.region.id / dvara.region.name properties on the RegionContext bean). The current chart emits differently named variables; see Chart-side drift below.

Chart-side drift

The chart's gateway-server/deployment.yaml currently emits these as DVARA_LLM_GATEWAY_REGION_ID / DVARA_LLM_GATEWAY_REGION_NAME — Spring relaxed binding doesn't match those names against the dvara.region.* property prefix, so the values are set but never read. Tracked as part of #868. If you're hitting region-aware routing surprises and you set region.id via the chart, pass DVARA_REGION_ID / DVARA_REGION_NAME through gatewayServer.extraEnv as a workaround until the chart is fixed.

Prometheus Monitoring

gatewayServer:
  serviceMonitor:
    enabled: true
    interval: 15s
    additionalLabels:
      release: prometheus-stack

The ServiceMonitor scrapes /actuator/prometheus. Requires the Prometheus Operator CRDs to be installed and a bearerTokenFile (or Helm-managed equivalent) pointing at the DVARA_ACTUATOR_METRICS_API_KEY value — see the ServiceMonitor parameter table above.

Security

The chart applies security hardening by default:

  • Non-root execution - Pods run as UID 1001 (runAsNonRoot: true)
  • Read-only filesystem - readOnlyRootFilesystem: true with a /tmp emptyDir for JVM temp files
  • No privilege escalation - allowPrivilegeEscalation: false
  • Capabilities dropped - All Linux capabilities dropped
  • No service account token - automountServiceAccountToken: false (no K8s API access needed)
  • Secret key refs optional - Pods start even if only some provider keys are configured

Upgrading

# From OCI registry
helm upgrade dvara oci://ghcr.io/dvarahq/dvara/charts/meridian -f my-values.yaml

# From local chart
helm upgrade dvara charts/meridian/ -f my-values.yaml

Pods automatically restart when secrets or ConfigMap content changes (via checksum annotations on the pod template).

Running Helm Tests

helm test dvara

This runs test pods that verify gateway-server, gateway-ui, and (if enabled) mcp-proxy-server services are reachable.

Uninstalling

helm uninstall dvara

Troubleshooting

Pods stuck in CrashLoopBackOff

Check logs for JVM startup errors:

kubectl logs deployment/dvara-server

Common causes:

  • Insufficient memory — increase resources.limits.memory
  • Missing secret keys — verify the Secret exists: kubectl get secret dvara -o yaml

Startup probe fails

The startup probe allows 60 seconds (5s initial + 12 retries x 5s) for JVM warmup. If your image is large or the node is slow, increase the startup probe:

gatewayServer:
  startupProbe:
    failureThreshold: 20
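To estimate how much warmup time a given failureThreshold buys, one reading of the 60-second figure is: the first probe fires after the initial delay, then one probe per period, and the container is restarted on the Nth consecutive failure. The sketch below uses that model with the 5s initial delay and 5s period mentioned above. It ignores probe timeouts and kubelet jitter, so the result is approximate. startup_budget is a hypothetical helper name.

```shell
# Sketch: approximate seconds until the Nth consecutive startup probe failure.
# Model: first probe at initialDelay, then one probe every period.
# startup_budget is a hypothetical helper, not part of the chart.
startup_budget() {
  initial="$1"; period="$2"; threshold="$3"
  echo $(( initial + (threshold - 1) * period ))
}

startup_budget 5 5 12   # default threshold: prints 60
startup_budget 5 5 20   # raised threshold:  prints 100
```

Under this model, raising failureThreshold from 12 to 20 extends the warmup allowance from roughly 60 to roughly 100 seconds.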

Services not reachable

# Check pod status
kubectl get pods -l app.kubernetes.io/component=gateway-server

# Check service endpoints
kubectl get endpoints dvara-server

# Port-forward to test directly
kubectl port-forward svc/dvara-server 8080:8080
curl http://localhost:8080/actuator/health

Providers not registering

Provider keys must be non-empty strings. Check the Secret:

kubectl get secret dvara -o jsonpath='{.data.openai-api-key}' | base64 -d

Empty string = provider disabled (this is expected for unused providers).