Kubernetes Deployment

Deploy DVARA on Kubernetes with the official Helm chart.

Prerequisites

Kubernetes 1.28+
Helm 3.x
kubectl configured for your cluster
A reachable PostgreSQL 14+ instance (the chart does NOT bundle PostgreSQL — see Common Deployment Patterns below for how to wire it)
A DVARA license envelope (DVARA-… prefix). The gateway refuses to start without one; there is no operator-flippable bypass.

Quick Start

Mint the required secrets first — every DVARA install needs four chart-managed secrets plus the audit HMAC (passed via extraEnv because the chart doesn't wire it yet):

# 4 chart-managed secrets, then the audit HMAC (passed via extraEnv)
export DVARA_LICENSE_KEY="DVARA-…"                                # from your trial / contract email
export ACTUATOR_API_KEY=$(openssl rand -base64 32)                # operator Bearer for /actuator/gateway-status
export METRICS_API_KEY=$(openssl rand -base64 32)                 # DISTINCT Bearer for /actuator/prometheus
export ENCRYPTION_PASSWORD=$(openssl rand -base64 32)             # AES-256-GCM key for ENCRYPTED-mode credentials
export AUDIT_HMAC=$(openssl rand -base64 32)                      # signs audit-chain envelopes

From OCI Registry (recommended)

Pre-built images and the Helm chart are published to GitHub Container Registry:

helm install dvara oci://ghcr.io/dvarahq/dvara/charts/meridian \
  --set secrets.enterpriseLicenseKey="$DVARA_LICENSE_KEY" \
  --set secrets.gatewayServerApiKey="$ACTUATOR_API_KEY" \
  --set secrets.gatewayMetricsApiKey="$METRICS_API_KEY" \
  --set secrets.gatewayEncryptionMasterPassword="$ENCRYPTION_PASSWORD" \
  --set-string "gatewayServer.extraEnv[0].name=DVARA_AUDIT_HMAC_SECRET" \
  --set-string "gatewayServer.extraEnv[0].value=$AUDIT_HMAC" \
  --set secrets.mockProviderEnabled=true

# Wait for pods to be ready
kubectl rollout status deployment/dvara-server
kubectl rollout status deployment/dvara-ui

# Verify
kubectl port-forward svc/dvara-server 8080:8080 &
curl http://localhost:8080/actuator/health

From Local Chart

Same secret arguments, pointing at a local chart directory:

helm install dvara charts/meridian/ \
  --set secrets.enterpriseLicenseKey="$DVARA_LICENSE_KEY" \
  --set secrets.gatewayServerApiKey="$ACTUATOR_API_KEY" \
  --set secrets.gatewayMetricsApiKey="$METRICS_API_KEY" \
  --set secrets.gatewayEncryptionMasterPassword="$ENCRYPTION_PASSWORD" \
  --set-string "gatewayServer.extraEnv[0].name=DVARA_AUDIT_HMAC_SECRET" \
  --set-string "gatewayServer.extraEnv[0].value=$AUDIT_HMAC" \
  --set secrets.mockProviderEnabled=true

For any deployment past initial smoke-testing, move these into a values file (or an externally-managed Secret via secrets.existingSecret) rather than passing on the command line.
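
As a rough sketch of that move, the function below writes the Quick Start secrets into a values file that only the current user can read. The function name `write_dvara_values` and the default file name are illustrative assumptions; the keys mirror the `--set` paths used above, and the YAML layout for `extraEnv` is assumed to follow from the `extraEnv[0].name` / `extraEnv[0].value` paths.

```shell
# Hypothetical helper: write the Quick Start secrets to a values file.
# Assumes the five variables from Quick Start are set in the calling shell.
write_dvara_values() (
  out="${1:-dvara-secrets-values.yaml}"
  umask 077   # new files are readable and writable by the owner only
  cat > "$out" <<EOF
secrets:
  enterpriseLicenseKey: "${DVARA_LICENSE_KEY}"
  gatewayServerApiKey: "${ACTUATOR_API_KEY}"
  gatewayMetricsApiKey: "${METRICS_API_KEY}"
  gatewayEncryptionMasterPassword: "${ENCRYPTION_PASSWORD}"
gatewayServer:
  extraEnv:
    - name: DVARA_AUDIT_HMAC_SECRET
      value: "${AUDIT_HMAC}"
EOF
)
```

The resulting file can then be passed with `-f`, for example `helm install dvara charts/meridian/ -f dvara-secrets-values.yaml`, which keeps the values out of shell history and process listings.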

All four chart-managed secrets are required at boot

The gateway pod fails startup if enterpriseLicenseKey is unset (LicenseValidationException: No license key configured). The actuator chain returns 401 on every authenticated endpoint if the two API keys are unset. ENCRYPTED-mode credential persistence fails if the encryption master password is unset. The audit chain refuses to write on a production-class profile if the HMAC secret is unset or carries the default placeholder.
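
Because each missing secret surfaces as a different failure at a different time, it can help to check them before running `helm install`. The following is a minimal preflight sketch, not part of the chart: the function name `check_dvara_secrets` is hypothetical, and it only checks the shell variables exported in Quick Start.

```shell
# Hypothetical preflight: fail fast if a required secret is empty, if the
# license key lacks the DVARA- prefix, or if the two Bearer tokens are equal.
check_dvara_secrets() (
  for name in DVARA_LICENSE_KEY ACTUATOR_API_KEY METRICS_API_KEY \
              ENCRYPTION_PASSWORD AUDIT_HMAC; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      exit 1
    fi
  done
  case "$DVARA_LICENSE_KEY" in
    DVARA-*) ;;
    *) echo "license key lacks the DVARA- prefix" >&2; exit 1 ;;
  esac
  if [ "$ACTUATOR_API_KEY" = "$METRICS_API_KEY" ]; then
    echo "ACTUATOR_API_KEY and METRICS_API_KEY must differ" >&2
    exit 1
  fi
)
```

Running `check_dvara_secrets && helm install ...` stops the install before any pod is scheduled when something is missing.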

Installing with Provider Keys

Add provider keys alongside the required secrets from Quick Start (omitted from the examples below for brevity):

helm install dvara charts/meridian/ \
  --set secrets.providerKeys.openai=sk-... \
  --set secrets.providerKeys.anthropic=sk-ant-...

Or use a values file:

# my-values.yaml
secrets:
  providerKeys:
    openai: sk-...
    anthropic: sk-ant-...
    gemini: AIza...

helm install dvara charts/meridian/ -f my-values.yaml

Configuration Reference

Gateway Server

Parameter	Description	Default
`gatewayServer.enabled`	Enable gateway-server	`true`
`gatewayServer.replicaCount`	Replicas (ignored when HPA enabled)	`1`
`gatewayServer.image.repository`	Image repository	`ghcr.io/dvarahq/dvara/dvara-llm-gateway`
`gatewayServer.image.tag`	Image tag (defaults to chart appVersion)	`""`
`gatewayServer.gatewayMode`	Operating mode: `standalone` or `full`	`full`
`gatewayServer.region.id`	Region identity for multi-region deployments	`""`
`gatewayServer.region.name`	Human-readable region name	`""`
`gatewayServer.javaOpts`	JVM options	`""`
`gatewayServer.resources.requests.cpu`	CPU request	`250m`
`gatewayServer.resources.requests.memory`	Memory request	`512Mi`
`gatewayServer.resources.limits.cpu`	CPU limit	`2`
`gatewayServer.resources.limits.memory`	Memory limit	`1Gi`
`gatewayServer.service.type`	Service type	`ClusterIP`
`gatewayServer.service.port`	Service port	`8080`

Gateway UI (DVARA Flightdeck)

The Helm chart's gatewayUi.* parameter family configures the DVARA Flightdeck pod (Console + tenant Portal + Automation API). The parameter prefix is a holdover from when the product was simply called "Gateway UI"; the deployed image is ghcr.io/dvarahq/dvara/dvara-flightdeck.

Parameter	Description	Default
`gatewayUi.enabled`	Enable gateway-ui	`true`
`gatewayUi.replicaCount`	Number of replicas	`1`
`gatewayUi.image.repository`	Image repository	`ghcr.io/dvarahq/dvara/dvara-flightdeck`
`gatewayUi.gatewayServerUrl`	Override auto-discovered server URL	`""` (auto)
`gatewayUi.resources.requests.cpu`	CPU request	`100m`
`gatewayUi.resources.requests.memory`	Memory request	`256Mi`
`gatewayUi.service.type`	Service type	`ClusterIP`
`gatewayUi.service.port`	Service port	`8090`

Gateway UI Health Probes:

Probe	Path	Purpose
Liveness	`/actuator/health/liveness`	Basic liveness check
Readiness	`/actuator/health/readiness`	Includes `controlPlane` check (gateway-server connectivity)
Startup	`/actuator/health/liveness`	Allows JVM warmup before liveness kicks in

All four probe paths (/actuator/health, /actuator/health/liveness, /actuator/health/readiness, /actuator/info) are anonymous by design so k8s probes work without secrets. Do not set management.endpoint.health.show-details=always — the gateway refuses to start in that mode so anonymous callers cannot read per-indicator JSON like cache-cluster state or database pool internals. Leave the default when-authorized.
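
To confirm that the anonymous paths respond without credentials, a small loop over them is usually enough. This is a sketch that assumes a port-forward is already running and that `curl` is installed; the function name `probe_statuses` is hypothetical.

```shell
# Hypothetical helper: print the HTTP status of each anonymous probe path.
# No Authorization header is sent, matching what the kubelet does.
probe_statuses() (
  base="${1:?base URL required}"
  for path in /actuator/health /actuator/health/liveness \
              /actuator/health/readiness /actuator/info; do
    code="$(curl -s -o /dev/null -w '%{http_code}' "${base}${path}")"
    echo "${path} ${code}"
  done
)
```

For example, `probe_statuses http://localhost:8080` after `kubectl port-forward svc/dvara-server 8080:8080` prints one line per path.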

Secrets

Parameter	Description	Default
`secrets.create`	Create the Secret resource	`true`
`secrets.existingSecret`	Use an existing Secret instead	`""`
`secrets.providerKeys.openai`	OpenAI API key	`""`
`secrets.providerKeys.anthropic`	Anthropic API key	`""`
`secrets.providerKeys.gemini`	Gemini API key	`""`
`secrets.providerKeys.awsAccessKeyId`	AWS access key (Bedrock)	`""`
`secrets.providerKeys.awsSecretAccessKey`	AWS secret key (Bedrock)	`""`
`secrets.ollamaEnabled`	Enable Ollama provider	`""`
`secrets.ollamaBaseUrl`	Ollama base URL	`""`
`secrets.bedrockEnabled`	Enable Bedrock provider	`""`
`secrets.mockProviderEnabled`	Enable mock provider	`""`
`secrets.gatewayInternalSecret`	Shared secret for `/internal/*`	`""`
`secrets.gatewayEncryptionMasterPassword`	AES-256-GCM master password for `ENC:` values + ENCRYPTED-mode provider credentials	`""`
`secrets.gatewayServerApiKey`	Operator Bearer for `/actuator/gateway-status` + every authenticated `/actuator/*` path EXCEPT `prometheus`. Generate with `openssl rand -base64 32`. Required — every authenticated actuator probe `401`s without it.	`""`
`secrets.gatewayMetricsApiKey`	Distinct Bearer for `/actuator/prometheus` only. Must differ from `gatewayServerApiKey` (principle of least privilege — a leaked scrape token must not unlock the license envelope). Required.	`""`
`secrets.enterpriseLicenseKey`	DVARA license envelope (`DVARA-…` prefix, Ed25519-signed). Required at startup for every DVARA process (gateway-server, flightdeck, mcp-proxy-server) — `LicenseEnvironmentPostProcessor` refuses to boot without it; no operator-flippable bypass.	`""`

Audit HMAC is not yet a chart-managed secret

DVARA_AUDIT_HMAC_SECRET signs audit-chain envelopes and is required on any production-class Spring profile, but the chart does not wire it through secrets.* yet. Pass it via gatewayServer.extraEnv (see the Quick Start example above) or through secrets.existingSecret with the right key.

MCP Proxy Server

Parameter	Description	Default
`mcpProxyServer.enabled`	Enable MCP Proxy (requires enterprise license)	`false`
`mcpProxyServer.replicaCount`	Replicas (ignored when HPA enabled)	`2`
`mcpProxyServer.image.repository`	Image repository	`ghcr.io/dvarahq/dvara/dvara-mcp-gateway`
`mcpProxyServer.image.tag`	Image tag (defaults to chart appVersion)	`""`
`mcpProxyServer.javaOpts`	JVM options	`""`
`mcpProxyServer.gatewayServerUrl`	Override auto-discovered server URL	`""` (auto)
`mcpProxyServer.resources.requests.cpu`	CPU request	`250m`
`mcpProxyServer.resources.requests.memory`	Memory request	`512Mi`
`mcpProxyServer.resources.limits.cpu`	CPU limit	`2`
`mcpProxyServer.resources.limits.memory`	Memory limit	`1Gi`
`mcpProxyServer.service.type`	Service type (internal-only)	`ClusterIP`
`mcpProxyServer.service.port`	Service port	`8070`
`mcpProxyServer.autoscaling.enabled`	Enable HPA	`false`
`mcpProxyServer.autoscaling.minReplicas`	Minimum replicas	`2`
`mcpProxyServer.autoscaling.maxReplicas`	Maximum replicas	`20`
`mcpProxyServer.pdb.enabled`	Enable PDB	`false`
`mcpProxyServer.pdb.minAvailable`	Min available pods	`1`
`mcpProxyServer.networkPolicy.enabled`	Enable NetworkPolicy (deny external, allow gateway-server)	`false`
`mcpProxyServer.serviceMonitor.enabled`	Enable Prometheus ServiceMonitor	`false`

The MCP Proxy is deployed as an internal-only ClusterIP service. It should be accessed by the gateway-server, not exposed externally. When networkPolicy.enabled is true, only pods matching the gateway-server selector labels can reach the MCP Proxy.

Ingress

Parameter	Description	Default
`ingress.enabled`	Enable Ingress	`false`
`ingress.className`	Ingress class (nginx, traefik, alb)	`""`
`ingress.annotations`	Ingress annotations	`{}`
`ingress.gatewayServer.hosts`	Server host/path rules	`[{host: gateway.example.com}]`
`ingress.gatewayServer.tls`	Server TLS config	`[]`
`ingress.gatewayUi.hosts`	UI host/path rules	`[{host: admin.example.com}]`
`ingress.gatewayUi.tls`	UI TLS config	`[]`

Graceful Shutdown & Rolling Updates

Parameter	Description	Default
`gatewayServer.terminationGracePeriodSeconds`	Pod termination grace period (must exceed preStop + drain)	`45`
`gatewayServer.preStopSleepSeconds`	Sleep before SIGTERM (endpoint de-registration propagation)	`5`
`gatewayServer.rollingUpdate.maxSurge`	Max extra pods during rolling update	`1`
`gatewayServer.rollingUpdate.maxUnavailable`	Max unavailable pods during rolling update (0 = zero-downtime)	`0`
`gatewayServer.topologySpreadConstraints`	Topology spread for cross-zone scheduling	`[]`

The default configuration gives zero-downtime rolling updates: maxSurge: 1 creates one new pod before an old one is terminated, and maxUnavailable: 0 keeps the full desired replica count ready throughout. The preStopSleepSeconds delay lets Kubernetes endpoint de-registration propagate before the application receives SIGTERM and begins its 30-second graceful drain. The default 45-second grace period covers both (5s + 30s) with 10 seconds to spare.
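
The constraint that the grace period must exceed preStop plus drain is simple arithmetic, sketched below. The function name `grace_period_margin` is hypothetical; the defaults are the chart defaults from the table and the 30-second drain mentioned above.

```shell
# Seconds left over after the preStop sleep and the graceful drain.
# A result of zero or less means pods may be killed before draining finishes.
grace_period_margin() (
  grace="${1:-45}"
  pre_stop="${2:-5}"
  drain="${3:-30}"
  echo $(( $grace - $pre_stop - $drain ))
)
```

With the defaults this prints `10`. If you raise `preStopSleepSeconds`, re-run the check with the new value and raise `terminationGracePeriodSeconds` when the margin is no longer positive.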

Autoscaling (HPA)

Parameter	Description	Default
`gatewayServer.autoscaling.enabled`	Enable HPA	`false`
`gatewayServer.autoscaling.minReplicas`	Minimum replicas	`2`
`gatewayServer.autoscaling.maxReplicas`	Maximum replicas	`10`
`gatewayServer.autoscaling.targetCPUUtilizationPercentage`	CPU target	`70`
`gatewayServer.autoscaling.targetMemoryUtilizationPercentage`	Memory target	`80`
`gatewayServer.autoscaling.behavior.scaleUp.stabilizationWindowSeconds`	Wait before scaling up	`30`
`gatewayServer.autoscaling.behavior.scaleDown.stabilizationWindowSeconds`	Wait before scaling down	`300`

The default HPA behavior scales up quickly (50% per minute after 30s stabilization) but scales down conservatively (25% per 2 minutes after 5-minute stabilization) to prevent flapping.

Pod Disruption Budget

Parameter	Description	Default
`gatewayServer.pdb.enabled`	Enable PDB	`false`
`gatewayServer.pdb.minAvailable`	Min available pods	`1`
`gatewayServer.pdb.maxUnavailable`	Max unavailable pods	`""`

Prometheus ServiceMonitor

Parameter	Description	Default
`gatewayServer.serviceMonitor.enabled`	Enable (requires Prometheus Operator)	`false`
`gatewayServer.serviceMonitor.interval`	Scrape interval	`30s`
`gatewayServer.serviceMonitor.path`	Metrics path	`/actuator/prometheus`
`gatewayServer.serviceMonitor.additionalLabels`	Labels for monitor selection	`{}`

When gatewayServer.serviceMonitor.enabled=true, configure the ServiceMonitor's bearerTokenFile (or the Helm equivalent) to point at a file containing the DVARA_ACTUATOR_METRICS_API_KEY value. /actuator/prometheus is authenticated — without the token, every scrape returns 401 and the time series goes dark. The metrics secret is intentionally distinct from DVARA_ACTUATOR_API_KEY so a leaked scrape token can't unlock the rich gateway status surface.
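
Before wiring the token into Prometheus, it may be worth confirming by hand that the metrics key is accepted. The sketch below assumes the endpoint takes a standard `Authorization: Bearer` header and that a port-forward to the gateway is running; the function name `scrape_status` is hypothetical.

```shell
# Hypothetical check: request /actuator/prometheus with the metrics Bearer
# token and print only the HTTP status code.
scrape_status() (
  base="${1:?base URL required}"
  token="${2:?metrics token required}"
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${token}" \
    "${base}/actuator/prometheus"
)
```

For example, `scrape_status http://localhost:8080 "$METRICS_API_KEY"`. A `401` here points at the token rather than at the ServiceMonitor configuration.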

Clustering on Kubernetes

DVARA LLM Gateway instances share rate-limit counters and API key lookups across the fleet. On Kubernetes, pods must discover each other via a headless Service — multicast is unavailable in most clusters.

When KUBERNETES_NAMESPACE is set (typically injected through the downward API, as in the reference Deployment below), the gateway requires CACHE_SERVICE_NAME to point at a headless Service fronting the gateway pods. Without it, startup fails with:

Kubernetes clustering requires CACHE_SERVICE_NAME when
KUBERNETES_NAMESPACE is set. Without it, pods cannot form a cluster and
rate limit state will not be shared.

The official Helm chart wires both variables automatically when you deploy multiple gateway replicas — you don't need the manual YAML below unless you're bypassing the chart. Outside Kubernetes (local docker compose, bare metal), gateway instances auto-discover each other via multicast and CACHE_SERVICE_NAME is not required.
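
If you bypass the chart, the same rule can be checked in a pod entrypoint or CI step before the JVM starts. This is a sketch of the rule as described above, not the gateway's own implementation; the function name `clustering_env_ok` is hypothetical.

```shell
# Sketch of the startup rule: on Kubernetes, a missing CACHE_SERVICE_NAME
# is fatal. Outside Kubernetes, neither variable is required.
clustering_env_ok() (
  if [ -n "${KUBERNETES_NAMESPACE:-}" ] && [ -z "${CACHE_SERVICE_NAME:-}" ]; then
    echo "CACHE_SERVICE_NAME is required when KUBERNETES_NAMESPACE is set" >&2
    exit 1
  fi
)
```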

Headless Service pattern (reference)

apiVersion: v1
kind: Service
metadata:
  name: dvara-server-cluster
  labels:
    app.kubernetes.io/name: dvara-server
spec:
  clusterIP: None            # headless — each pod gets a DNS A record
  publishNotReadyAddresses: true
  selector:
    app.kubernetes.io/name: dvara-server
  ports:
    - name: cluster
      port: 5701
      targetPort: 5701

Deployment environment variables (reference)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dvara-server
spec:
  template:
    spec:
      containers:
        - name: gateway-server
          env:
            - name: KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CACHE_SERVICE_NAME
              value: dvara-server-cluster

Common Deployment Patterns

Minimal Mode (Testing)

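PostgreSQL is required even for minimal deployments; there is no in-memory fallback. For quick testing, point the chart at an external Postgres or deploy a small Postgres StatefulSet alongside the gateway. The command below shows only the mock provider and database wiring; the required secrets from Quick Start still need to be supplied: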

helm install dvara charts/meridian/ \
  --set secrets.mockProviderEnabled=true \
  --set gatewayServer.env.SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/dvara \
  --set gatewayServer.env.SPRING_DATASOURCE_USERNAME=dvara \
  --set gatewayServer.env.SPRING_DATASOURCE_PASSWORD=dvara

Production with OpenAI

# production-values.yaml
gatewayServer:
  replicaCount: 3
  gatewayMode: standalone
  # Match heap to container limits below — leave ~25% of the limit for
  # off-heap (Metaspace, native code, JIT, kernel buffers). With memory
  # limit = 2Gi, -Xmx1500m is a safe upper bound; reserve more if you
  # see Metaspace pressure in JFR.
  javaOpts: "-Xms1g -Xmx1500m"
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 2Gi
  terminationGracePeriodSeconds: 45
  preStopSleepSeconds: 5
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
  pdb:
    enabled: true
    minAvailable: 2

secrets:
  providerKeys:
    openai: sk-...

ingress:
  enabled: true
  className: nginx
  gatewayServer:
    hosts:
      - host: gateway.mycompany.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: gateway-tls
        hosts:
          - gateway.mycompany.com

helm install dvara charts/meridian/ -f production-values.yaml
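
The heap sizing rule in the comments of production-values.yaml (leave roughly 25% of the memory limit for off-heap use) can be checked with integer arithmetic. The function name `heap_headroom_pct` is hypothetical, and both arguments are assumed to be in MiB.

```shell
# Percentage of the container memory limit left for off-heap use
# (Metaspace, native code, JIT, buffers). Integer division rounds down.
heap_headroom_pct() (
  limit_mib="${1:?limit in MiB required}"
  xmx_mib="${2:?max heap in MiB required}"
  echo $(( ($limit_mib - $xmx_mib) * 100 / $limit_mib ))
)
```

For the values above, `heap_headroom_pct 2048 1500` prints `26`, which is just above the 25% guideline. Raising `-Xmx` without raising the limit shrinks that margin.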

Enterprise with MCP Proxy

Deploy the gateway with the MCP Proxy for agent tool governance:

# enterprise-mcp-values.yaml
gatewayServer:
  replicaCount: 3
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10

mcpProxyServer:
  enabled: true
  replicaCount: 2
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
  networkPolicy:
    enabled: true
  pdb:
    enabled: true

secrets:
  providerKeys:
    openai: sk-...
  enterpriseLicenseKey: DVARA-...  # signed DVARA license key

helm install dvara charts/meridian/ -f enterprise-mcp-values.yaml

Access the MCP Proxy via port-forward (internal-only, not exposed via Ingress):

kubectl port-forward svc/dvara-mcp-gateway 8070:8070
curl http://localhost:8070/actuator/health

Inline Gateway Configuration

Pass a gateway.yaml configuration directly via Helm values:

gatewayServer:
  gatewayConfig:
    routing:
      default-strategy: round-robin
    rate-limit:
      enabled: true
      per-key:
        requests-per-minute: 100

This creates a ConfigMap mounted into the gateway pod and applied as Spring Boot externalized configuration.

Using External Secrets

If you manage secrets with External Secrets Operator, Sealed Secrets, or a vault:

secrets:
  create: false
  existingSecret: my-external-secret

The existing Secret must contain the same keys: openai-api-key, anthropic-api-key, gemini-api-key, aws-access-key-id, aws-secret-access-key, ollama-enabled, ollama-base-url, bedrock-enabled, mock-provider-enabled, gateway-internal-secret, gateway-encryption-master-password, enterprise-license-key (signed).
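
A missing key in an externally managed Secret is easy to overlook, so a comparison against the list above can be useful. This sketch assumes `kubectl` access to the namespace holding the Secret; the function name `missing_secret_keys` is hypothetical, and the key list is copied from the paragraph above.

```shell
# Hypothetical check: print each expected key that is absent from the Secret.
# Prints nothing when every key listed above is present.
missing_secret_keys() (
  secret="${1:?secret name required}"
  present="$(kubectl get secret "$secret" \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}')"
  for key in openai-api-key anthropic-api-key gemini-api-key \
             aws-access-key-id aws-secret-access-key ollama-enabled \
             ollama-base-url bedrock-enabled mock-provider-enabled \
             gateway-internal-secret gateway-encryption-master-password \
             enterprise-license-key; do
    printf '%s\n' "$present" | grep -qxF -- "$key" || echo "$key"
  done
)
```

For example, `missing_secret_keys my-external-secret` run before `helm install` lists anything the external tooling has not yet synced.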

AWS Bedrock with IRSA

gatewayServer:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/dvara-bedrock

secrets:
  bedrockEnabled: "true"

Multi-Region Deployment

Deploy an instance in a specific region:

# us-east-values.yaml
gatewayServer:
  gatewayMode: full
  region:
    id: us-east-1
    name: US East

secrets:
  providerKeys:
    openai: sk-...

helm install dvara-us-east charts/meridian/ -f us-east-values.yaml

When region.id is set, the chart is meant to add DVARA_REGION_ID and DVARA_REGION_NAME environment variables (which Spring relaxed-binds to the dvara.region.id / dvara.region.name properties on the RegionContext bean) and a meridian.ai/region pod label for topology-aware scheduling. The variable names currently emitted differ; see Chart-side drift below.

Chart-side drift

The chart's gateway-server/deployment.yaml currently emits these as DVARA_LLM_GATEWAY_REGION_ID / DVARA_LLM_GATEWAY_REGION_NAME — Spring relaxed binding doesn't match those names against the dvara.region.* property prefix, so the values are set but never read. Tracked as part of #868. If you're hitting region-aware routing surprises and you set region.id via the chart, pass DVARA_REGION_ID / DVARA_REGION_NAME through gatewayServer.extraEnv as a workaround until the chart is fixed.
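
One way to apply that workaround is to generate the extra `--set-string` flags and append them to the install or upgrade command. The sketch below is an assumption-laden helper, not chart tooling: the function name `region_extra_env_flags` is hypothetical, and it assumes index 0 of `extraEnv` already holds DVARA_AUDIT_HMAC_SECRET as in Quick Start, so the region entries start at index 1.

```shell
# Hypothetical helper: print Helm flags that pass the region variables
# through gatewayServer.extraEnv, one flag per line.
# Assumes extraEnv[0] is set in the same helm command (audit HMAC).
region_extra_env_flags() (
  id="${1:?region id required}"
  name="${2:?region name required}"
  printf '%s\n' \
    "--set-string=gatewayServer.extraEnv[1].name=DVARA_REGION_ID" \
    "--set-string=gatewayServer.extraEnv[1].value=${id}" \
    "--set-string=gatewayServer.extraEnv[2].name=DVARA_REGION_NAME" \
    "--set-string=gatewayServer.extraEnv[2].value=${name}"
)
```

For example, `region_extra_env_flags us-east-1 "US East"` prints four flags to add to the same `helm` command that sets `extraEnv[0]`. Values containing spaces need quoting when pasted into a command line.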

Prometheus Monitoring

gatewayServer:
  serviceMonitor:
    enabled: true
    interval: 15s
    additionalLabels:
      release: prometheus-stack

The ServiceMonitor scrapes /actuator/prometheus. Requires the Prometheus Operator CRDs to be installed and a bearerTokenFile (or Helm-managed equivalent) pointing at the DVARA_ACTUATOR_METRICS_API_KEY value — see the ServiceMonitor parameter table above.

Security

The chart applies security hardening by default:

Non-root execution — Pods run as UID 1001 (runAsNonRoot: true)
Read-only filesystem — readOnlyRootFilesystem: true with a /tmp emptyDir for JVM temp files
No privilege escalation — allowPrivilegeEscalation: false
Capabilities dropped — All Linux capabilities dropped
No service account token — automountServiceAccountToken: false (no K8s API access needed)
Secret key refs optional — Pods start even if only some provider keys are configured

Upgrading

# From OCI registry
helm upgrade dvara oci://ghcr.io/dvarahq/dvara/charts/meridian -f my-values.yaml

# From local chart
helm upgrade dvara charts/meridian/ -f my-values.yaml

Pods automatically restart when secrets or ConfigMap content changes (via checksum annotations on the pod template).

Running Helm Tests

helm test dvara

This runs test pods that verify gateway-server, gateway-ui, and (if enabled) mcp-proxy-server services are reachable.

Uninstalling

helm uninstall dvara

Troubleshooting

Pods stuck in CrashLoopBackOff

Check logs for JVM startup errors:

kubectl logs deployment/dvara-server

Common causes:

Insufficient memory — increase resources.limits.memory
Missing secret keys — verify the Secret exists: kubectl get secret dvara -o yaml

Startup probe fails

The startup probe allows about 65 seconds (5s initial delay + 12 retries x 5s) for JVM warmup. If your image is large or the node is slow, raise the failure threshold:

gatewayServer:
  startupProbe:
    failureThreshold: 20
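
The effect of a new threshold on the startup budget can be worked out as initial delay plus failure threshold times probe period. The sketch below assumes the 5-second initial delay and 5-second period quoted above; the function name `startup_budget_seconds` is hypothetical.

```shell
# Startup budget in seconds: initial delay + failureThreshold * period.
# Defaults follow the figures quoted above (12 retries, 5s delay, 5s period).
startup_budget_seconds() (
  failure_threshold="${1:-12}"
  initial_delay="${2:-5}"
  period="${3:-5}"
  echo $(( $initial_delay + $failure_threshold * $period ))
)
```

Under these assumptions, `startup_budget_seconds 20` prints `105`, so a threshold of 20 gives the JVM 105 seconds before the pod is restarted.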

Services not reachable

# Check pod status
kubectl get pods -l app.kubernetes.io/component=gateway-server

# Check service endpoints
kubectl get endpoints dvara-server

# Port-forward to test directly
kubectl port-forward svc/dvara-server 8080:8080
curl http://localhost:8080/actuator/health

Providers not registering

Provider keys must be non-empty strings. Check the Secret:

kubectl get secret dvara -o jsonpath='{.data.openai-api-key}' | base64 -d

An empty string means the provider is disabled, which is expected for unused providers.
