
Kubernetes Deployment (Helm)

Deploy Dvara on Kubernetes with a single helm install command using the official Helm chart.

Prerequisites

  • Kubernetes 1.28+
  • Helm 3.x
  • kubectl configured for your cluster

Quick Start

Pre-built images and the Helm chart are published to GitHub Container Registry:

# Install from OCI registry (uses pre-built GHCR images)
helm install dvara oci://ghcr.io/kdhrubo/dvara/charts/dvara \
--set secrets.mockProviderEnabled=true

# Wait for pods to be ready
kubectl rollout status deployment/dvara-server
kubectl rollout status deployment/dvara-ui

# Verify
kubectl port-forward svc/dvara-server 8080:8080 &
curl http://localhost:8080/status

From Local Chart

# Install from local chart directory
helm install dvara charts/dvara/ \
--set secrets.mockProviderEnabled=true

# Wait for pods to be ready
kubectl rollout status deployment/dvara-server
kubectl rollout status deployment/dvara-ui

# Verify
kubectl port-forward svc/dvara-server 8080:8080 &
curl http://localhost:8080/status

Installing with Provider Keys

helm install dvara charts/dvara/ \
--set secrets.providerKeys.openai=sk-... \
--set secrets.providerKeys.anthropic=sk-ant-...

Or use a values file:

# my-values.yaml
secrets:
  providerKeys:
    openai: sk-...
    anthropic: sk-ant-...
    gemini: AIza...

helm install dvara charts/dvara/ -f my-values.yaml
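If the provider keys already live in environment variables, the values file can be generated instead of typed by hand. The following is a minimal sketch, not part of the chart: the variable names OPENAI_API_KEY, ANTHROPIC_API_KEY, and GEMINI_API_KEY are assumptions, and the script only writes the file without running Helm.

```shell
#!/usr/bin/env bash
# Hypothetical variable names; unset variables become empty strings,
# which leaves that provider disabled.
: "${OPENAI_API_KEY:=}" "${ANTHROPIC_API_KEY:=}" "${GEMINI_API_KEY:=}"

# mktemp creates the file readable by the current user only.
values_file="$(mktemp)"

cat > "$values_file" <<EOF
secrets:
  providerKeys:
    openai: "${OPENAI_API_KEY}"
    anthropic: "${ANTHROPIC_API_KEY}"
    gemini: "${GEMINI_API_KEY}"
EOF

echo "values written to ${values_file}"
# Next step (not run here): helm install dvara charts/dvara/ -f "$values_file"
```

Writing to a temporary file keeps the keys out of shell history and out of the working tree; delete the file once the release is installed.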

Configuration Reference

Gateway Server

| Parameter | Description | Default |
|---|---|---|
| gatewayServer.enabled | Enable gateway-server | true |
| gatewayServer.replicaCount | Replicas (ignored when HPA enabled) | 1 |
| gatewayServer.image.repository | Image repository | ghcr.io/kdhrubo/dvara/gateway-server |
| gatewayServer.image.tag | Image tag (defaults to chart appVersion) | "" |
| gatewayServer.gatewayMode | Operating mode: standalone or full | full |
| gatewayServer.region.id | Region identity for multi-region deployments | "" |
| gatewayServer.region.name | Human-readable region name | "" |
| gatewayServer.javaOpts | JVM options | "" |
| gatewayServer.resources.requests.cpu | CPU request | 250m |
| gatewayServer.resources.requests.memory | Memory request | 512Mi |
| gatewayServer.resources.limits.cpu | CPU limit | 2 |
| gatewayServer.resources.limits.memory | Memory limit | 1Gi |
| gatewayServer.service.type | Service type | ClusterIP |
| gatewayServer.service.port | Service port | 8080 |

Gateway UI

| Parameter | Description | Default |
|---|---|---|
| gatewayUi.enabled | Enable gateway-ui | true |
| gatewayUi.replicaCount | Number of replicas | 1 |
| gatewayUi.image.repository | Image repository | ghcr.io/kdhrubo/dvara/gateway-ui |
| gatewayUi.gatewayServerUrl | Override auto-discovered server URL | "" (auto) |
| gatewayUi.resources.requests.cpu | CPU request | 100m |
| gatewayUi.resources.requests.memory | Memory request | 256Mi |
| gatewayUi.service.type | Service type | ClusterIP |
| gatewayUi.service.port | Service port | 8090 |

Gateway UI Health Probes:

| Probe | Path | Purpose |
|---|---|---|
| Liveness | /actuator/health/liveness | Basic liveness check |
| Readiness | /actuator/health/readiness | Includes controlPlane check (gateway-server connectivity) |
| Startup | /actuator/health/liveness | Allows JVM warmup before liveness kicks in |

Secrets

| Parameter | Description | Default |
|---|---|---|
| secrets.create | Create the Secret resource | true |
| secrets.existingSecret | Use an existing Secret instead | "" |
| secrets.providerKeys.openai | OpenAI API key | "" |
| secrets.providerKeys.anthropic | Anthropic API key | "" |
| secrets.providerKeys.gemini | Gemini API key | "" |
| secrets.providerKeys.awsAccessKeyId | AWS access key (Bedrock) | "" |
| secrets.providerKeys.awsSecretAccessKey | AWS secret key (Bedrock) | "" |
| secrets.ollamaEnabled | Enable Ollama provider | "" |
| secrets.ollamaBaseUrl | Ollama base URL | "" |
| secrets.bedrockEnabled | Enable Bedrock provider | "" |
| secrets.mockProviderEnabled | Enable mock provider | "" |
| secrets.gatewayInternalSecret | Shared secret for /internal/* | "" |
| secrets.gatewayEncryptionMasterPassword | Master password for ENC: values | "" |
| secrets.enterpriseLicenseKey | Enterprise license JWT (signed, required for MCP proxy) | "" |

MCP Proxy Server (Enterprise)

| Parameter | Description | Default |
|---|---|---|
| mcpProxyServer.enabled | Enable MCP proxy (requires enterprise license) | false |
| mcpProxyServer.replicaCount | Replicas (ignored when HPA enabled) | 2 |
| mcpProxyServer.image.repository | Image repository | ghcr.io/kdhrubo/dvara/mcp-proxy-server |
| mcpProxyServer.image.tag | Image tag (defaults to chart appVersion) | "" |
| mcpProxyServer.javaOpts | JVM options | "" |
| mcpProxyServer.gatewayServerUrl | Override auto-discovered server URL | "" (auto) |
| mcpProxyServer.resources.requests.cpu | CPU request | 250m |
| mcpProxyServer.resources.requests.memory | Memory request | 512Mi |
| mcpProxyServer.resources.limits.cpu | CPU limit | 2 |
| mcpProxyServer.resources.limits.memory | Memory limit | 1Gi |
| mcpProxyServer.service.type | Service type (internal-only) | ClusterIP |
| mcpProxyServer.service.port | Service port | 8070 |
| mcpProxyServer.autoscaling.enabled | Enable HPA | false |
| mcpProxyServer.autoscaling.minReplicas | Minimum replicas | 2 |
| mcpProxyServer.autoscaling.maxReplicas | Maximum replicas | 20 |
| mcpProxyServer.pdb.enabled | Enable PDB | false |
| mcpProxyServer.pdb.minAvailable | Min available pods | 1 |
| mcpProxyServer.networkPolicy.enabled | Enable NetworkPolicy (deny external, allow gateway-server) | false |
| mcpProxyServer.serviceMonitor.enabled | Enable Prometheus ServiceMonitor | false |

The MCP proxy is deployed as an internal-only ClusterIP service. It should be accessed by the gateway-server, not exposed externally. When networkPolicy.enabled is true, only pods matching the gateway-server selector labels can reach the MCP proxy.

Ingress

| Parameter | Description | Default |
|---|---|---|
| ingress.enabled | Enable Ingress | false |
| ingress.className | Ingress class (nginx, traefik, alb) | "" |
| ingress.annotations | Ingress annotations | {} |
| ingress.gatewayServer.hosts | Server host/path rules | [{host: gateway.example.com}] |
| ingress.gatewayServer.tls | Server TLS config | [] |
| ingress.gatewayUi.hosts | UI host/path rules | [{host: admin.example.com}] |
| ingress.gatewayUi.tls | UI TLS config | [] |

Graceful Shutdown & Rolling Updates

| Parameter | Description | Default |
|---|---|---|
| gatewayServer.terminationGracePeriodSeconds | Pod termination grace period (must exceed preStop + drain) | 45 |
| gatewayServer.preStopSleepSeconds | Sleep before SIGTERM (endpoint de-registration propagation) | 5 |
| gatewayServer.rollingUpdate.maxSurge | Max extra pods during rolling update | 1 |
| gatewayServer.rollingUpdate.maxUnavailable | Max unavailable pods during rolling update (0 = zero-downtime) | 0 |
| gatewayServer.topologySpreadConstraints | Topology spread for cross-zone scheduling | [] |

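The default configuration ensures zero-downtime rolling updates: maxSurge: 1 creates one new pod before terminating old ones, and maxUnavailable: 0 keeps the full desired replica count ready throughout the rollout. The preStopSleepSeconds delay allows Kubernetes endpoint propagation to complete before the application receives SIGTERM and begins its 30-second graceful drain. With the defaults, the 5-second preStop sleep plus the 30-second drain totals 35 seconds, which fits within the 45-second terminationGracePeriodSeconds.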
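When changing any of these timings, it helps to recheck that the grace period still exceeds the preStop sleep plus the drain. A small sketch of that check follows; the variable names are illustrative, and the 30-second drain is the figure quoted above rather than a chart value.

```shell
#!/usr/bin/env bash
# Defaults from the table above; drain_seconds is the application's
# graceful drain time as stated in the text (not a Helm value).
termination_grace=45
pre_stop_sleep=5
drain_seconds=30

shutdown_budget=$((pre_stop_sleep + drain_seconds))
headroom=$((termination_grace - shutdown_budget))

if [ "$headroom" -gt 0 ]; then
  echo "ok: ${headroom}s of headroom before the pod is force-killed"
else
  echo "too tight: set gatewayServer.terminationGracePeriodSeconds above ${shutdown_budget}"
fi
```

If the headroom is zero or negative, Kubernetes may kill the pod before in-flight requests finish draining.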

Autoscaling (HPA)

| Parameter | Description | Default |
|---|---|---|
| gatewayServer.autoscaling.enabled | Enable HPA | false |
| gatewayServer.autoscaling.minReplicas | Minimum replicas | 2 |
| gatewayServer.autoscaling.maxReplicas | Maximum replicas | 10 |
| gatewayServer.autoscaling.targetCPUUtilizationPercentage | CPU target | 70 |
| gatewayServer.autoscaling.targetMemoryUtilizationPercentage | Memory target | 80 |
| gatewayServer.autoscaling.behavior.scaleUp.stabilizationWindowSeconds | Wait before scaling up | 30 |
| gatewayServer.autoscaling.behavior.scaleDown.stabilizationWindowSeconds | Wait before scaling down | 300 |

The default HPA behavior scales up quickly (50% per minute after 30s stabilization) but scales down conservatively (25% per 2 minutes after 5-minute stabilization) to prevent flapping.
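To get a feel for what "50% per minute" means in practice, the sketch below estimates how long a sustained load spike takes to go from the default minimum to the default maximum. It is a rough model only: it assumes each one-minute step may grow the replica count by 50%, rounded up, and it ignores the 30-second stabilization window and metric lag.

```shell
#!/usr/bin/env bash
# Rough estimate, not the HPA controller's exact algorithm.
replicas=2       # gatewayServer.autoscaling.minReplicas default
max_replicas=10  # gatewayServer.autoscaling.maxReplicas default
minutes=0

while [ "$replicas" -lt "$max_replicas" ]; do
  # Integer form of ceil(replicas * 1.5)
  replicas=$(( (replicas * 150 + 99) / 100 ))
  if [ "$replicas" -gt "$max_replicas" ]; then
    replicas=$max_replicas
  fi
  minutes=$((minutes + 1))
  echo "after ${minutes} min: ${replicas} replicas"
done
```

Under these assumptions the defaults reach the maximum in four steps (3, 5, 8, then 10 replicas). If that is too slow for your traffic profile, a higher minReplicas shortens the ramp.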

Pod Disruption Budget

| Parameter | Description | Default |
|---|---|---|
| gatewayServer.pdb.enabled | Enable PDB | false |
| gatewayServer.pdb.minAvailable | Min available pods | 1 |
| gatewayServer.pdb.maxUnavailable | Max unavailable pods | "" |

Prometheus ServiceMonitor

| Parameter | Description | Default |
|---|---|---|
| gatewayServer.serviceMonitor.enabled | Enable (requires Prometheus Operator) | false |
| gatewayServer.serviceMonitor.interval | Scrape interval | 30s |
| gatewayServer.serviceMonitor.path | Metrics path | /actuator/prometheus |
| gatewayServer.serviceMonitor.additionalLabels | Labels for monitor selection | {} |

Common Deployment Patterns

Standalone Mode (Default)

No external dependencies. Uses in-memory storage and the mock provider for testing:

helm install dvara charts/dvara/ \
--set secrets.mockProviderEnabled=true

Production with OpenAI

# production-values.yaml
gatewayServer:
  replicaCount: 3
  gatewayMode: standalone
  javaOpts: "-Xms512m -Xmx768m"
  terminationGracePeriodSeconds: 45
  preStopSleepSeconds: 5
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
  pdb:
    enabled: true
    minAvailable: 2

secrets:
  providerKeys:
    openai: sk-...

ingress:
  enabled: true
  className: nginx
  gatewayServer:
    hosts:
      - host: gateway.mycompany.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: gateway-tls
        hosts:
          - gateway.mycompany.com

helm install dvara charts/dvara/ -f production-values.yaml

Enterprise with MCP Proxy

Deploy the gateway with the MCP proxy for agent tool governance:

# enterprise-mcp-values.yaml
gatewayServer:
  replicaCount: 3
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10

mcpProxyServer:
  enabled: true
  replicaCount: 2
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
  networkPolicy:
    enabled: true
  pdb:
    enabled: true

secrets:
  providerKeys:
    openai: sk-...
  enterpriseLicenseKey: eyJhbGci... # signed JWT from license-generator

helm install dvara charts/dvara/ -f enterprise-mcp-values.yaml

Access the MCP proxy via port-forward (internal-only, not exposed via Ingress):

kubectl port-forward svc/dvara-mcp-proxy 8070:8070
curl http://localhost:8070/actuator/health

Inline Gateway Configuration

Pass a gateway.yaml configuration directly via Helm values:

gatewayServer:
  gatewayConfig:
    routing:
      default-strategy: round-robin
    rate-limiting:
      global:
        requests-per-second: 100

This creates a ConfigMap mounted at /etc/dvara/gateway.yaml and sets GATEWAY_BOOTSTRAP_FILE automatically.

Using External Secrets

If you manage secrets with External Secrets Operator, Sealed Secrets, or a vault:

secrets:
  create: false
  existingSecret: my-external-secret

The existing Secret must contain the same keys: openai-api-key, anthropic-api-key, gemini-api-key, aws-access-key-id, aws-secret-access-key, ollama-enabled, ollama-base-url, bedrock-enabled, mock-provider-enabled, gateway-internal-secret, gateway-encryption-master-password, enterprise-license-key (signed JWT).

AWS Bedrock with IRSA

gatewayServer:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/dvara-bedrock

secrets:
  bedrockEnabled: "true"

Multi-Region Deployment

Deploy an instance in a specific region:

# us-east-values.yaml
gatewayServer:
  gatewayMode: full
  region:
    id: us-east-1
    name: US East

secrets:
  providerKeys:
    openai: sk-...

helm install dvara-us-east charts/dvara/ -f us-east-values.yaml

When region.id is set, the chart adds GATEWAY_REGION_ID and GATEWAY_REGION_NAME environment variables and a dvara.ai/region pod label for topology-aware scheduling.
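For more than one region, the same install can be repeated with different region values. The sketch below prints one helm command per region instead of running it, so each can be reviewed and executed against the right cluster. The region list and the dvara-<region id> release naming are illustrative assumptions, not chart requirements.

```shell
#!/usr/bin/env bash
# Hypothetical region list: "<region id>|<display name>", one per line.
regions="us-east-1|US East
eu-west-1|EU West"

commands=$(
  echo "$regions" | while IFS='|' read -r id name; do
    # Printed rather than executed; switch kubectl context per cluster before running.
    echo "helm install dvara-${id} charts/dvara/ --set gatewayServer.gatewayMode=full --set gatewayServer.region.id=${id} --set \"gatewayServer.region.name=${name}\""
  done
)

echo "$commands"
```

Provider keys are left out of the printed commands on purpose; pass them with a values file as in the example above.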

Prometheus Monitoring

gatewayServer:
  serviceMonitor:
    enabled: true
    interval: 15s
    additionalLabels:
      release: prometheus-stack

The ServiceMonitor scrapes /actuator/prometheus. Requires the Prometheus Operator CRDs to be installed.

Security

The chart applies security hardening by default:

  • Non-root execution: pods run as UID 1001 (runAsNonRoot: true)
  • Read-only root filesystem: readOnlyRootFilesystem is true, with a /tmp emptyDir for JVM temp files
  • No privilege escalation: allowPrivilegeEscalation is false
  • Capabilities dropped: all Linux capabilities are dropped
  • No service account token: automountServiceAccountToken is false (no Kubernetes API access needed)
  • Optional secret key refs: pods start even if only some provider keys are configured

Upgrading

# From OCI registry
helm upgrade dvara oci://ghcr.io/kdhrubo/dvara/charts/dvara -f my-values.yaml

# From local chart
helm upgrade dvara charts/dvara/ -f my-values.yaml

Pods automatically restart when secrets or ConfigMap content changes (via checksum annotations on the pod template).

Running Helm Tests

helm test dvara

This runs test pods that verify gateway-server, gateway-ui, and (if enabled) mcp-proxy-server services are reachable.

Uninstalling

helm uninstall dvara

Troubleshooting

Pods stuck in CrashLoopBackOff

Check logs for JVM startup errors:

kubectl logs deployment/dvara-server

Common causes:

  • Insufficient memory — increase resources.limits.memory
  • Missing secret keys — verify the Secret exists: kubectl get secret dvara -o yaml

Startup probe fails

The startup probe allows roughly 60 seconds (5s initial delay, then up to 12 attempts at 5s intervals) for JVM warmup. If your image is large or the node is slow, raise the startup probe's failureThreshold:

gatewayServer:
  startupProbe:
    failureThreshold: 20

Services not reachable

# Check pod status
kubectl get pods -l app.kubernetes.io/component=gateway-server

# Check service endpoints
kubectl get endpoints dvara-server

# Port-forward to test directly
kubectl port-forward svc/dvara-server 8080:8080
curl http://localhost:8080/status

Providers not registering

Provider keys must be non-empty strings. Check the Secret:

kubectl get secret dvara -o jsonpath='{.data.openai-api-key}' | base64 -d

An empty string means the provider is disabled, which is expected for unused providers.
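The same rule can be applied to several keys at once. The helper below is a sketch: provider_state is a hypothetical function name, and it takes the base64-encoded value as stored in the Secret's data field and reports whether it decodes to a non-empty string.

```shell
#!/usr/bin/env bash
# Classify a base64-encoded Secret value: an empty decoded value
# means the provider is disabled.
provider_state() {
  decoded=$(printf '%s' "$1" | base64 -d 2>/dev/null)
  if [ -n "$decoded" ]; then
    echo "configured"
  else
    echo "disabled"
  fi
}

# Literal examples; in a cluster the argument would come from
#   kubectl get secret dvara -o jsonpath='{.data.openai-api-key}'
provider_state "c2stdGVzdA=="   # base64 of the placeholder "sk-test"
provider_state ""
```

Because the helper never prints the decoded value, it avoids echoing real keys to the terminal.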