Multi-Tenancy
DVARA is a shared, multi-tenant AI governance platform. Every tenant runs on the same fleet of data-plane and control-plane nodes. Isolation is logical — scoped by tenant on every request — not physical. There is no schema per tenant, no pod per tenant, and no runtime sharding. This page explains what is isolated, what is shared, and how configuration composes.
The tenant model
A tenant has a name, a region, a status (ACTIVE or SUSPENDED), and a small key-value metadata map that tunes governance behaviour for that tenant alone (see Per-tenant configuration). Suspended tenants have their requests rejected immediately — audit events, costs, and token usage stay intact so you can still investigate or bill for prior activity.
Tenants don't own infrastructure. Every request resolves to a specific tenant before any downstream stage runs, and every read or write that follows is scoped to that tenant.
How a request is resolved to a tenant
Every data-plane request carries an Authorization: Bearer <key> header. The DVARA API key is an opaque random token — not a JWT, no claims, no parseable payload. The server stores only a SHA-256 hash of the key; the plaintext is shown once at creation and never again. Each key belongs to exactly one tenant, and hash collisions are rejected at the database, so the same token cannot exist as two separate keys.
When a request arrives:
- The gateway hashes the token and looks it up in a distributed cache shared across every node. Cache hits resolve in under a millisecond, and cached entries carry a 30-second TTL (PostgreSQL is queried only on a miss). Revoking, deleting, or updating a key evicts the cache entry immediately, so stale keys are never accepted.
- The resolved tenant identity is attached to the request for everything downstream: rate limiting, policy evaluation, budget checks, audit events, cost attribution, and structured logs.
You do not need to run a separate cache service. The distributed cache is built in and auto-clusters across nodes. See Caching.
Because keys are opaque, revocation is a single database row flip that takes effect immediately across the fleet. A JWT, by contrast, stays valid until expiry unless every verifier consults a blacklist, which defeats the point of a self-contained token.
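The lookup path described above can be sketched as follows. This is a minimal illustration, not the gateway's implementation: `KeyResolver`, `db_lookup`, and the in-process dictionary are hypothetical stand-ins for the built-in distributed cache and the PostgreSQL lookup. Only the SHA-256 hashing, the 30-second TTL, and eviction on revocation come from the text.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class KeyResolver:
    """Resolve a bearer token to a tenant id via a TTL cache over a hash lookup."""

    def __init__(self, db_lookup: Callable[[str], Optional[str]],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._db_lookup = db_lookup  # hypothetical: key hash -> tenant id or None
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # hash -> (tenant, expiry)

    @staticmethod
    def hash_token(token: str) -> str:
        # Only the SHA-256 digest is stored or compared, never the plaintext.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def resolve(self, token: str) -> Optional[str]:
        key_hash = self.hash_token(token)
        entry = self._cache.get(key_hash)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # cache hit, no database round trip
        tenant_id = self._db_lookup(key_hash)  # miss or expired: ask the database
        if tenant_id is None:
            self._cache.pop(key_hash, None)
            return None
        self._cache[key_hash] = (tenant_id, self._clock() + self._ttl)
        return tenant_id

    def evict(self, key_hash: str) -> None:
        # Called on revoke, delete, or update so a stale entry is never served.
        self._cache.pop(key_hash, None)
```

The resolved tenant id is what the downstream stages (rate limiting, policy, budgets, audit) would receive; a `None` result corresponds to an unknown or revoked key.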
By default, the data plane accepts requests with no bearer token and treats them as anonymous — no tenant is attached. In a multi-tenant deployment, that creates a silent escape hatch from tenant-scoped policies, budgets, rate limits, cost attribution, and audit. Set DVARA_LLM_GATEWAY_REQUIRE_API_KEY=true so keyless requests are rejected with HTTP 401.
3-level configuration hierarchy
Many features cascade across three levels — global, tenant, and API key. Every applicable level is evaluated; the strictest or most specific level wins.
Global              ← applies to every tenant
└─ Tenant           ← applies to every key in one tenant
   └─ API Key       ← applies to one key in one tenant
- Budget caps stack at all three levels. A global "$5,000/month" plus a tenant "$500/month" plus an API key "$50/day" all apply, and the request is allowed only if it fits under every applicable cap.
- Policies have global and tenant scopes. Both run on every request; the strictest decision wins.
- Rate limits are keyed by API key, which transitively scopes them to the tenant.
Not every feature has all three levels. Where tenant-level config isn't supported, the global default applies to every tenant.
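The stacking and strictest-wins rules above can be illustrated with a short sketch. The `BudgetCap` and `Decision` types, and the ordering of decision values, are assumptions made for the example; the source states only that every applicable cap must be satisfied and that the strictest policy decision wins.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable


@dataclass(frozen=True)
class BudgetCap:
    scope: str        # "global", "tenant", or "api-key"
    limit_usd: float  # cap for the window this scope uses
    spent_usd: float  # spend already recorded in that window


def fits_all_caps(caps: Iterable[BudgetCap], request_cost_usd: float) -> bool:
    """Allow the request only if it fits under every applicable cap."""
    return all(c.spent_usd + request_cost_usd <= c.limit_usd for c in caps)


class Decision(IntEnum):
    # Hypothetical ordering: a higher value is a stricter outcome.
    ALLOW = 0
    WARN = 1
    DENY = 2


def strictest(decisions: Iterable[Decision]) -> Decision:
    """Combine global and tenant policy results; the strictest one wins."""
    return max(decisions, default=Decision.ALLOW)
```

With a global cap of $5,000, a tenant cap of $500 that is nearly used up, and an API key cap of $50, a request is refused as soon as it would exceed the tenant cap, even though the other two caps still have room.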
Per-tenant configuration
Per-tenant configuration is a small key-value map on each tenant. It overrides global defaults without a config reload.
Two ways to set it:
- DVARA Flightdeck: go to Tenants → Edit. Form tabs cover the Basic, PII, Guardrails, and Phileas keys. See Tenants & Users for the walkthrough.
- Automation API: PUT /v1/admin/tenants/{id}. Intended for GitOps and IaC pipelines. Covers every key in the table below. The API merges into existing metadata, so keys not in the request body are preserved.
Changes take effect on the next request. Keys are flat dot-separated strings; values are strings, booleans, numbers, or JSON-encoded maps.
| Key (or prefix) | Type | Feature |
|---|---|---|
| priority-tier | premium / standard / bulk | Priority admission control |
| pat.max-ttl-days | int (≤ 365) | Narrow the platform-wide personal access token TTL ceiling for this tenant |
| pii.enabled, pii.action, pii.scan-responses, pii.scan-streaming-responses, pii.custom-patterns | various | PII Detection |
| phileas.enabled-filters | CSV of filter names | Per-tenant Phileas PII filter restriction |
| guardrail.enabled, guardrail.action, guardrail.risk-score-threshold | various | Guardrails |
| guardrail.max-input-tokens, guardrail.max-messages-per-request, guardrail.max-message-length, guardrail.default-max-response-tokens | int | OWASP LLM10 input-size overrides |
| guardrail.scan-streaming-responses | boolean | Streaming content filter toggle |
| guardrail.content.* (profanity / violence / sexual / competitor / topic / custom denylist) | various | Per-category content filtering |
| guardrail.injection.custom-patterns, guardrail.mcp-injection.enabled, guardrail.mcp-injection.action | various | Prompt injection and MCP tool-injection detection |
| guardrail.context.warning-threshold-pct, guardrail.context.hard-threshold-pct, guardrail.context.pruning-strategy | int / enum | Context window management |
| guardrail.plugins | JSON map | Per-tenant guardrail plugin overrides |
| grounding.enabled, grounding.action, grounding.max-sources, grounding.max-source-length | various | Hallucination Detection |
| cost.downgrade-threshold-pct, cost.downgrade-rules, cost.anomaly-threshold-pct | various | Model downgrade and FinOps anomaly sensitivity |
| ip-access.allowlist, ip-access.denylist | CSV of CIDRs | Per-tenant IP access control (CIDR ranges) |
| approval.required-tools, approval.required-servers, approval.timeout-seconds, approval.default-action | various | MCP approval gates |
| agentic.loop-detection.* | various | Per-tenant loop detection tuning |
| audit.store-prompts | boolean | Opt-in prompt storage for compliance |
curl -X PUT http://localhost:8090/v1/admin/tenants/acme \
  -H "Authorization: Bearer <admin-pat>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Corp",
    "status": "ACTIVE",
    "metadata": {
      "priority-tier": "premium",
      "pii.enabled": "true",
      "pii.action": "REDACT",
      "cost.downgrade-threshold-pct": "90"
    }
  }'
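The merge behaviour of the Automation API (keys absent from the request body are preserved) can be pictured with the sketch below. `merge_metadata` is a hypothetical helper written for illustration, not a function of the platform. How a key is removed is not covered by the text above and is left out here.

```python
from typing import Any, Dict


def merge_metadata(existing: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return the stored metadata with the request's keys applied on top.

    Keys that are absent from the patch keep their stored value; keys that
    are present replace it. Neither input is modified.
    """
    merged = dict(existing)
    merged.update(patch)
    return merged


stored = {"priority-tier": "standard", "pii.enabled": "true"}
request = {"priority-tier": "premium", "cost.downgrade-threshold-pct": "90"}
result = merge_metadata(stored, request)
```

Here `result` keeps `pii.enabled` from the stored map, takes the new `priority-tier`, and gains the `cost.downgrade-threshold-pct` key.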
What is isolated and what is shared
Three scopes exist across the platform:
- Per-tenant — belongs to exactly one tenant; queries always filter by tenant
- Global + tenant — configuration can exist at either scope; tenant rows override the global default
- Shared — platform-wide, no tenant concept at all
| Resource | Scope |
|---|---|
| Audit events, cost records, token usage, compliance and chargeback reports | Per-tenant |
| API keys, users, webhooks | Per-tenant |
| MCP servers, tool calls, prompt templates and experiments | Per-tenant |
| Policies | Global + tenant (both evaluated) |
| Budget caps | Global + tenant + per-API-key (stacked) |
| Provider credentials | Global (platform-default) + tenant (BYOK) |
| Guardrail plugins | Global + tenant (tenant overrides) |
| Routes, model pricing, output schemas, semantic cache configs, eval and golden prompts, latency records | Shared |
Rule of thumb: if it affects billing, isolation, or audit, it's tenant-scoped; if it describes how the gateway itself behaves (routing, pricing, cache tuning), it's shared.
BYOK: provider credential model
Provider API keys (OpenAI, Anthropic, Azure, and so on) resolve through a four-step chain on every outbound call — the first step that returns a key wins:
1. Tenant credential stored in DVARA Flightdeck for this specific tenant, active and scoped.
2. Platform-default credential stored with no tenant assigned: a fallback for tenants that have not set up their own BYOK, and for dev / staging environments.
3. Vault: HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault, if configured.
4. Environment variable: OPENAI_API_KEY, ANTHROPIC_API_KEY, and similar variables on the gateway container.
Every lookup is keyed by both the provider name and the tenant, so one tenant's cached credential is never served to another.
Each tenant can bring their own provider keys and be billed directly by the provider; rotations are scoped to one tenant; a compromised key doesn't affect anyone else.
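The first-match-wins chain can be sketched as an ordered list of resolvers. Everything here is illustrative: the dictionaries stand in for the credential store, the vault, and the process environment, and the `<PROVIDER>_API_KEY` naming pattern is an assumption generalized from the two variables named above.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each resolver takes (provider, tenant_id) and returns a key or None.
Resolver = Callable[[str, Optional[str]], Optional[str]]


def build_chain(tenant_creds: Dict[Tuple[str, str], str],
                platform_creds: Dict[str, str],
                vault: Dict[str, str],
                env: Dict[str, str]) -> List[Tuple[str, Resolver]]:
    """Assemble the four steps in resolution order (hypothetical stores)."""
    return [
        ("tenant", lambda p, t: tenant_creds.get((p, t))),
        ("platform-default", lambda p, t: platform_creds.get(p)),
        ("vault", lambda p, t: vault.get(p)),
        ("environment", lambda p, t: env.get(f"{p.upper()}_API_KEY")),
    ]


def resolve_provider_key(provider: str, tenant_id: Optional[str],
                         chain: List[Tuple[str, Resolver]]
                         ) -> Optional[Tuple[str, str]]:
    """Walk the chain in order; the first step that returns a key wins."""
    for step_name, resolver in chain:
        key = resolver(provider, tenant_id)
        if key:
            return step_name, key
    return None
```

Returning the step name alongside the key makes it easy to see, per call, whether a tenant used its own credential or fell through to a shared one.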
Require tenant-scoped credentials
In a strict multi-tenant deployment, you usually don't want tenants to silently borrow the platform-default credential — provider billing, rate limits, and key blast radius all collapse onto a single upstream key.
Set the environment variable DVARA_CREDENTIALS_REQUIRE_TENANT_CREDENTIAL=true (or the YAML property dvara.credentials.require-tenant-credential: true) and the resolution chain stops at step 1. A tenant without an active credential of its own receives HTTP 403 tenant_credential_required instead of falling through to the platform-default. Each rejection emits a PROVIDER_CREDENTIAL_MISSING audit event so compliance and FinOps can spot tenants that need to onboard their keys before traffic ramps.
For tiered SaaS — free tenants borrow the platform-default, paid tenants must BYOK — set a per-tenant override on the tenant row:
{
  "metadata": {
    "credentials.require-tenant-credential": false
  }
}
The per-tenant value accepts a boolean, a number (0 / 1), or a YAML 1.1 truthy / falsy string (yes / no, on / off, 1 / 0, case-insensitive). Unrecognized values fall back to the global property and log a startup warning so a typo doesn't quietly disable enforcement.
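A sketch of how such a value might be interpreted is shown below. The function name and the exact token sets are assumptions: the text lists yes / no, on / off, and 1 / 0, and the sketch also accepts the strings true / false, which YAML 1.1 treats as booleans. The real parser may differ in details such as whitespace handling.

```python
import logging
from typing import Any

logger = logging.getLogger("byok-sketch")  # hypothetical logger name

_TRUTHY = {"true", "yes", "on", "1"}
_FALSY = {"false", "no", "off", "0"}


def effective_require_tenant_credential(tenant_value: Any, global_default: bool) -> bool:
    """Resolve the per-tenant override, falling back to the global property."""
    if tenant_value is None:
        return global_default  # no override set on the tenant row
    if isinstance(tenant_value, bool):  # check bool before int: bool is an int subclass
        return tenant_value
    if isinstance(tenant_value, (int, float)) and tenant_value in (0, 1):
        return bool(tenant_value)
    if isinstance(tenant_value, str):
        text = tenant_value.strip().lower()
        if text in _TRUTHY:
            return True
        if text in _FALSY:
            return False
    # Unrecognized value: warn and keep the global setting, so a typo
    # cannot quietly disable enforcement.
    logger.warning("unrecognized override %r; using global default", tenant_value)
    return global_default
```

The fallback branch is the important design point: a misspelled value such as "flase" leaves the global setting in force instead of being read as false.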
Two operational notes worth flagging:
- Pair with DVARA_LLM_GATEWAY_REQUIRE_API_KEY=true. Strict-BYOK only fires when the request arrives with a tenant identity. Keyless anonymous traffic carries no tenant attribute and bypasses enforcement; the gateway logs a startup warning when the two switches are out of sync.
- Rejection happens late. A tenant under enforcement who lacks their own credential still consumes their per-key rate-limit window and budget for the failed call, since the gate runs inside the upstream HTTP path after policy and budget filters. Size rate-limit and budget defaults with a little headroom for onboarding rejections.
Storage modes
Each credential is stored one of two ways:
- Encrypted (default): the key is encrypted at rest with AES-256-GCM. The encryption key is derived from DVARA_ENCRYPTION_MASTER_PASSWORD using PBKDF2-HMAC-SHA256, so the master password is required whenever any credential uses this mode.
- Reference: only a vault pointer is stored in the database; the live secret never lands on the gateway at rest. Lookups go to HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault, and are cached in-process for 5 minutes by default (dvara.vault.cache-ttl-seconds), so steady-state traffic does not hit the vault per request. A deployment using only reference credentials can omit the master password entirely.
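To make the encrypted mode's key derivation concrete, the sketch below derives a 32-byte key (the size AES-256 needs) from a password with PBKDF2-HMAC-SHA256, using Python's standard library. The salt handling and the iteration count are assumptions; the source does not state which values DVARA uses, and the AES-256-GCM encryption step itself is omitted.

```python
import hashlib
import os


def derive_encryption_key(master_password: str, salt: bytes,
                          iterations: int = 600_000) -> bytes:
    """Derive a 256-bit key from the master password.

    The salt and iteration count are illustrative choices, not values
    taken from the platform.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        master_password.encode("utf-8"),
        salt,
        iterations,
        dklen=32,  # 32 bytes = 256 bits, the AES-256 key size
    )


# Example: a random per-deployment salt (assumed), stored alongside the data.
example_salt = os.urandom(16)
```

The derivation is deterministic for a given password, salt, and iteration count, which is why the same master password must be available whenever an encrypted credential is read.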
Full walkthrough: Credentials & BYOK.
RBAC: platform vs tenant scope
When RBAC is enabled (on by default whenever authentication is enabled), DVARA splits roles into three platform roles and three tenant roles:
| Role | Scope | Example user |
|---|---|---|
| owner | Platform: full access everywhere | Platform lead |
| policy-admin | Platform: policies, routes, guardrails, MCP servers | Security engineer |
| billing-admin | Platform: pricing, costs, budgets, chargeback, compliance reports | FinOps |
| admin | Tenant: full access within own tenant + user management | Team lead |
| developer | Tenant: create API keys, view usage, dry-run policies | Application developer |
| viewer | Tenant: read-only access within own tenant | Auditor |
Mixing platform and tenant roles on one user is disallowed. A user is either platform staff (can see any tenant's data) or a tenant user (can see only their own tenant's). This keeps audit attribution clean: every admin action traces unambiguously to one side or the other, with no middle ground of a user who sometimes acts across tenants and sometimes inside one.
One shared validator (RoleMixValidator in gateway-core) enforces the rule at every sign-in surface:
- User creation and role update (Admin REST and Console form).
- Every OIDC JWT: the converter rejects mixed-role claim sets with HTTP 401 invalid_token.
- Every SAML assertion: the handler redirects to /login?error=saml_role_mix_invalid and refuses to seat the session.
A misconfigured IdP cannot ship a mixed-role claim set into the security context.
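The role-mix rule reduces to a set check, sketched below. This is not the RoleMixValidator source: the function, the exception, and the handling of unrecognized role names are assumptions. Only the six role names and the no-mixing rule come from the table and text above.

```python
from typing import Iterable

PLATFORM_ROLES = frozenset({"owner", "policy-admin", "billing-admin"})
TENANT_ROLES = frozenset({"admin", "developer", "viewer"})


class RoleMixError(ValueError):
    """Raised when a role set spans both the platform and tenant scopes."""


def validate_role_mix(roles: Iterable[str]) -> str:
    """Return 'platform', 'tenant', or 'none'; raise on a mixed set.

    Role names outside the six known roles are ignored here, which is an
    assumption of this sketch.
    """
    role_set = set(roles)
    has_platform = bool(role_set & PLATFORM_ROLES)
    has_tenant = bool(role_set & TENANT_ROLES)
    if has_platform and has_tenant:
        raise RoleMixError(f"mixed platform and tenant roles: {sorted(role_set)}")
    if has_platform:
        return "platform"
    if has_tenant:
        return "tenant"
    return "none"
```

Running the same check at user creation, role update, and on every identity-provider claim set is what keeps a mixed set from entering through any one surface.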
Platform users land on DVARA Flightdeck at /; tenant users are auto-redirected to the DVARA Portal at /portal after login.
Admin API tenant scoping
Most admin list and summary endpoints accept an optional ?tenant_id= query parameter:
GET /v1/admin/budgets?tenant_id=acme
GET /v1/admin/policies?tenant_id=acme
GET /v1/admin/costs/summary?tenant_id=acme&from=2026-03-01&to=2026-03-31
The scope rule is enforced consistently across every admin list, summary, write, and path-variable endpoint that accepts a tenant id from the client:
- A platform owner, policy-admin, or billing-admin may pass any tenant id, or omit it for a cross-tenant view.
- A tenant admin, developer, or viewer may only act on their own tenant. Omitting tenant_id narrows the result to their own tenant. Passing a different tenant's id is rejected with HTTP 403: the request is refused, not quietly narrowed, so the caller notices the misuse. A TENANT_SCOPE_VIOLATION audit event is written on every rejection so security teams can detect probing.
The same rule applies to write paths that carry tenantId in a request body or path variable. Enforcement happens at parameter binding — the controller never sees a tenant id the caller is not authorized to act on, regardless of whether the value arrived as a query parameter, a JSON field, or a URL segment.
# Tenant-admin PAT for tenant 'acme' tries to read globex's budgets
curl -H "Authorization: Bearer dvara_pat_..." \
  "http://localhost:8090/v1/admin/budgets?tenant_id=globex"
# → HTTP 403
# {"error":{"type":"forbidden_error","code":"access_denied",
#   "message":"Tenant-scoped role may not act on tenant 'globex' — caller is scoped to 'acme'."}}

# Same caller, no tenant_id: narrowed to their own tenant, returns 200
curl -H "Authorization: Bearer dvara_pat_..." \
  "http://localhost:8090/v1/admin/budgets"
# → HTTP 200 with only acme's budgets
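The scope rule behind these two responses can be written as one small function. It is a sketch under stated assumptions: `resolve_tenant_scope` and `TenantScopeViolation` are hypothetical names, and the real enforcement happens at parameter binding inside the gateway, not in caller code.

```python
from typing import Iterable, Optional

PLATFORM_ROLES = frozenset({"owner", "policy-admin", "billing-admin"})


class TenantScopeViolation(PermissionError):
    """Maps to the HTTP 403 response and the TENANT_SCOPE_VIOLATION audit event."""


def resolve_tenant_scope(roles: Iterable[str],
                         caller_tenant: Optional[str],
                         requested_tenant: Optional[str]) -> Optional[str]:
    """Return the tenant id to filter by, or None for a cross-tenant view."""
    if PLATFORM_ROLES & set(roles):
        # Platform staff: any tenant id is accepted; None means all tenants.
        return requested_tenant
    if requested_tenant is None:
        # Tenant user without tenant_id: narrowed to their own tenant.
        return caller_tenant
    if requested_tenant != caller_tenant:
        # Refused outright, never silently narrowed.
        raise TenantScopeViolation(
            f"Tenant-scoped role may not act on tenant '{requested_tenant}'; "
            f"caller is scoped to '{caller_tenant}'."
        )
    return caller_tenant
```

Raising instead of narrowing is the deliberate choice here: the caller learns about the misuse, and each refusal leaves an audit trail.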
Managing tenants
Creating a tenant
curl -X POST http://localhost:8090/v1/admin/tenants \
  -H "Authorization: Bearer <admin-pat>" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "acme",
    "name": "Acme Corp",
    "status": "ACTIVE",
    "region": "us-east-1"
  }'
You can also create a tenant interactively from DVARA Flightdeck under Tenants → New Tenant; see Tenants & Users for the UI walkthrough. After creation, issue at least one API key for the tenant so applications can start sending traffic.
Suspending vs deleting
- Suspend (reversible): set status: SUSPENDED. New requests are rejected with a clear error code, but audit events, cost records, and token usage stay intact. You can re-activate later without losing history. Right for free-trial expirations, billing disputes, and short-term offboarding.
- Delete (irreversible): see the surface table below for who can call which endpoint. The endpoint cascades: the per-tenant rows in 13 data tables (API keys, policies, budgets, webhooks, MCP servers, prompt templates / experiments, token usage, cost records, shadow policy events, chargeback / compliance reports, provider credentials, guardrail plugins) are cleared via TenantDataPurger, then users + api_keys (the two FK-constrained identity tables) are cleared, then the tenant row. The TENANT_DELETED audit event payload includes data_rows_deleted / users_deleted / keys_deleted counts so operators can confirm the cascade scope. Append-only audit tables (audit_events, signed_audit_envelopes, mcp_tool_calls) are retained by design: they reference tenantId as a string column, not a FK, so they survive the delete as historical records you can still investigate or export from. Platform-global rows (tenant_id IS NULL in provider_credentials, guardrail_plugins, policies, budget_caps) are never touched.
Two surfaces can delete a tenant. They differ on who can call them, when, and whether the deletion is recoverable:
| Surface | Available in | Role required | Behaviour |
|---|---|---|---|
| DELETE /v1/admin/tenants/{id} (Admin REST / Console) | Self-managed + SaaS | Platform owner only | Immediate. Cascades through the data + identity tables described above. No recovery. |
| POST /portal/account/delete (tenant portal) | SaaS only | Tenant admin only | Scheduled: sets the status to SUSPENDED and stamps a selfDeletePurgeAt timestamp offboard.graceDays days in the future. Blocked while a paid subscription is active (cancel billing first). Fully recoverable via POST /portal/account/delete/cancel any time before the grace window expires; after that, SelfDeleteReaper runs the same purge. |
Other platform roles (policy-admin, billing-admin) cannot delete a tenant on either surface. Other tenant roles (developer, viewer) cannot self-delete from the portal. Self-managed installs only expose the first surface — the portal self-delete page is gated behind @ConditionalOnSaasMode and returns 404 outside SaaS.
The SaaS self-delete path runs through FlightdeckAdminFacade.purgeTenant, which uses the same TenantDataPurger under the hood plus an additional PII-token purge for compliance-grade right-to-be-forgotten.
For permanent offboarding with a compliance requirement to purge PII, combine the admin DELETE with an explicit PII token purge (DELETE /v1/admin/pii/tokens/<tenantId>) — see PII Detection. The SaaS self-delete path already includes this purge.
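The grace-window behaviour of the portal self-delete surface can be modelled as three small state transitions. This is a sketch, not the portal's code: the `Tenant` dataclass and function names are hypothetical, and restoring the status to ACTIVE on cancel is an assumption (the source says only that the deletion is fully recoverable).

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Tenant:
    id: str
    status: str = "ACTIVE"
    self_delete_purge_at: Optional[datetime] = None


def schedule_self_delete(tenant: Tenant, now: datetime, grace_days: int,
                         has_paid_subscription: bool) -> Tenant:
    """Suspend the tenant and stamp the purge time; blocked while billing is active."""
    if has_paid_subscription:
        raise RuntimeError("cancel billing before deleting the account")
    purge_at = now + timedelta(days=grace_days)
    return replace(tenant, status="SUSPENDED", self_delete_purge_at=purge_at)


def cancel_self_delete(tenant: Tenant, now: datetime) -> Tenant:
    """Undo a scheduled deletion while the grace window is still open."""
    if tenant.self_delete_purge_at is None or now >= tenant.self_delete_purge_at:
        raise RuntimeError("no pending deletion inside the grace window")
    return replace(tenant, status="ACTIVE", self_delete_purge_at=None)  # assumed


def is_due_for_purge(tenant: Tenant, now: datetime) -> bool:
    """What a reaper job would check before running the purge."""
    return tenant.self_delete_purge_at is not None and now >= tenant.self_delete_purge_at


start = datetime(2026, 3, 1, tzinfo=timezone.utc)
pending = schedule_self_delete(Tenant("acme"), start, grace_days=30,
                               has_paid_subscription=False)
```

In this example `pending` is suspended with a purge time 30 days after `start`; a cancel call any time before that moment restores the tenant, and after it the purge check returns true.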
Related
- DVARA Flightdeck — Tenants & Users — UI walkthrough for tenant CRUD and API key management
- DVARA Flightdeck — Credentials & BYOK — full credential management flow, including reference storage mode
- Configuration — property reference for every feature that has per-tenant overrides