Mock Provider
The Mock provider returns configurable fake completions without calling any upstream API. It's useful for integration tests, CI pipelines, load tests, and local development without real API keys.
Model prefix: mock/
1. Activate
Set a single env var — that's it:
export MOCK_PROVIDER_ENABLED=true
With only this, any request against a mock/* model returns the default fake response immediately. No YAML, no mount, no restart loop — just enable and go.
The Mock provider ships Groovy for scenario scripting, which means anyone who flips MOCK_PROVIDER_ENABLED=true on a production profile can execute arbitrary code inside the gateway JVM. The gateway fires a startup WARN whenever Mock is enabled outside the dev / test / ci / local / default Spring profile allow-list. Treat that warning as an alert, not a footnote — never run Mock on production traffic.
curl -s -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "mock/any", "messages": [{"role": "user", "content": "hi"}]}'
The response carries "content": "This is a mock response", finish_reason: stop, and a synthetic token count so your cost and usage dashboards still populate.
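The same smoke test can be driven from a test harness. Below is a minimal Python sketch, assuming the gateway listens on localhost:8080 as in the curl call. The response layout (`choices[0].message.content`, `choices[0].finish_reason`, `usage.total_tokens`) is an assumed OpenAI-style shape that this page does not spell out, and the helper names are hypothetical.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080/v1/chat/completions"  # same endpoint as the curl call


def build_request(model: str, text: str) -> urllib.request.Request:
    """Build the POST request that the curl command sends."""
    body = {"model": model, "messages": [{"role": "user", "content": text}]}
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(payload: dict) -> tuple:
    """Pull out the fields a smoke test usually asserts on.

    Assumes an OpenAI-style response layout; adjust the keys if yours differs.
    """
    choice = payload["choices"][0]
    tokens = payload.get("usage", {}).get("total_tokens")
    return choice["message"]["content"], choice["finish_reason"], tokens


def call_mock(model: str = "mock/any", text: str = "hi") -> tuple:
    """Send the request to a running gateway and summarize the reply."""
    with urllib.request.urlopen(build_request(model, text), timeout=5) as resp:
        return summarize(json.load(resp))
```

Only `call_mock` needs a running gateway; `build_request` and `summarize` can be unit-tested offline.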
2. Simple tuning via env vars
Every scalar field on the Mock provider can be set from an environment variable — no config file required. Useful in CI, where the whole point is "don't ship a YAML file with the pipeline":
| Env var | Default | What it does |
|---|---|---|
| MOCK_PROVIDER_ENABLED | false | Activates the Mock provider. Required. |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_RESPONSE | "This is a mock response" | Static text returned for every mock/* request that has no per-model override. |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_LATENCY_MS | 100 | Simulated end-to-end latency per call, in milliseconds. Set to 0 for CI. |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_STREAM_TOKEN_DELAY_MS | 20 | Inter-token delay when the client requests streaming. Set to 0 for CI. |
| DVARA_LLM_GATEWAY_PROVIDERS_MOCK_ERROR_RATE | 0.0 | Probability in [0.0, 1.0] that a given request returns PROVIDER_ERROR. Used for testing client retry logic. |
Example — a tight CI profile with zero latency and 10% injected failures:
export MOCK_PROVIDER_ENABLED=true
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_RESPONSE="CI mock response"
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_LATENCY_MS=0
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_STREAM_TOKEN_DELAY_MS=0
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_ERROR_RATE=0.1
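A CI job may want to sanity-check what it exported before starting the gateway. The sketch below is a hypothetical pre-flight helper in Python that mirrors the variable names and defaults from the table above. It is not the gateway's own config binding, and the range check on the error rate is an assumption about what a sensible validator would do.

```python
import os
from dataclasses import dataclass

PREFIX = "DVARA_LLM_GATEWAY_PROVIDERS_MOCK_"


@dataclass(frozen=True)
class MockSettings:
    enabled: bool
    response: str
    latency_ms: int
    stream_token_delay_ms: int
    error_rate: float


def load_mock_settings(env=None) -> MockSettings:
    """Read the Mock env vars, falling back to the documented defaults."""
    env = os.environ if env is None else env
    error_rate = float(env.get(PREFIX + "ERROR_RATE", "0.0"))
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError(f"error rate must be in [0.0, 1.0], got {error_rate}")
    return MockSettings(
        enabled=env.get("MOCK_PROVIDER_ENABLED", "false").lower() == "true",
        response=env.get(PREFIX + "RESPONSE", "This is a mock response"),
        latency_ms=int(env.get(PREFIX + "LATENCY_MS", "100")),
        stream_token_delay_ms=int(env.get(PREFIX + "STREAM_TOKEN_DELAY_MS", "20")),
        error_rate=error_rate,
    )
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test.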
Error injection: error-rate: 0.1 (set above through DVARA_LLM_GATEWAY_PROVIDERS_MOCK_ERROR_RATE) means roughly 10% of requests return PROVIDER_ERROR. The Mock provider is excluded from circuit-breaker wrapping, so simulated errors never trip the breaker and the provider stays available for the next test. The result is a fail-on-demand provider your test harness can rely on.
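To show what "testing client retry logic" can look like, here is a minimal Python sketch of a retry wrapper exercised against a fake provider call that fails with a configurable probability. `ProviderError`, `make_flaky_call`, and `call_with_retry` are hypothetical names; the fake call only imitates the Mock provider's error rate and does not talk to the gateway.

```python
import random


class ProviderError(Exception):
    """Stand-in for a PROVIDER_ERROR response from the gateway."""


def make_flaky_call(error_rate: float, rng: random.Random):
    """Return a fake provider call that fails with the given probability."""
    def call() -> str:
        if rng.random() < error_rate:
            raise ProviderError("PROVIDER_ERROR")
        return "CI mock response"
    return call


def call_with_retry(call, max_attempts: int = 3) -> str:
    """Retry on ProviderError, re-raising once the attempts are used up."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ProviderError:
            if attempt == max_attempts:
                raise
```

If failures are independent, a 0.1 error rate with three attempts leaves a 0.1³ = 0.001 chance that all attempts fail, so a test suite should still expect the occasional exhausted retry.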
3. Wiremock-style conditional matchers
Real test scenarios rarely depend only on the model name. A "billing question" mock should respond when the user's message contains the word "billing"; a streaming smoke test should match when stream=true; an error-injection scenario should fire only on a specific tenant. Matching on the model alone cannot express any of this.
DVARA's Mock provider supports an ordered list of wiremock-style matchers, where each matcher has a Groovy predicate (when) evaluated against the incoming request and a response (response) that's either static text or a dynamic Groovy expression. Matchers run in declaration order; the first match wins; if nothing matches, the gateway falls through to the default response.
Mount a small config file with SPRING_CONFIG_ADDITIONAL_LOCATION and point the gateway at it. Create mock-override.yml next to your compose file:
dvara:
  llm-gateway:
    providers:
      mock:
        enabled: true
        response: "Default fake response"   # final fall-through
        matchers:
          # 1. Match on the user message text
          - name: billing-question
            when: "request.messages.last().textContent().toLowerCase().contains('billing')"
            response: "Your current balance is $42.50. Anything else?"
          # 2. Match on streaming flag
          - name: streaming-smoke-test
            when: "request.stream"
            response: "This response will be streamed token by token"
          # 3. Match on temperature with a Groovy-computed response
          #    (shuffled() returns a new list; shuffle() mutates in place and returns nothing)
          - name: high-temperature-creative
            when: "request.temperature != null && request.temperature > 0.9"
            response: "groovy: ['cats', 'dogs', 'birds'].shuffled().first()"
          # 4. Inject errors for one specific model (handy for testing retry logic)
          - name: quota-exceeded-simulation
            when: "request.model == 'mock/quota-exceeded'"
            response: "groovy: throw new RuntimeException('quota exceeded')"
The request binding exposes every field of the incoming chat request:
- request.model: the model string the client sent (e.g. mock/billing)
- request.messages: the list of messages; request.messages.last() is the most recent
- request.messages.last().textContent(): concatenated text content of every block in a message, joining text blocks with a space and skipping image blocks. Use this instead of walking the content list by hand.
- request.temperature: the temperature the caller sent (may be null)
- request.maxTokens: the max_tokens the caller sent (may be null)
- request.stream: boolean, true if the caller requested SSE streaming
- request.responseFormat: the response_format object, if any
- request.metadata: the request metadata map (may be null). Commonly carries tenant_id for tenant-keyed matchers, e.g. when: "request.metadata?.tenant_id == 'tenant-a'".
Matcher precedence is file-backed scenarios (step 4) → YAML matchers (this step) → response (default). The first matcher whose when returns true wins; once a matcher fires, lower-precedence matchers are skipped.
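The precedence rule can be modelled in a few lines. This is a simplified Python sketch, not the gateway's implementation: the `Matcher` type, the `resolve` function, and the flat request dict with `text` and `stream` keys are all hypothetical stand-ins for the Groovy request binding.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Matcher:
    name: str
    when: Callable[[dict], bool]
    respond: Callable[[dict], str]


def resolve(request: dict, file_scenarios: List[Matcher],
            yaml_matchers: List[Matcher], default: str) -> Tuple[Optional[str], str]:
    """Return (matcher name or None, response text) for a request."""
    # File-backed scenarios are consulted before YAML matchers.
    for matcher in [*file_scenarios, *yaml_matchers]:
        if matcher.when(request):
            return matcher.name, matcher.respond(request)  # first match wins
    return None, default  # nothing matched: fall through to the default


billing = Matcher(
    "billing-question",
    lambda r: "billing" in r["text"].lower(),
    lambda r: "Your current balance is $42.50. Anything else?",
)
streaming = Matcher(
    "streaming-smoke-test",
    lambda r: r["stream"],
    lambda r: "This response will be streamed token by token",
)
```

A streaming request that mentions billing matches both predicates, but only `billing-question` fires because it is declared first.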
Failure modes:
- A matcher whose when predicate has invalid Groovy syntax aborts gateway startup with a clear error pointing at the offending matcher's name and the parse error.
- A matcher whose when predicate throws at runtime (NPE, division by zero, an explicit throw inside the script) returns PROVIDER_ERROR to the caller with the exception message, so testers see the failure rather than silently falling through.
- A response: "groovy: ..." script that throws follows the same pattern.
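The runtime failure modes differ from a plain non-match: an exception stops evaluation instead of moving on to the next matcher. The Python sketch below models that distinction under simplifying assumptions; `evaluate` and the tuple-based matcher representation are hypothetical, and the error payload is reduced to a status string plus the exception message.

```python
def evaluate(matchers, request, default):
    """Return ("ok", text) or ("PROVIDER_ERROR", exception message).

    `matchers` is an ordered list of (name, when, respond) tuples.
    """
    for name, when, respond in matchers:
        try:
            if when(request):
                return "ok", respond(request)
        except Exception as exc:
            # A throwing predicate or response script surfaces as an error
            # instead of silently falling through to later matchers.
            return "PROVIDER_ERROR", str(exc)
    return "ok", default


def quota_exceeded(request):
    raise RuntimeError("quota exceeded")


matchers = [
    ("quota-exceeded-simulation", lambda r: r["model"] == "mock/quota-exceeded", quota_exceeded),
    ("catch-all", lambda r: True, lambda r: "matched later"),
]
```

With this list, a request for `mock/quota-exceeded` yields the error even though `catch-all` would have matched.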
4. File-backed scenarios with hot reload
Inline YAML matchers are great for one-liners but painful for anything serious — multi-line Groovy is awkward in YAML strings, scenarios can't be unit-tested in isolation, and there's no IDE syntax highlighting. For that, mount a directory of .groovy scenario files alongside your config and point DVARA at it.
Step 1 — lay out the directory. Create a mock-scenarios/ folder next to your compose file with one file per scenario:
mock-scenarios/
├── 01-billing-question.groovy
├── 02-streaming-test.groovy
└── 03-error-injection.groovy
Files are loaded in lexicographic filename order, so prefix them with 01-, 02-, 03- to control which matcher wins when multiple predicates would match the same request.
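Zero-padding the prefix matters because lexicographic order compares characters, not numbers. The short Python sketch below illustrates this, assuming the gateway's ordering behaves like a plain string sort on ASCII filenames; `load_order` is a hypothetical helper.

```python
def load_order(filenames):
    """Return the .groovy files in plain lexicographic (string) order."""
    return sorted(name for name in filenames if name.endswith(".groovy"))


unpadded = load_order([
    "2-streaming-test.groovy",
    "10-error-injection.groovy",
    "1-billing-question.groovy",
])
# "10-..." sorts before "2-..." because "1" < "2" as characters.

padded = load_order([
    "02-streaming-test.groovy",
    "10-error-injection.groovy",
    "01-billing-question.groovy",
])
# With two-digit prefixes the string order matches the numeric order.
```

Picking a prefix width up front (two or three digits) avoids renaming files once a directory grows past nine scenarios.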
Step 2 — write each scenario as a plain Groovy script that sets three top-level variables:
// mock-scenarios/01-billing-question.groovy
name = 'billing-question'
when = { request ->
    request.messages.last().textContent().toLowerCase().contains('billing')
}
respond = { request ->
    'Your current balance is $42.50. Anything else?'
}
name is a human-readable label used in logs, audit events, and Prometheus metrics. when is a closure that takes the request and returns true to match. respond is a closure that takes the request and returns the response body as a string. Both closures see the same request binding documented in step 3.
If name is missing, DVARA falls back to the filename without the .groovy extension (so 01-billing-question.groovy becomes 01-billing-question). Missing when or respond aborts startup with a clear error pointing at the offending file.
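These loading rules are easy to mirror in a lint step that runs before scenarios are committed. The Python sketch below is a hypothetical checker, not part of DVARA: it takes the set of top-level variables a script defines (however you obtain that) and applies the name fallback and the required-variable check described above.

```python
from pathlib import PurePosixPath


def scenario_name(path: str, declared_name=None) -> str:
    """Use the declared name, else the filename without its extension."""
    if declared_name:
        return declared_name
    return PurePosixPath(path).stem


def check_required(path: str, defined_variables) -> None:
    """Raise if a scenario file lacks `when` or `respond`."""
    missing = [v for v in ("when", "respond") if v not in defined_variables]
    if missing:
        raise ValueError(f"{path}: missing required variable(s): {', '.join(missing)}")
```

For example, `scenario_name("mock-scenarios/01-billing-question.groovy")` returns `01-billing-question`, matching the fallback described above.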
Step 3 — point DVARA at the directory. Add scenarios-dir to your mock-override.yml:
dvara:
  llm-gateway:
    providers:
      mock:
        enabled: true
        scenarios-dir: /app/config/mock-scenarios
        response: "No scenario matched"
Mount both the override file and the scenarios directory in your docker-compose.yml:
services:
  dvara-llm-proxy:
    image: ghcr.io/dvarahq/dvara/dvara-llm-gateway:latest
    environment:
      MOCK_PROVIDER_ENABLED: "true"
      SPRING_CONFIG_ADDITIONAL_LOCATION: /app/config/mock-override.yml
    volumes:
      - ./mock-override.yml:/app/config/mock-override.yml:ro
      - ./mock-scenarios:/app/config/mock-scenarios:ro
Hot reload — what actually happens. DVARA registers a filesystem watcher on the scenarios directory at startup. When you save a .groovy file (whether through vim, your IDE, or a git pull that updates the volume mount), the gateway:
- Receives the CREATE / MODIFY / DELETE event from the OS
- Debounces rapid successive events for 200 ms (so an editor that does write-and-rename only triggers one reload)
- Re-scans the entire directory and parses every .groovy file
- Atomically swaps the active scenario list on the Mock provider: in-flight requests continue using the old list; new requests see the new one
- Logs Mock scenarios reloaded from /app/config/mock-scenarios — N scenario(s) active at INFO level
- Emits a MOCK_SCENARIO_RELOADED audit event
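The debounce step is what turns a burst of filesystem events into a single reload. The Python sketch below counts how many reloads a series of event timestamps would trigger. It assumes a trailing-edge debounce (the reload fires once no new event has arrived for the whole window); the page does not say which variant DVARA uses, and `count_reloads` is a hypothetical helper.

```python
def count_reloads(event_times, window=0.2):
    """Count reloads triggered by events at the given times (in seconds).

    Models a trailing-edge debounce: each event restarts the timer, and a
    reload fires only after `window` seconds pass with no further event.
    """
    reloads = 0
    previous = None
    for t in sorted(event_times):
        if previous is not None and t - previous > window:
            reloads += 1  # the quiet gap let the pending reload fire
        previous = t
    if previous is not None:
        reloads += 1  # the final burst also fires once it goes quiet
    return reloads
```

A write-and-rename save that emits two events 50 ms apart counts as one reload, while two saves a second apart count as two.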
If any file fails to compile during the rescan, the entire reload is abandoned, the previous matcher list stays active, and an error is logged with the broken file path. A bad save never takes the Mock provider offline — you fix the file and the next save retries.
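The all-or-nothing reload can be pictured as building a complete new list off to the side and publishing it with a single reference assignment. The Python sketch below is a simplified model under that assumption; `ScenarioStore` and `compile_fn` are hypothetical, and the gateway's actual JVM implementation may differ.

```python
import threading


class ScenarioStore:
    def __init__(self, scenarios=()):
        self._active = tuple(scenarios)
        self._lock = threading.Lock()  # serializes reloads, not reads

    def snapshot(self):
        """A request grabs the current tuple once and keeps using it."""
        return self._active

    def reload(self, sources: dict, compile_fn) -> bool:
        """Compile every source; publish the new list only if all succeed."""
        with self._lock:
            try:
                # Nothing is visible to requests until the whole
                # directory has compiled into `fresh`.
                fresh = tuple(compile_fn(name, sources[name]) for name in sorted(sources))
            except Exception as exc:
                print(f"reload abandoned, previous scenarios stay active: {exc}")
                return False
            self._active = fresh  # one reference assignment is the swap
            return True
```

Because the active list is an immutable tuple, a request that took a snapshot before the swap keeps iterating the old scenarios without any locking.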
5. Interactive editing in DVARA Flightdeck
For interactive scenario development — browser-based editor, compile-check on save, synchronous reload, a Run test panel — DVARA Flightdeck ships the Mock Scenario Editor. It reads and writes the same scenarios directory the filesystem watcher observes, so scenarios authored in the editor are immediately available for git-commit and promotion through your normal deploy pipeline.
The editor is a triple opt-in by design (mock.enabled + scenarios-dir + console-authoring=true, plus the owner role on every write endpoint). See Mock Scenario Editor for the enabling flags, the editor walk-through, and the security model.
Observability
Whenever the Mock provider serves traffic, the gateway exposes three Prometheus counters on the /actuator/prometheus endpoint (authenticated — see Observability → Health Endpoints for the bearer-token model):
| Counter | Labels | Meaning |
|---|---|---|
| gateway_mock_matcher_fires_total | scenario, source (file or yaml) | Incremented every time a matcher's predicate returns true. Tells you which scenarios are firing and how often. |
| gateway_mock_matcher_fallthrough_total | (none) | Incremented every time no matcher fires and the configured default response is returned. |
| gateway_mock_scenarios_reloaded_total | (none) | Incremented on every successful reload of the scenarios directory. |
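Compute the overall matcher hit rate on a Grafana dashboard with the query below. The fires counter carries scenario and source labels and the fall-through counter does not, so each side is wrapped in sum() to make the label sets match:
sum(rate(gateway_mock_matcher_fires_total[5m]))
/
(sum(rate(gateway_mock_matcher_fires_total[5m])) + sum(rate(gateway_mock_matcher_fallthrough_total[5m])))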
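The arithmetic behind the hit rate is a simple ratio of per-second rates. The Python sketch below restates it with made-up numbers so the result is easy to check by hand; `hit_rate` is a hypothetical helper and the input values are illustrative, not measurements.

```python
def hit_rate(fires_per_sec: float, fallthrough_per_sec: float):
    """Fraction of Mock requests that were answered by a matcher."""
    total = fires_per_sec + fallthrough_per_sec
    if total == 0:
        return None  # no Mock traffic in the window, so the ratio is undefined
    return fires_per_sec / total


# Illustrative numbers: 3 matcher fires/s and 1 fall-through/s.
example = hit_rate(3.0, 1.0)
```

Here three out of every four requests hit a matcher, so `example` is 0.75. A low value usually means test traffic is not shaped the way the matchers expect.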
Audit events on the same activity:
| Event | Trigger |
|---|---|
| MOCK_MATCHER_FIRED | A matcher's predicate returned true (sampled at audit-sample-rate, default 1.0) |
| MOCK_SCENARIO_RELOADED | Scenarios directory was successfully reloaded |
| MOCK_SCENARIO_CREATED / MOCK_SCENARIO_UPDATED / MOCK_SCENARIO_DELETED | DVARA Flightdeck wrote, edited, or deleted a scenario file |
For high-throughput load tests where per-fire audit writes become a bottleneck, set dvara.llm-gateway.providers.mock.audit-sample-rate to a value below 1.0 (e.g. 0.1 for 10% sampling). The Prometheus counter still increments on every fire regardless of the sample rate, so the cumulative rate stays accurate.
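The split between "always counted" and "sometimes audited" can be sketched as follows. This is a simplified Python model, assuming each fire is sampled independently with the configured probability; `FireRecorder` is a hypothetical class, and how the gateway actually samples is not described here.

```python
import random


class FireRecorder:
    def __init__(self, audit_sample_rate: float = 1.0, rng=None):
        self.audit_sample_rate = audit_sample_rate
        self.rng = rng or random.Random()
        self.counter = 0         # stands in for gateway_mock_matcher_fires_total
        self.audit_events = []   # stands in for MOCK_MATCHER_FIRED audit writes

    def record(self, scenario: str) -> None:
        self.counter += 1  # the metric is incremented on every fire
        # random() is in [0.0, 1.0), so a rate of 1.0 audits every fire
        # and a rate of 0.0 audits none.
        if self.rng.random() < self.audit_sample_rate:
            self.audit_events.append(scenario)
```

With a 0.1 sample rate the audit log holds roughly a tenth of the fires, while the counter still reflects all of them.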