Version: 1.3.0

Mock Provider

The Mock provider returns configurable fake completions without calling any upstream API. It's useful for integration tests, CI pipelines, load tests, and local development without real API keys.

Model prefix: mock/

1. Activate

Set a single env var — that's it:

export MOCK_PROVIDER_ENABLED=true

With only this, any request against a mock/* model returns the default fake response immediately. No YAML, no mount, no restart loop — just enable and go.

Dev / CI only — Mock bundles Groovy

The Mock provider ships Groovy for scenario scripting, which means anyone who flips MOCK_PROVIDER_ENABLED=true on a production profile can execute arbitrary code inside the gateway JVM. The gateway fires a startup WARN whenever Mock is enabled outside the dev / test / ci / local / default Spring profile allow-list. Treat that warning as an alert, not a footnote — never run Mock on production traffic.

To verify the setup, send a request to any mock/* model:

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mock/any", "messages": [{"role": "user", "content": "hi"}]}'

The response carries "content": "This is a mock response", finish_reason: stop, and a synthetic token count so your cost and usage dashboards still populate.
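
A smoke test can assert on those fields directly. The sketch below assumes the body follows the OpenAI-style chat-completions layout (choices[0].message.content, choices[0].finish_reason, usage.total_tokens). This page only names the content, the finish_reason, and a token count, so the exact nesting, the SAMPLE_BODY payload, and the token numbers are illustrative assumptions.

```python
import json

# Hypothetical response body. The nesting follows the OpenAI chat-completions
# layout, which is an assumption about the gateway's output format.
SAMPLE_BODY = json.dumps({
    "choices": [{
        "message": {"role": "assistant", "content": "This is a mock response"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 1, "completion_tokens": 5, "total_tokens": 6},
})


def summarize(body: str) -> dict:
    """Pull out the fields a smoke test usually checks."""
    data = json.loads(body)
    choice = data["choices"][0]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }


summary = summarize(SAMPLE_BODY)
```

In a real test, replace SAMPLE_BODY with the text returned by the curl call above and adjust the key paths if the gateway's layout differs.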

2. Simple tuning via env vars

Every scalar field on the Mock provider can be set from an environment variable — no config file required. Useful in CI, where the whole point is "don't ship a YAML file with the pipeline":

Env var	Default	What it does
`MOCK_PROVIDER_ENABLED`	`false`	Activates the Mock provider. Required.
`DVARA_LLM_GATEWAY_PROVIDERS_MOCK_RESPONSE`	`"This is a mock response"`	Static text returned for every `mock/*` request that has no per-model override.
`DVARA_LLM_GATEWAY_PROVIDERS_MOCK_LATENCY_MS`	`100`	Simulated end-to-end latency per call, in milliseconds. Set to `0` for CI.
`DVARA_LLM_GATEWAY_PROVIDERS_MOCK_STREAM_TOKEN_DELAY_MS`	`20`	Inter-token delay when the client requests streaming. Set to `0` for CI.
`DVARA_LLM_GATEWAY_PROVIDERS_MOCK_ERROR_RATE`	`0.0`	Probability in `[0.0, 1.0]` that a given request returns `PROVIDER_ERROR`. Used for testing client retry logic.

Example — a tight CI profile with zero latency and 10% injected failures:

export MOCK_PROVIDER_ENABLED=true
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_RESPONSE="CI mock response"
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_LATENCY_MS=0
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_STREAM_TOKEN_DELAY_MS=0
export DVARA_LLM_GATEWAY_PROVIDERS_MOCK_ERROR_RATE=0.1

Error injection — error-rate: 0.1 means roughly 10% of requests return PROVIDER_ERROR. The Mock provider is excluded from circuit-breaker wrapping, so simulated errors never trip the breaker and the Mock provider stays available for the next test. That's the whole point: you get a fail-on-demand provider your test harness can rely on.
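
The client side of that test might look like the following sketch. It does not call the gateway: make_flaky_call is a hypothetical stand-in that fails with a configurable probability, and ProviderError stands in for whatever your client raises when it receives PROVIDER_ERROR. The retry policy (three attempts, no backoff) is an arbitrary choice for illustration.

```python
import random


class ProviderError(Exception):
    """Stands in for a PROVIDER_ERROR response from the gateway."""


def make_flaky_call(error_rate: float, rng: random.Random):
    """Return a fake 'send request' function that fails with the given probability."""
    def call() -> str:
        if rng.random() < error_rate:
            raise ProviderError("PROVIDER_ERROR")
        return "CI mock response"
    return call


def call_with_retry(call, max_attempts: int = 3):
    """Retry on ProviderError; return (result, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call(), attempt
        except ProviderError:
            if attempt == max_attempts:
                raise  # retry budget exhausted, surface the error


# error_rate=0.0 never fails; error_rate=1.0 fails on every attempt.
never_fail = make_flaky_call(0.0, random.Random(0))
always_fail = make_flaky_call(1.0, random.Random(0))
```

Because the breaker never opens on Mock errors, a harness like this can keep hammering the provider at any error rate without the test environment degrading between runs.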

3. Wiremock-style conditional matchers

Real test scenarios rarely depend only on the model name. A "billing question" mock should respond when the user's message contains the word "billing"; a streaming smoke test should match when stream=true; an error-injection scenario should fire only on a specific tenant. Matching on the model alone cannot express any of this.

DVARA's Mock provider supports an ordered list of wiremock-style matchers, where each matcher has a Groovy predicate (when) evaluated against the incoming request and a response (response) that's either static text or a dynamic Groovy expression. Matchers run in declaration order; the first match wins; if nothing matches, the gateway falls through to the default response.

Matchers live in a small YAML override file that the gateway loads through SPRING_CONFIG_ADDITIONAL_LOCATION. Create mock-override.yml next to your compose file:

dvara:
  llm-gateway:
    providers:
      mock:
        enabled: true
        response: "Default fake response"   # final fall-through

        matchers:
          # 1. Match on the user message text
          - name: billing-question
            when: "request.messages.last().textContent().toLowerCase().contains('billing')"
            response: "Your current balance is $42.50. Anything else?"

          # 2. Match on streaming flag
          - name: streaming-smoke-test
            when: "request.stream"
            response: "This response will be streamed token by token"

          # 3. Match on temperature with a Groovy-computed response
          - name: high-temperature-creative
            when: "request.temperature != null && request.temperature > 0.9"
            response: "groovy: ['cats', 'dogs', 'birds'].shuffled().first()"

          # 4. Inject errors for one specific model — handy for testing retry logic
          - name: quota-exceeded-simulation
            when: "request.model == 'mock/quota-exceeded'"
            response: "groovy: throw new RuntimeException('quota exceeded')"

The request binding exposes every field of the incoming chat request:

request.model — the model string the client sent (e.g. mock/billing)
request.messages — the list of messages; request.messages.last() is the most recent
request.messages.last().textContent() — concatenated text content of every block in a message, joining text blocks with a space and skipping image blocks. Use this instead of walking the content list by hand.
request.temperature — the temperature the caller sent (may be null)
request.maxTokens — the max_tokens the caller sent (may be null)
request.stream — boolean, true if the caller requested SSE streaming
request.responseFormat — the response_format object, if any
request.metadata — the request metadata map (may be null). Commonly carries tenant_id for tenant-keyed matchers — e.g. when: "request.metadata?.tenant_id == 'tenant-a'".
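
To make the textContent() behavior concrete, here is a rough Python model of the description above. It is not the gateway's code: the block shape ({"type": "text", ...} and {"type": "image_url", ...}) assumes OpenAI-style content blocks, and returning plain-string content unchanged is also an assumption.

```python
def text_content(content) -> str:
    """Approximate textContent(): join text blocks with a space, skip the rest."""
    if isinstance(content, str):
        # Assumed: a message whose content is a plain string is returned as-is.
        return content
    return " ".join(
        block["text"] for block in content if block.get("type") == "text"
    )


# Hypothetical multi-block message: two text blocks around an image block.
message_content = [
    {"type": "text", "text": "Why is my"},
    {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},
    {"type": "text", "text": "Billing total wrong?"},
]

flattened = text_content(message_content)
matches_billing = "billing" in flattened.lower()
```

This is why the billing-question matcher lowercases the result before calling contains: the match then works regardless of how the user capitalized the word.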

Matcher precedence is file-backed scenarios (step 4) → YAML matchers (this step) → response (default). The first matcher whose when returns true wins; once a matcher fires, lower-precedence matchers are skipped.
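
The precedence rule can be modeled in a few lines. The sketch below is a simplified illustration, not the gateway's implementation: Matcher and resolve are hypothetical names, and the request is reduced to a dict with only text and stream keys.

```python
from typing import Callable, NamedTuple


class Matcher(NamedTuple):
    name: str
    when: Callable[[dict], bool]
    response: str


def resolve(request: dict, file_scenarios, yaml_matchers, default: str):
    """Return (matcher_name_or_None, response) using first-match-wins."""
    # File-backed scenarios are consulted before YAML matchers.
    for matcher in [*file_scenarios, *yaml_matchers]:
        if matcher.when(request):
            return matcher.name, matcher.response
    # Nothing matched: fall through to the configured default response.
    return None, default


yaml_matchers = [
    Matcher("billing-question",
            lambda r: "billing" in r["text"].lower(),
            "Your current balance is $42.50. Anything else?"),
    Matcher("streaming-smoke-test",
            lambda r: r["stream"],
            "This response will be streamed token by token"),
]

# A streaming request that mentions billing matches both predicates;
# declaration order decides, so billing-question wins.
winner, _ = resolve({"text": "Billing help", "stream": True},
                    [], yaml_matchers, "Default fake response")
```

The practical consequence is that broad predicates (such as request.stream) belong near the end of the list, after the more specific ones.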

Failure modes:

A matcher whose when predicate has invalid Groovy syntax aborts gateway startup with a clear error pointing at the offending matcher's name and the parse error.
A matcher whose when predicate throws at runtime (NPE, division by zero, an explicit throw inside the script) returns PROVIDER_ERROR to the caller with the exception message — testers see the failure rather than silently falling through.
A response: "groovy: ..." script that throws follows the same pattern.

4. File-backed scenarios with hot reload

Inline YAML matchers are great for one-liners but painful for anything serious — multi-line Groovy is awkward in YAML strings, scenarios can't be unit-tested in isolation, and there's no IDE syntax highlighting. For that, mount a directory of .groovy scenario files alongside your config and point DVARA at it.

Step 1 — lay out the directory. Create a mock-scenarios/ folder next to your compose file with one file per scenario:

mock-scenarios/
├── 01-billing-question.groovy
├── 02-streaming-test.groovy
└── 03-error-injection.groovy

Files are loaded in lexicographic filename order, so prefix them with 01-, 02-, 03- to control which matcher wins when multiple predicates would match the same request.
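
Zero-padding the prefix matters once you pass nine scenarios. The short check below assumes the gateway's lexicographic order behaves like a plain character-by-character string comparison (Python's sorted); the page does not specify the exact comparator, and the unpadded filenames are made up for the demonstration.

```python
# Unpadded prefixes: "10-" sorts before "2-" because '1' < '2'.
unpadded = [
    "2-streaming-test.groovy",
    "10-error-injection.groovy",
    "1-billing-question.groovy",
]
unpadded_order = sorted(unpadded)

# Zero-padded prefixes sort in the intended numeric order.
padded = [
    "02-streaming-test.groovy",
    "10-error-injection.groovy",
    "01-billing-question.groovy",
]
padded_order = sorted(padded)
```

With the unpadded names, the error-injection scenario would be evaluated before the streaming test, which is probably not what the numbering was meant to express.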

Step 2 — write each scenario as a plain Groovy script that sets three top-level variables:

// mock-scenarios/01-billing-question.groovy

name = 'billing-question'

when = { request ->
    request.messages.last().textContent().toLowerCase().contains('billing')
}

respond = { request ->
    'Your current balance is $42.50. Anything else?'
}

name is a human-readable label used in logs, audit events, and Prometheus metrics. when is a closure that takes the request and returns true to match. respond is a closure that takes the request and returns the response body as a string. Both closures see the same request binding documented in step 3.

If name is missing, DVARA falls back to the filename without the .groovy extension (so 01-billing-question.groovy becomes 01-billing-question). Missing when or respond aborts startup with a clear error pointing at the offending file.

Step 3 — point DVARA at the directory. Add scenarios-dir to your mock-override.yml:

dvara:
  llm-gateway:
    providers:
      mock:
        enabled: true
        scenarios-dir: /app/config/mock-scenarios
        response: "No scenario matched"

Mount both the override file and the scenarios directory in your docker-compose.yml:

services:
  dvara-gateway:
    image: ghcr.io/dvarahq/dvara/dvara-llm-gateway:1.2.5
    environment:
      MOCK_PROVIDER_ENABLED: "true"
      SPRING_CONFIG_ADDITIONAL_LOCATION: /app/config/mock-override.yml
    volumes:
      - ./mock-override.yml:/app/config/mock-override.yml:ro
      - ./mock-scenarios:/app/config/mock-scenarios:ro

Hot reload — what actually happens. DVARA registers a filesystem watcher on the scenarios directory at startup. When you save a .groovy file (whether through vim, your IDE, or a git pull that updates the volume mount), the gateway:

Receives the CREATE/MODIFY/DELETE event from the OS
Debounces rapid successive events for 200 ms (so an editor that does write-and-rename only triggers one reload)
Re-scans the entire directory and parses every .groovy file
Atomically swaps the active scenario list on the Mock provider — in-flight requests continue using the old list; new requests see the new one
Logs Mock scenarios reloaded from /app/config/mock-scenarios — N scenario(s) active at INFO level
Emits a MOCK_SCENARIO_RELOADED audit event

If any file fails to compile during the rescan, the entire reload is abandoned, the previous matcher list stays active, and an error is logged with the broken file path. A bad save never takes the Mock provider offline — you fix the file and the next save retries.
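
The all-or-nothing swap is the key property here. The following is a minimal model of it, leaving out the filesystem watcher and the debounce. ScenarioStore and fake_compile are hypothetical names, and fake_compile only pretends to compile Groovy by checking for the word "when".

```python
import threading


class ScenarioStore:
    """Holds the active scenario list and swaps it only after a clean rescan."""

    def __init__(self, initial):
        self._active = tuple(initial)
        self._lock = threading.Lock()

    def snapshot(self):
        # A request grabs one reference and keeps using it, even if a reload
        # replaces the active list while the request is still running.
        return self._active

    def reload(self, sources: dict, compile_fn) -> bool:
        """Compile every file in filename order; swap only if all succeed."""
        try:
            fresh = tuple(compile_fn(name, sources[name]) for name in sorted(sources))
        except Exception as exc:
            # One broken file abandons the whole reload; the old list stays active.
            print(f"reload abandoned: {exc}")
            return False
        with self._lock:
            self._active = fresh
        return True


def fake_compile(name: str, text: str) -> str:
    # Stand-in for Groovy compilation: reject files that lack a 'when' closure.
    if "when" not in text:
        raise ValueError(f"{name}: missing 'when'")
    return name


store = ScenarioStore(["01-billing-question.groovy"])
in_flight = store.snapshot()  # what a request that started earlier would hold
```

Building the complete replacement list before touching the active one is what lets a bad save fail safely: there is never a moment when a half-loaded list is visible to requests.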

5. Interactive editing in DVARA Flightdeck

For interactive scenario development — browser-based editor, compile-check on save, synchronous reload, a Run test panel — DVARA Flightdeck ships the Mock Scenario Editor. It reads and writes the same scenarios directory the filesystem watcher observes, so scenarios authored in the editor are immediately available for git-commit and promotion through your normal deploy pipeline.

The editor is a triple opt-in by design (mock.enabled + scenarios-dir + console-authoring=true, plus the owner role on every write endpoint). See Mock Scenario Editor for the enabling flags, the editor walk-through, and the security model.

Observability

Whenever the Mock provider serves traffic, the gateway exposes three Prometheus counters on the /actuator/prometheus endpoint (authenticated — see Observability → Health Endpoints for the bearer-token model):

Counter	Labels	Meaning
`gateway_mock_matcher_fires_total`	`scenario`, `source` (`file` or `yaml`)	Incremented every time a matcher's predicate returns `true`. Tells you which scenarios are firing and how often.
`gateway_mock_matcher_fallthrough_total`	(none)	Incremented every time no matcher fires and the configured default response is returned.
`gateway_mock_scenarios_reloaded_total`	(none)	Incremented on every successful reload of the scenarios directory.

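Compute the overall matcher hit rate on a Grafana dashboard with the query below. The sum() calls aggregate away the scenario and source labels, which the fallthrough counter does not carry, so the two series can be added and divided:

sum(rate(gateway_mock_matcher_fires_total[5m]))
  /
(sum(rate(gateway_mock_matcher_fires_total[5m])) + sum(rate(gateway_mock_matcher_fallthrough_total[5m])))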
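
As a worked check of the ratio, fires / (fires + fallthroughs), the sketch below plugs in made-up per-second rates. The numbers and the hit_rate helper are illustrative only; they are not read from a real gateway.

```python
import math


def hit_rate(fires_per_sec: float, fallthrough_per_sec: float) -> float:
    """Share of mock requests answered by a matcher rather than the default."""
    total = fires_per_sec + fallthrough_per_sec
    if total == 0:
        return math.nan  # no traffic in the window, so the ratio is undefined
    return fires_per_sec / total


# Hypothetical 5-minute rates, per scenario, summed across all scenarios.
fires_by_scenario = {"billing-question": 3.0, "streaming-smoke-test": 1.5}
fires = sum(fires_by_scenario.values())  # 4.5 matcher fires per second
fallthrough = 0.5                        # 0.5 default responses per second

rate = hit_rate(fires, fallthrough)      # 4.5 / 5.0
```

A hit rate well below what the test plan expects usually means a predicate is too narrow or an earlier matcher is shadowing it.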

Audit events on the same activity:

Event	Trigger
`MOCK_MATCHER_FIRED`	A matcher's predicate returned true (sampled at `audit-sample-rate`, default 1.0)
`MOCK_SCENARIO_RELOADED`	Scenarios directory was successfully reloaded
`MOCK_SCENARIO_CREATED` / `MOCK_SCENARIO_UPDATED` / `MOCK_SCENARIO_DELETED`	DVARA Flightdeck wrote, edited, or deleted a scenario file

For high-throughput load tests where per-fire audit writes become a bottleneck, set dvara.llm-gateway.providers.mock.audit-sample-rate to a value below 1.0 (e.g. 0.1 for 10% sampling). The Prometheus counter still increments on every fire regardless of the sample rate, so the cumulative rate stays accurate.
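
The difference between the two signals can be illustrated with a small simulation. It assumes sampling is an independent random draw per fire, which this page does not confirm; simulate is a hypothetical helper, not gateway code.

```python
import random


def simulate(fires: int, sample_rate: float, seed: int = 0):
    """Return (counter_value, audit_events_written) after the given number of fires."""
    rng = random.Random(seed)
    counter = 0
    audit_events = 0
    for _ in range(fires):
        counter += 1                      # the metric counts every fire
        if rng.random() < sample_rate:    # the audit event is written only when sampled
            audit_events += 1
    return counter, audit_events


counter, audit_events = simulate(10_000, 0.1)
```

Under that assumption, a 0.1 sample rate cuts audit writes to roughly a tenth while the counter still reports all 10,000 fires, so dashboards built on the counter need no correction factor.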
