API Reference

Base URL: http://localhost:8080

Unless noted otherwise, endpoints return Content-Type: application/json. All responses include an X-Trace-ID response header.

OpenAPI Specification

Machine-readable OpenAPI 3.x specs are generated at startup by SpringDoc.

Spec	URL	Contents
Public (JSON)	`GET /v3/api-docs/public`	All `/v1/`, `/admin/v1/`, and `/status` endpoints
Public (YAML)	`GET /v3/api-docs/public.yaml`	Same as above in YAML format
Internal (JSON)	`GET /v3/api-docs/internal`	Internal `/internal/**` endpoints only
Internal (YAML)	`GET /v3/api-docs/internal.yaml`	Same as above in YAML format

Use the spec to generate clients (OpenAPI Generator), mock servers, or detect breaking changes in CI.
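As one illustration of the CI use case, the sketch below compares two spec documents that have already been downloaded (for example, a saved baseline and a fresh copy of /v3/api-docs/public) and lists operations that have disappeared. The helper name removed_operations is hypothetical, and a removed operation is only one kind of breaking change; dedicated OpenAPI diff tools check far more.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def removed_operations(old_spec: dict, new_spec: dict) -> list:
    """Return 'METHOD /path' entries present in old_spec but missing in new_spec."""
    removed = []
    new_paths = new_spec.get("paths", {})
    for path, old_item in old_spec.get("paths", {}).items():
        new_item = new_paths.get(path, {})
        for method in old_item:
            # Path items may also hold non-operation keys such as "parameters".
            if method.lower() in HTTP_METHODS and method not in new_item:
                removed.append(f"{method.upper()} {path}")
    return sorted(removed)


baseline = {"paths": {"/v1/models": {"get": {}}, "/v1/embeddings": {"post": {}}}}
candidate = {"paths": {"/v1/models": {"get": {}}}}
print(removed_operations(baseline, candidate))  # ['POST /v1/embeddings']
```

A CI job could fail the build whenever the returned list is non-empty.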

Common Headers

Header	Direction	Description
`X-Trace-ID`	Request	Optional. If provided, echoed back in the response.
`X-Trace-ID`	Response	Always present. Inbound value or generated 32-character hex UUID.
`X-Cache`	Response	`HIT` or `MISS` (only on non-streaming requests when cache is enabled)
`X-Cache-Control`	Request	`no-cache` to bypass the response cache
`Authorization`	Request	`Bearer <api-key>`. Identifies the API key used for per-key rate limiting.
`X-Budget-Remaining-Tokens`	Response	Estimated tokens remaining in the active budget period. Omitted when no budget is configured. (Enterprise)
`X-Budget-Remaining-Pct`	Response	Percentage of budget remaining (0–100). Omitted when no budget is configured. (Enterprise)
`X-Budget-Warning`	Response	`true` when a `WARN_AGENT` policy rule has fired (budget utilization exceeded threshold). Omitted otherwise. (Enterprise)
`X-Context-Window-Warning`	Response	`true` when estimated tokens exceed the context window warning threshold. (Enterprise)
`X-Context-Window-Utilization`	Response	Context window utilization percentage (e.g., `87%`). Present when warning threshold breached. (Enterprise)
`X-Gateway-Strict-Downgraded`	Response	`true` when `json_schema` strict mode is downgraded (Anthropic/Bedrock).
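The headers above can be handled on the client side roughly as follows. This is a minimal sketch: the function names are hypothetical, and it assumes X-Budget-Remaining-Pct arrives as a plain number, tolerating a trailing % sign in case it is formatted like X-Context-Window-Utilization.

```python
import uuid


def new_trace_id() -> str:
    """Generate a 32-character hex ID to send as X-Trace-ID."""
    return uuid.uuid4().hex


def request_headers(api_key: str, bypass_cache: bool = False) -> dict:
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        "X-Trace-ID": new_trace_id(),
    }
    if bypass_cache:
        headers["X-Cache-Control"] = "no-cache"
    return headers


def summarize_response_headers(headers: dict) -> dict:
    """Pick out gateway headers; optional ones map to None when absent."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    pct = h.get("x-budget-remaining-pct")
    return {
        "trace_id": h.get("x-trace-id"),
        "cache_hit": (h["x-cache"] == "HIT") if "x-cache" in h else None,
        # Assumption: the value is numeric, possibly with a trailing "%".
        "budget_remaining_pct": float(pct.rstrip("%")) if pct is not None else None,
        "budget_warning": h.get("x-budget-warning") == "true",
    }
```

Sending a client-generated X-Trace-ID makes it possible to correlate a request with gateway logs before the response arrives.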

POST /v1/chat/completions

OpenAI-compatible chat completions. Supports all configured providers.

Request

POST /v1/chat/completions
Content-Type: application/json

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 512
}

Request Fields

Field	Type	Required	Description
`model`	string	yes	Model ID. Must match a configured provider's prefix.
`messages`	array	yes	Non-empty list of `{role, content}` objects.
`temperature`	number	no	Sampling temperature (0 to 2).
`max_tokens`	integer	no	Maximum tokens to generate.
`stream`	boolean	no	If `true`, returns Server-Sent Events stream.
`response_format`	object	no	Response format constraint. See Structured Outputs.
`tools`	array	no	Tool definitions (passed through to provider).
`tool_choice`	string/object	no	Tool choice control.
`top_p`	number	no	Nucleus sampling parameter.
`frequency_penalty`	number	no	Frequency penalty (-2.0 to 2.0).
`presence_penalty`	number	no	Presence penalty (-2.0 to 2.0).
`n`	integer	no	Number of completions to generate.
`user`	string	no	End-user identifier.

Message Object

Field	Type	Required	Description
`role`	string	yes	`system`, `user`, or `assistant`
`content`	string or array	yes	Text string or array of content parts (for multimodal)
`tool_calls`	array	no	Tool calls from assistant
`tool_call_id`	string	no	ID referencing a tool call
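The required-field rules in the tables above can be checked on the client before a request is sent. The sketch below is illustrative only: build_chat_request is a hypothetical helper, and the gateway performs its own validation regardless.

```python
import json


def build_chat_request(model: str, messages: list, **options) -> str:
    """Serialize a chat completion body after basic client-side checks."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must be a non-empty list")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs role and content")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    body = {"model": model, "messages": messages}
    # Optional fields are only sent when the caller supplied a value.
    body.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(body)
```

For example, build_chat_request("gpt-4o", [{"role": "user", "content": "Hi"}], max_tokens=512) yields a body suitable for POST /v1/chat/completions.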

Response (200 OK)

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1771667816,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}

Streaming Response

When "stream": true, the response is sent as Server-Sent Events:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}

data: [DONE]
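A client can reassemble the streamed text by reading each data: line, stopping at [DONE], and concatenating the delta.content fragments. The following sketch works on any iterable of already-decoded lines; the function name is hypothetical, and it only follows the choice at index 0.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join delta.content fragments from SSE data lines until [DONE]."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # this sketch ignores additional choices (n > 1)
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Applied to the two chunks shown above, the function returns "The capital".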

POST /v1/embeddings

Returns embedding vectors for the given input. Requires an OpenAI-compatible embedding provider (OPENAI_API_KEY).

Request

POST /v1/embeddings
Content-Type: application/json

{
  "model": "text-embedding-ada-002",
  "input": "The quick brown fox"
}

Request Fields

Field	Type	Required	Description
`model`	string	yes	Embedding model ID (must start with `text-embedding`)
`input`	string or array	yes	Text to embed. String or array of strings.

Response (200 OK)

{
  "object": "list",
  "model": "text-embedding-ada-002",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}

GET /v1/models

Lists all available models from registered providers, each with a capabilities object.

Request

GET /v1/models

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1704067200,
      "owned_by": "openai",
      "capabilities": {
        "supports_streaming": true,
        "supports_vision": true,
        "supports_tool_calls": true,
        "supports_structured_outputs": true,
        "supports_json_mode": true,
        "max_context_tokens": 128000
      }
    },
    {
      "id": "claude-sonnet-4-5",
      "object": "model",
      "created": 1704067200,
      "owned_by": "anthropic",
      "capabilities": {
        "supports_streaming": true,
        "supports_vision": true,
        "supports_tool_calls": true,
        "supports_structured_outputs": true,
        "supports_json_mode": true,
        "max_context_tokens": 200000
      }
    }
  ]
}

Capability Fields

Field	Type	Description
`supports_streaming`	boolean	Provider supports SSE streaming
`supports_vision`	boolean	Provider supports image/vision inputs
`supports_tool_calls`	boolean	Provider supports function/tool calling
`supports_structured_outputs`	boolean	Provider supports structured output schemas
`supports_json_mode`	boolean	Provider supports JSON-mode responses
`max_context_tokens`	integer	Maximum context window size in tokens
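The capabilities object makes it straightforward to pick models that satisfy a feature requirement. The sketch below filters a parsed GET /v1/models response; models_with is a hypothetical helper name.

```python
def models_with(models_response: dict, *required: str, min_context: int = 0) -> list:
    """Return IDs of models whose capabilities include every required flag."""
    ids = []
    for model in models_response.get("data", []):
        caps = model.get("capabilities", {})
        has_flags = all(caps.get(flag) is True for flag in required)
        if has_flags and caps.get("max_context_tokens", 0) >= min_context:
            ids.append(model["id"])
    return ids


sample = {
    "object": "list",
    "data": [
        {"id": "gpt-4o", "capabilities": {"supports_vision": True, "max_context_tokens": 128000}},
        {"id": "claude-sonnet-4-5", "capabilities": {"supports_vision": True, "max_context_tokens": 200000}},
    ],
}
print(models_with(sample, "supports_vision", min_context=150000))  # ['claude-sonnet-4-5']
```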

POST /v1/completions (Legacy)

Legacy text-completion endpoint. Internally converts the prompt into a chat message and routes through the same pipeline.

Request

POST /v1/completions
Content-Type: application/json

{
  "model": "gpt-3.5-turbo-instruct",
  "prompt": "Say this is a test",
  "max_tokens": 64
}

Request Fields

Field	Type	Required	Description
`model`	string	yes	Model ID
`prompt`	string or array	yes	Text prompt
`max_tokens`	integer	no	Maximum tokens to generate
`temperature`	number	no	Sampling temperature

Response (200 OK)

{
  "id": "cmpl-xyz789",
  "object": "text_completion",
  "created": 1771667816,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": " This is indeed a test.",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 7,
    "total_tokens": 13
  }
}

GET /

Redirects to /try.

GET /

302 Found
Location: /try

GET /try

Built-in browser-based chat test panel. Provides a model selector, message input, SSE streaming, response metadata (model, provider, latency, token count), and a "Copy as curl" button.

GET /try

Returns 200 OK with Content-Type: text/html.

GET /status

Returns gateway status as JSON with provider, route, and configuration information.

GET /status
Accept: application/json

{
  "status": "running",
  "mode": "standalone",
  "version": "dev",
  "uptimeSeconds": 120,
  "configVersion": 0,
  "providers": [
    {
      "name": "mock",
      "type": "MockProvider",
      "health": "HEALTHY",
      "capabilities": {
        "streaming": true,
        "vision": false,
        "toolCalls": false,
        "structuredOutputs": true,
        "jsonMode": true,
        "maxContextTokens": 128000
      }
    }
  ],
  "routes": [],
  "region": {
    "id": null,
    "name": null,
    "regionAware": false
  },
  "rateLimits": {
    "enabled": false,
    "globalRequestsPerSecond": 100,
    "defaultPerKeyRequestsPerSecond": 10,
    "defaultPerKeyTokensPerMinute": 100000
  },
  "warnings": ["No routes configured (using default model-prefix routing)"]
}

GET /actuator/gateway-status

Custom Actuator endpoint returning the same structured status as /status. Exposed via Spring Boot Actuator at /actuator/gateway-status.

GET /actuator/gateway-status

Returns the same JSON structure as GET /status.

POST /admin/v1/tenants

Creates a new tenant.

Request

POST /admin/v1/tenants
Content-Type: application/json

{
  "name": "Acme Corp",
  "status": "active",
  "region": "us-east-1",
  "metadata": {"plan": "enterprise"}
}

Request Fields

Field	Type	Required	Description
`name`	string	yes	Tenant display name
`status`	string	no	`active` (default) or `suspended`
`region`	string	no	Deployment region (e.g., `us-east-1`)
`metadata`	object	no	Arbitrary key-value metadata

Response (201 Created)

Includes a Location header with the new tenant's URL.

Location: /admin/v1/tenants/{id}

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "Acme Corp",
  "status": "active",
  "region": "us-east-1",
  "metadata": {"plan": "enterprise"},
  "created_at": "2026-01-15T10:30:00Z",
  "updated_at": "2026-01-15T10:30:00Z"
}

GET /admin/v1/tenants

Lists all tenants.

Request

GET /admin/v1/tenants

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "Acme Corp",
      "status": "active",
      "region": "us-east-1",
      "metadata": {"plan": "enterprise"},
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ]
}

GET /admin/v1/tenants/{id}

Retrieves a single tenant by ID.

Request

GET /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890

Response (200 OK)

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "Acme Corp",
  "status": "active",
  "region": "us-east-1",
  "metadata": {"plan": "enterprise"},
  "created_at": "2026-01-15T10:30:00Z",
  "updated_at": "2026-01-15T10:30:00Z"
}

Response (404 Not Found)

Returned when the tenant ID does not exist.

PUT /admin/v1/tenants/{id}

Updates an existing tenant. Only provided fields are updated.

Request

PUT /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890
Content-Type: application/json

{
  "name": "Acme Inc",
  "status": "suspended"
}

Request Fields

Field	Type	Required	Description
`name`	string	no	Updated tenant name
`status`	string	no	`active` or `suspended`
`region`	string	no	Updated region
`metadata`	object	no	Replacement metadata

Response (200 OK)

Returns the updated tenant with a refreshed updated_at timestamp.

Response (404 Not Found)

Returned when the tenant ID does not exist.

DELETE /admin/v1/tenants/{id}

Deletes a tenant.

Request

DELETE /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890

Response (204 No Content)

Returned on successful deletion. No response body.

Response (404 Not Found)

Returned when the tenant ID does not exist.

POST /admin/v1/tenants/{tenantId}/keys

Creates a new API key for the specified tenant.

Request

POST /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890/keys
Content-Type: application/json

{
  "name": "production-key",
  "scopes": ["completions:write"],
  "expires_at": "2027-01-01T00:00:00Z"
}

Request Fields

Field	Type	Required	Description
`name`	string	yes	Display name for the API key
`scopes`	array	no	Permission scopes. Defaults to `["completions:write"]`
`expires_at`	string	no	ISO-8601 expiration timestamp. `null` for non-expiring

Response (201 Created)

Includes a Location header with the new key's URL. The key field contains the plaintext API key — this is the only time it is returned.

Location: /admin/v1/tenants/{tenantId}/keys/{id}

{
  "id": "f7e6d5c4-b3a2-1098-7654-321fedcba098",
  "key": "gw_a3b4c5d6e7f8091011121314151617181920212223",
  "key_prefix": "gw_a3b4c5d6e",
  "name": "production-key",
  "scopes": ["completions:write"],
  "status": "active",
  "created_at": "2026-01-15T10:30:00Z"
}

Response (404 Not Found)

Returned when the tenant ID does not exist.

GET /admin/v1/tenants/{tenantId}/keys

Lists all API keys for a tenant. Plaintext keys are never returned.

Request

GET /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890/keys

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "f7e6d5c4-b3a2-1098-7654-321fedcba098",
      "name": "production-key",
      "key_prefix": "gw_a3b4c5d6e",
      "scopes": ["completions:write"],
      "status": "active",
      "expires_at": "2027-01-01T00:00:00Z",
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ]
}
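Because list responses never include the plaintext key, key_prefix is the only field that links a locally stored key to its record. A minimal sketch of that lookup follows; find_key_record is a hypothetical helper, and a prefix match is a hint rather than proof that the key is valid.

```python
from typing import Optional


def find_key_record(plaintext_key: str, listed_keys: list) -> Optional[dict]:
    """Return the first listed key whose key_prefix starts the plaintext key."""
    for record in listed_keys:
        prefix = record.get("key_prefix", "")
        if prefix and plaintext_key.startswith(prefix):
            return record
    return None
```

This is useful, for example, to find the id needed for a revoke or rotate call when only the plaintext key is at hand.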

GET /admin/v1/tenants/{tenantId}/keys/{keyId}

Retrieves a single API key by ID.

Request

GET /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890/keys/f7e6d5c4-b3a2-1098-7654-321fedcba098

Response (200 OK)

{
  "id": "f7e6d5c4-b3a2-1098-7654-321fedcba098",
  "name": "production-key",
  "key_prefix": "gw_a3b4c5d6e",
  "scopes": ["completions:write"],
  "status": "active",
  "expires_at": "2027-01-01T00:00:00Z",
  "created_at": "2026-01-15T10:30:00Z",
  "updated_at": "2026-01-15T10:30:00Z"
}

Response (404 Not Found)

Returned when the tenant or key ID does not exist, or the key belongs to a different tenant.

DELETE /admin/v1/tenants/{tenantId}/keys/{keyId}

Revokes an API key (soft delete). The key status is set to revoked and it can no longer be used for authentication.

Request

DELETE /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890/keys/f7e6d5c4-b3a2-1098-7654-321fedcba098

Response (204 No Content)

Returned on successful revocation. No response body.

Response (404 Not Found)

Returned when the tenant or key ID does not exist.

POST /admin/v1/tenants/{tenantId}/keys/{keyId}/rotate

Rotates an API key. The old key is marked as rotated (still valid during a grace period) and a new key is generated with the same name and scopes.

Request

POST /admin/v1/tenants/a1b2c3d4-e5f6-7890-abcd-ef1234567890/keys/f7e6d5c4-b3a2-1098-7654-321fedcba098/rotate

Response (201 Created)

Returns the new key with plaintext (same as create). The old key's status changes to rotated.

{
  "id": "new-key-uuid",
  "key": "gw_newplaintextkey...",
  "key_prefix": "gw_newplaint",
  "name": "production-key",
  "scopes": ["completions:write"],
  "status": "active",
  "created_at": "2026-02-01T12:00:00Z"
}

Response (404 Not Found)

Returned when the tenant or key ID does not exist.

GET /admin/v1/providers/{id}/capabilities

Returns the capabilities of a specific registered provider. Returns 404 if the provider is not configured.

Request

GET /admin/v1/providers/openai/capabilities

Response (200 OK)

{
  "supports_streaming": true,
  "supports_vision": true,
  "supports_tool_calls": true,
  "supports_structured_outputs": true,
  "supports_json_mode": true,
  "max_context_tokens": 128000
}

Response (404 Not Found)

Returned when the provider ID is unknown or not registered.

Valid provider IDs: openai, anthropic, gemini, bedrock, ollama, mock

POST /admin/v1/routes

Creates a new versioned route configuration. Changes propagate live without restart.

Request

POST /admin/v1/routes
Content-Type: application/json

{
  "model_pattern": "gpt*",
  "strategy": "model-prefix",
  "providers": [
    {"provider": "openai", "weight": 1}
  ],
  "pinned_model_version": null,
  "latency_sla_ms": 0
}

Request Fields

Field	Type	Required	Description
`model_pattern`	string	yes	Glob pattern to match model names (e.g., `gpt*`)
`strategy`	string	no	Routing strategy: `model-prefix`, `round-robin`, `weighted`. Default: `model-prefix`
`providers`	array	yes	List of provider entries with `provider`, `weight`, `model_override`, `region`
`pinned_model_version`	string	no	Pin all requests on this route to a specific model version
`latency_sla_ms`	integer	no	Latency SLA in milliseconds for cost-aware routing. Default: `0` (disabled). When set, cost-aware routing selects the cheapest provider that meets this latency target.

Response (201 Created)

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "model_pattern": "gpt*",
  "strategy": "model-prefix",
  "providers": [{"provider": "openai", "weight": 1}],
  "latency_sla_ms": 0,
  "version": 1,
  "created_at": "2026-01-01T00:00:00Z",
  "updated_at": "2026-01-01T00:00:00Z"
}
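To make the route fields concrete, the sketch below shows one way a model name could be matched against model_pattern and a provider chosen by weight. It is not the gateway's implementation: it assumes shell-style glob semantics (Python's fnmatch), first-match precedence among routes, and proportional random selection for the weighted strategy.

```python
import fnmatch
import random
from typing import Optional


def match_route(model: str, routes: list) -> Optional[dict]:
    """Return the first route whose model_pattern matches the model name."""
    for route in routes:
        if fnmatch.fnmatchcase(model, route["model_pattern"]):
            return route
    return None


def pick_weighted(providers: list, rng: random.Random) -> str:
    """Choose a provider name with probability proportional to its weight."""
    names = [p["provider"] for p in providers]
    weights = [p.get("weight", 1) for p in providers]
    return rng.choices(names, weights=weights, k=1)[0]
```

Passing a seeded random.Random instance keeps the selection reproducible in tests.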

GET /admin/v1/routes

Lists all route configurations.

Request

GET /admin/v1/routes

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "model_pattern": "gpt*",
      "strategy": "model-prefix",
      "providers": [{"provider": "openai", "weight": 1}],
      "latency_sla_ms": 0,
      "version": 1,
      "created_at": "2026-01-01T00:00:00Z",
      "updated_at": "2026-01-01T00:00:00Z"
    }
  ]
}

GET /admin/v1/routes/{id}

Returns a specific route configuration.

Request

GET /admin/v1/routes/a1b2c3d4-e5f6-7890-abcd-ef1234567890

Response (200 OK)

Same format as the route object in the list response.

Response (404 Not Found)

Returned when the route ID does not exist.

PUT /admin/v1/routes/{id}

Updates a route configuration. Each update increments the version number. Changes propagate live.

Request

PUT /admin/v1/routes/a1b2c3d4-e5f6-7890-abcd-ef1234567890
Content-Type: application/json

{
  "model_pattern": "claude*",
  "strategy": "round-robin",
  "providers": [
    {"provider": "anthropic", "weight": 1},
    {"provider": "bedrock", "weight": 1}
  ]
}

All fields are optional — only provided fields are updated.

Response (200 OK)

Returns the updated route with incremented version.

DELETE /admin/v1/routes/{id}

Deletes a route and its version history. Changes propagate live.

Request

DELETE /admin/v1/routes/a1b2c3d4-e5f6-7890-abcd-ef1234567890

Response (204 No Content)

Response (404 Not Found)

Returned when the route ID does not exist.

GET /admin/v1/routes/{id}/versions

Returns the version history for a route. At least the 10 most recent versions are retained.

Request

GET /admin/v1/routes/a1b2c3d4-e5f6-7890-abcd-ef1234567890/versions

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "route_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "version": 2,
      "model_pattern": "claude*",
      "strategy": "round-robin",
      "providers": [{"provider": "anthropic", "weight": 1}],
      "created_at": "2026-01-02T00:00:00Z"
    },
    {
      "route_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "version": 1,
      "model_pattern": "gpt*",
      "strategy": "model-prefix",
      "providers": [{"provider": "openai", "weight": 1}],
      "created_at": "2026-01-01T00:00:00Z"
    }
  ]
}

POST /admin/v1/routes/{id}/rollback

Rolls back a route to a specified version. Creates a new version with the rolled-back configuration. Changes propagate live.

Request

POST /admin/v1/routes/a1b2c3d4-e5f6-7890-abcd-ef1234567890/rollback
Content-Type: application/json

{
  "version": 1
}

Response (200 OK)

Returns the route with rolled-back configuration and incremented version.

Response (404 Not Found)

Returned when the route ID or the specified version does not exist.
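Rollback does not rewind the version counter: it appends a new version whose configuration is copied from the target. The in-memory model below illustrates that behavior under stated assumptions; RouteHistory is a hypothetical class, not part of the gateway.

```python
import copy


class RouteHistory:
    """Toy model of a route's append-only version history."""

    def __init__(self, config: dict):
        self.versions = [{**copy.deepcopy(config), "version": 1}]

    @property
    def current(self) -> dict:
        return self.versions[-1]

    def update(self, **changes) -> dict:
        new = {**copy.deepcopy(self.current), **changes}
        new["version"] = self.current["version"] + 1
        self.versions.append(new)
        return new

    def rollback(self, version: int) -> dict:
        target = next((v for v in self.versions if v["version"] == version), None)
        if target is None:
            raise KeyError(f"version {version} not found")  # the API answers 404 here
        restored = {k: v for k, v in target.items() if k != "version"}
        return self.update(**copy.deepcopy(restored))
```

Starting at version 1, one update followed by a rollback to version 1 leaves the route at version 3 with the version 1 configuration.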

GET /admin/v1/latency

Returns the current EWMA latency for all tracked provider+model pairs. Available when the enterprise LatencyTracker is active; without an enterprise license, returns an empty list.

Request

GET /admin/v1/latency

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "provider": "openai",
      "model": "gpt-4o",
      "ewmaLatencyMs": 120.5,
      "rawLatencyMs": 115.0,
      "sampleCount": 42,
      "lastUpdated": "2026-01-01T00:00:00Z"
    },
    {
      "provider": "anthropic",
      "model": "claude-sonnet-4-5-20250929",
      "ewmaLatencyMs": 95.3,
      "rawLatencyMs": 88.0,
      "sampleCount": 128,
      "lastUpdated": "2026-01-01T00:01:00Z"
    }
  ]
}

Field	Type	Description
`provider`	string	Provider name
`model`	string	Model name
`ewmaLatencyMs`	number	Exponentially weighted moving average (EWMA) latency in milliseconds
`rawLatencyMs`	number	Most recent raw latency sample in milliseconds
`sampleCount`	integer	Total number of latency samples recorded
`lastUpdated`	string	ISO-8601 timestamp of the last recorded sample
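For readers unfamiliar with EWMA, the standard update is new = alpha * sample + (1 - alpha) * previous. The sketch below applies it to fields that mirror the response; the smoothing factor of 0.2 and seeding the average with the first sample are assumptions, since the gateway's actual parameters are not documented here.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 0.2  # hypothetical smoothing factor


@dataclass
class LatencyStat:
    ewma_latency_ms: Optional[float] = None
    raw_latency_ms: Optional[float] = None
    sample_count: int = 0

    def record(self, sample_ms: float, alpha: float = ALPHA) -> None:
        if self.ewma_latency_ms is None:
            self.ewma_latency_ms = sample_ms  # first sample seeds the average
        else:
            self.ewma_latency_ms = alpha * sample_ms + (1 - alpha) * self.ewma_latency_ms
        self.raw_latency_ms = sample_ms
        self.sample_count += 1
```

A larger alpha reacts faster to latency spikes; a smaller one smooths them out.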

GET /admin/v1/priority/stats

Returns current priority admission control stats including per-tier concurrency and threshold configuration.

Request

GET /admin/v1/priority/stats

Response (200 OK)

{
  "object": "priority_stats",
  "data": {
    "currentConcurrent": 17,
    "maxConcurrent": 1000,
    "perTierConcurrent": {
      "premium": 5,
      "standard": 10,
      "bulk": 2
    },
    "perTierThresholdPct": {
      "premium": 100,
      "standard": 80,
      "bulk": 50
    }
  }
}

Field	Type	Description
`currentConcurrent`	integer	Total in-flight requests across all tiers
`maxConcurrent`	integer	Configured maximum concurrent requests (`0` without an enterprise license)
`perTierConcurrent`	object	Current in-flight count per priority tier
`perTierThresholdPct`	object	Configured throttle threshold percentage per tier

POST /admin/v1/policies

Creates a new policy in DRAFT status.

Request

POST /admin/v1/policies
Content-Type: application/json

{
  "name": "block-legacy-models",
  "tenant_id": "acme-corp",
  "description": "Block legacy models for production tenants",
  "dsl": "version: \"1\"\nrules:\n  - id: block-legacy\n    conditions:\n      model:\n        denylist: [gpt-3.5-turbo]\n    action: DENY\n    deny_message: \"Legacy model not allowed\""
}

Response (201 Created)

{
  "id": "pol-abc123",
  "tenant_id": "acme-corp",
  "name": "block-legacy-models",
  "description": "Block legacy models for production tenants",
  "dsl": "...",
  "status": "DRAFT",
  "version": 1,
  "created_at": "2026-01-15T10:00:00Z",
  "updated_at": "2026-01-15T10:00:00Z"
}

GET /admin/v1/policies

Lists all policies. Optional ?tenant_id= filter.

GET /admin/v1/policies/{id}

Returns a single policy by ID. Returns 404 if not found.

PUT /admin/v1/policies/{id}

Updates a policy. Each update increments the version. Only provided fields are changed.

DELETE /admin/v1/policies/{id}

Deletes a policy and its version history. Returns 204.

POST /admin/v1/policies/{id}/status

Changes policy status. Valid values: DRAFT, ACTIVE, SHADOW, ARCHIVED. When transitioning to ACTIVE, conflict detection runs against other active policies and returns warnings if conflicts are found.

Request

POST /admin/v1/policies/pol-abc123/status
Content-Type: application/json

{"status": "SHADOW"}

GET /admin/v1/policies/{id}/versions

Returns the version history for a policy. At least the 10 most recent versions are retained.

POST /admin/v1/policies/{id}/rollback

Rolls back a policy to a specified version. Creates a new version with the restored configuration.

Request

{"version": 1}

POST /admin/v1/policies/dry-run

Simulates a policy evaluation. Validates YAML syntax and evaluates through the PolicyEngine.

Request

{
  "dsl": "version: \"1\"\nrules:\n  - id: test\n    conditions:\n      model:\n        denylist: [gpt-3.5-turbo]\n    action: DENY",
  "tenant_id": "acme-corp",
  "model": "gpt-3.5-turbo"
}

Response (200 OK)

{
  "status": "DENIED",
  "reason": "Request blocked by policy rule: test",
  "rule_id": "test",
  "evaluation_time_ms": 2
}
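The dsl field carries a multi-line YAML document inside a JSON string, so newlines and quotes must be escaped. Letting a JSON serializer do the escaping avoids hand-written \n sequences. The sketch below builds the dry-run body shown above; dry_run_body is a hypothetical helper.

```python
import json

# The policy is written as ordinary lines; indentation is significant in YAML.
POLICY_DSL = "\n".join([
    'version: "1"',
    "rules:",
    "  - id: test",
    "    conditions:",
    "      model:",
    "        denylist: [gpt-3.5-turbo]",
    "    action: DENY",
])


def dry_run_body(dsl: str, tenant_id: str, model: str) -> str:
    """Serialize a dry-run request; json.dumps escapes newlines and quotes."""
    return json.dumps({"dsl": dsl, "tenant_id": tenant_id, "model": model})


body = dry_run_body(POLICY_DSL, "acme-corp", "gpt-3.5-turbo")
```

The same approach applies to the dsl field of POST /admin/v1/policies.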

POST /admin/v1/policies/{id}/promote

Promotes a SHADOW policy to ACTIVE status. Runs conflict detection, cleans up shadow events, and publishes a policy mutation event.

Requires org-admin or policy-admin role.

Request

POST /admin/v1/policies/pol-abc123/promote

Response (200 OK)

{
  "id": "pol-abc123",
  "name": "block-legacy-models",
  "status": "ACTIVE",
  "version": 3,
  "warnings": ["Potential conflict with policy pol-xyz: overlapping model denylist"]
}

Response (400 Bad Request)

Returned when the policy is not in SHADOW status.

GET /admin/v1/policies/{id}/shadow/stats

Returns shadow mode divergence statistics for a policy.

Requires org-admin, policy-admin, developer, or viewer role.

Query Parameters

Parameter	Required	Default	Description
`period`	No	`24h`	Time window: `1h`, `24h`, or `7d`

Response (200 OK)

{
  "policy_id": "pol-abc123",
  "period": "24h",
  "total_evaluations": 1250,
  "divergent_count": 47,
  "divergence_rate": 0.0376,
  "by_rule": [
    {
      "rule_id": "block-legacy",
      "count": 35,
      "divergence_type": "shadow_deny_active_allow"
    },
    {
      "rule_id": "max-tokens-limit",
      "count": 12,
      "divergence_type": "shadow_deny_active_allow"
    }
  ]
}
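In the example above, divergence_rate equals divergent_count divided by total_evaluations (47 / 1250 = 0.0376). The sketch below recomputes that ratio and applies a promotion gate. The 1% threshold and the helper name are hypothetical, and the check that by_rule counts sum to divergent_count holds for this example but is not a documented guarantee.

```python
def shadow_summary(stats: dict, max_rate: float = 0.01) -> dict:
    """Recompute the divergence rate and compare it with a chosen threshold."""
    total = stats["total_evaluations"]
    divergent = stats["divergent_count"]
    rate = divergent / total if total else 0.0
    per_rule = sum(rule["count"] for rule in stats.get("by_rule", []))
    return {
        "divergence_rate": round(rate, 4),
        "by_rule_total_matches": per_rule == divergent,
        "within_threshold": rate <= max_rate,
    }
```

A team might call POST /admin/v1/policies/{id}/promote only once within_threshold is true for the 7d period.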

GET /admin/v1/policies/{id}/shadow/events

Returns recent shadow policy events, optionally filtered to diverged-only.

Requires org-admin, policy-admin, developer, or viewer role.

Query Parameters

Parameter	Required	Default	Description
`diverged`	No	`true`	Filter to diverged events only
`limit`	No	`50`	Maximum events to return

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "evt-uuid",
      "policy_id": "pol-abc123",
      "rule_id": "block-legacy",
      "request_id": "trace-xyz",
      "tenant_id": "acme-corp",
      "model": "gpt-3.5-turbo",
      "active_decision": "ALLOW",
      "shadow_decision": "DENY",
      "diverged": true,
      "divergence_type": "shadow_deny_active_allow",
      "timestamp": "2026-02-28T14:30:00Z"
    }
  ]
}

Internal Config Distribution API

Internal endpoints for data plane pods to pull configuration from the control plane without direct database access. Protected by the X-Internal-Secret header when gateway.internal.secret is configured.

GET /internal/v1/config/version

Returns the current config version (monotonic counter).

Request

GET /internal/v1/config/version
X-Internal-Secret: <secret>

Response (200 OK)

{
  "version": 42
}

GET /internal/v1/config/full

Returns a full config snapshot (tenants, routes, API keys without key hashes).

Request

GET /internal/v1/config/full?since_version=0
X-Internal-Secret: <secret>

Parameter	Type	Required	Description
`since_version`	integer	no	Advisory version hint (full snapshot always returned in core). Default: `0`

Response (200 OK)

{
  "version": 42,
  "tenants": [
    {
      "id": "a1b2c3d4-...",
      "name": "Acme Corp",
      "status": "active",
      "region": "us-east-1",
      "metadata": {"plan": "enterprise"},
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ],
  "routes": [
    {
      "id": "r1b2c3d4-...",
      "model_pattern": "gpt*",
      "strategy": "model-prefix",
      "providers": [{"provider": "openai", "weight": 1}],
      "version": 1,
      "created_at": "2026-01-01T00:00:00Z",
      "updated_at": "2026-01-01T00:00:00Z"
    }
  ],
  "api_keys": [
    {
      "id": "k1b2c3d4-...",
      "tenant_id": "a1b2c3d4-...",
      "name": "production-key",
      "key_prefix": "gw_a3b4c5d6e",
      "scopes": ["completions:write"],
      "status": "active",
      "expires_at": "2027-01-01T00:00:00Z",
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ]
}

Note: key_hash is deliberately omitted from API key entries for security.

Response (401 Unauthorized)

Returned when gateway.internal.secret is configured and the X-Internal-Secret header is missing or invalid.
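A data plane pod would typically poll the lightweight version endpoint and fetch the full snapshot only when the version changes. The sketch below captures that decision with the two HTTP calls injected as callables, so it runs without a network; all names are hypothetical, and treating any version difference (not only an increase) as a reason to pull is a defensive assumption.

```python
from typing import Callable, Optional


def internal_headers(secret: Optional[str]) -> dict:
    """Send X-Internal-Secret only when a secret is configured."""
    return {"X-Internal-Secret": secret} if secret else {}


def sync_config(
    local_version: int,
    fetch_version: Callable[[], int],
    fetch_full: Callable[[int], dict],
) -> Optional[dict]:
    """Return a fresh snapshot when the remote version differs, else None."""
    remote_version = fetch_version()
    if remote_version == local_version:
        return None  # nothing changed; skip the larger /config/full call
    return fetch_full(local_version)  # local version is passed as since_version
```

In a real client, fetch_version would wrap GET /internal/v1/config/version and fetch_full would wrap GET /internal/v1/config/full?since_version=N.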

POST /admin/v1/pricing

Creates a new model pricing entry. Pricing entries define cost-per-million-tokens for input and output. Model names support glob patterns (e.g. gpt-4o* matches gpt-4o, gpt-4o-mini).

Request

POST /admin/v1/pricing
Content-Type: application/json

{
  "model": "gpt-4o",
  "provider": "openai",
  "inputPricePerMillion": 2.50,
  "outputPricePerMillion": 10.00
}

Request Fields

Field	Type	Required	Description
`model`	string	yes	Model name or glob pattern (e.g. `gpt-4o*`)
`provider`	string	yes	Provider name (e.g. `openai`, `anthropic`)
`inputPricePerMillion`	number	yes	USD cost per 1M input tokens (decimal)
`outputPricePerMillion`	number	yes	USD cost per 1M output tokens (decimal)

Response (201 Created)

{
  "id": "f47ac10b-...",
  "model": "gpt-4o",
  "provider": "openai",
  "inputPricePerMillion": 2.50,
  "outputPricePerMillion": 10.00,
  "effectiveDate": "2026-03-04T10:00:00Z",
  "createdAt": "2026-03-04T10:00:00Z",
  "updatedAt": "2026-03-04T10:00:00Z"
}

Includes a Location: /admin/v1/pricing/{id} header.
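Since prices are expressed per million tokens, the cost of a request is tokens * price / 1,000,000 for each direction. The sketch below resolves a pricing entry by glob pattern and computes that cost with Decimal to avoid float rounding. The function names are hypothetical, and fnmatch-style matching with first-match precedence is an assumption about how patterns are resolved.

```python
import fnmatch
from decimal import Decimal
from typing import Optional

MILLION = Decimal(1_000_000)


def find_pricing(model: str, provider: str, entries: list) -> Optional[dict]:
    """Return the first entry for the provider whose pattern matches the model."""
    for entry in entries:
        if entry["provider"] == provider and fnmatch.fnmatchcase(model, entry["model"]):
            return entry
    return None


def request_cost(entry: dict, input_tokens: int, output_tokens: int) -> dict:
    """Compute USD cost from per-million-token prices."""
    input_cost = Decimal(str(entry["inputPricePerMillion"])) * input_tokens / MILLION
    output_cost = Decimal(str(entry["outputPricePerMillion"])) * output_tokens / MILLION
    return {
        "inputCost": input_cost,
        "outputCost": output_cost,
        "totalCost": input_cost + output_cost,
    }
```

With the entry above (2.50 input, 10.00 output), 1000 input tokens and 500 output tokens cost 0.0025 + 0.005 = 0.0075 USD.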

GET /admin/v1/pricing

Lists all pricing entries. Optional ?model= and ?provider= filters.

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "f47ac10b-...",
      "model": "gpt-4o",
      "provider": "openai",
      "inputPricePerMillion": 2.50,
      "outputPricePerMillion": 10.00,
      "effectiveDate": "2026-03-04T10:00:00Z",
      "createdAt": "2026-03-04T10:00:00Z",
      "updatedAt": "2026-03-04T10:00:00Z"
    }
  ]
}

GET /admin/v1/pricing/{id}

Returns a specific pricing entry. Returns 404 with pricing_not_found error if not found.

PUT /admin/v1/pricing/{id}

Updates a pricing entry. Returns 404 if not found. Increments ConfigVersionTracker.

DELETE /admin/v1/pricing/{id}

Deletes a pricing entry. Returns 204. Returns 404 if not found.

GET /admin/v1/costs

Lists cost records. Supports filtering by tenantId, apiKey, model, and provider query parameters.

Request

GET /admin/v1/costs?tenantId=acme-corp&model=gpt-4o

Response (200 OK)

[
  {
    "id": "cost-abc123",
    "tenantId": "acme-corp",
    "apiKey": "sk-test",
    "model": "gpt-4o",
    "provider": "openai",
    "inputTokens": 1000,
    "outputTokens": 500,
    "inputCost": 0.0025,
    "outputCost": 0.005,
    "totalCost": 0.0075,
    "currency": "USD",
    "pricingId": "f47ac10b-...",
    "timestamp": "2026-03-04T10:05:00Z"
  }
]

GET /admin/v1/costs/summary

Returns aggregated cost summary. Supports filtering by tenantId, model, provider, from, and to (ISO 8601 timestamps).

Request

GET /admin/v1/costs/summary?tenantId=acme-corp&model=gpt-4o

Response (200 OK)

{
  "tenantId": "acme-corp",
  "model": "gpt-4o",
  "provider": null,
  "totalInputCost": 0.125,
  "totalOutputCost": 0.25,
  "totalCost": 0.375,
  "requestCount": 50,
  "totalInputTokens": 50000,
  "totalOutputTokens": 25000
}
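The summary fields can be reproduced on the client by folding over the records returned by GET /admin/v1/costs. This sketch assumes one cost record per request (so requestCount is the number of records) and uses Decimal for the totals; summarize_costs is a hypothetical helper.

```python
from decimal import Decimal


def summarize_costs(records: list) -> dict:
    """Aggregate cost records into the same totals the summary endpoint reports."""
    summary = {
        "totalInputCost": Decimal("0"),
        "totalOutputCost": Decimal("0"),
        "totalCost": Decimal("0"),
        "requestCount": 0,
        "totalInputTokens": 0,
        "totalOutputTokens": 0,
    }
    for record in records:
        # str() avoids carrying binary float noise into the Decimal totals.
        summary["totalInputCost"] += Decimal(str(record["inputCost"]))
        summary["totalOutputCost"] += Decimal(str(record["outputCost"]))
        summary["totalCost"] += Decimal(str(record["totalCost"]))
        summary["requestCount"] += 1
        summary["totalInputTokens"] += record["inputTokens"]
        summary["totalOutputTokens"] += record["outputTokens"]
    return summary
```

Comparing this client-side total with the endpoint's response is a simple consistency check for billing exports.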

POST /admin/v1/budgets

Creates a budget cap with soft and hard limits.

Request

POST /admin/v1/budgets
Content-Type: application/json

{
  "name": "Production Monthly",
  "tenantId": "acme-corp",
  "apiKeyId": null,
  "period": "MONTHLY",
  "limitUsd": 1000.00,
  "softLimitPct": 80,
  "enabled": true
}

Response (201 Created)

{
  "id": "b-123",
  "tenantId": "acme-corp",
  "apiKeyId": null,
  "name": "Production Monthly",
  "period": "MONTHLY",
  "limitUsd": 1000.00,
  "softLimitPct": 80,
  "enabled": true,
  "version": 1,
  "createdAt": "2026-03-04T12:00:00Z",
  "updatedAt": "2026-03-04T12:00:00Z"
}

GET /admin/v1/budgets

Lists budget caps. Optional filters: tenantId, apiKeyId.

GET /admin/v1/budgets/{id}

Returns a specific budget cap. Returns 404 (BUDGET_NOT_FOUND) if not found.

PUT /admin/v1/budgets/{id}

Updates a budget cap. Increments version. Supports partial updates (name, period, limitUsd, softLimitPct, enabled).

DELETE /admin/v1/budgets/{id}

Deletes a budget cap. Returns 204 on success, 404 if not found.

GET /admin/v1/budgets/{id}/usage

Returns current period spend vs budget limit.

Response (200 OK)

{
  "budgetId": "b-123",
  "name": "Production Monthly",
  "period": "MONTHLY",
  "limitUsd": 1000.00,
  "softLimitUsd": 800.00,
  "currentSpend": 450.00,
  "remainingUsd": 550.00,
  "utilizationPct": 45.0,
  "periodStart": "2026-03-01T00:00:00Z",
  "periodEnd": "2026-04-01T00:00:00Z"
}
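The derived fields in the usage response follow from limitUsd, softLimitPct, and the current spend: softLimitUsd = limitUsd * softLimitPct / 100, remainingUsd = limitUsd - currentSpend, and utilizationPct = currentSpend / limitUsd * 100. The sketch below reproduces them. The state labels, the inclusive comparisons, and clamping the remainder at zero are assumptions for illustration.

```python
from decimal import Decimal


def budget_usage(limit_usd, soft_limit_pct, current_spend) -> dict:
    """Derive soft limit, remaining budget, and utilization from raw inputs."""
    limit = Decimal(str(limit_usd))
    spend = Decimal(str(current_spend))
    soft_limit = limit * Decimal(str(soft_limit_pct)) / Decimal(100)
    if spend >= limit:
        state = "hard_limit"  # hypothetical label
    elif spend >= soft_limit:
        state = "soft_limit"  # hypothetical label
    else:
        state = "ok"
    return {
        "softLimitUsd": soft_limit,
        "remainingUsd": max(limit - spend, Decimal(0)),
        "utilizationPct": spend / limit * 100 if limit else Decimal(0),
        "state": state,
    }
```

For the example above, budget_usage(1000.00, 80, 450.00) gives a soft limit of 800, 550 remaining, and 45% utilization.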

POST /admin/v1/chargeback

Generates a chargeback report for the specified time period. Requires an enterprise license.

Request

{
  "tenant_id": "acme-corp",
  "from": "2026-02-01T00:00:00Z",
  "to": "2026-03-01T00:00:00Z"
}

Field	Type	Required	Description
`tenant_id`	string	no	Tenant to report on (null = all)
`from`	ISO 8601	yes	Start of reporting period
`to`	ISO 8601	yes	End of reporting period

Response (201 Created)

{
  "id": "cb-a1b2c3d4",
  "tenant_id": "acme-corp",
  "from": "2026-02-01T00:00:00Z",
  "to": "2026-03-01T00:00:00Z",
  "generated_at": "2026-03-01T12:00:00Z",
  "generated_by": "admin",
  "sections": {
    "tenantSummary": [...],
    "apiKeySummary": [...],
    "modelSummary": [...],
    "providerSummary": [...],
    "dailyBreakdown": [...],
    "forecast": { "trailing7d": [...], "trailing30d": [...] },
    "anomalies": [...]
  }
}

Response (400 Bad Request)

Returned when no enterprise license is available (CHARGEBACK_NOT_AVAILABLE).

GET /admin/v1/chargeback

Lists all chargeback reports.

Query Parameters

Parameter	Type	Required	Description
`tenant_id`	string	no	Filter by tenant ID

Response (200 OK)

{
  "object": "list",
  "data": [...]
}

GET /admin/v1/chargeback/{id}

Returns a specific chargeback report (without PDF bytes).

Response (404 Not Found)

Returned when the report ID does not exist (CHARGEBACK_NOT_FOUND).

GET /admin/v1/chargeback/{id}/pdf

Downloads a chargeback report as PDF.

Response (200 OK)

Returns Content-Type: application/pdf with Content-Disposition: attachment; filename=chargeback-report-{id}.pdf.

GET /admin/v1/chargeback/{id}/csv

Downloads a chargeback report as CSV.

Response (200 OK)

Returns Content-Type: text/csv with Content-Disposition: attachment; filename=chargeback-report-{id}.csv.

DELETE /admin/v1/chargeback/{id}

Deletes a chargeback report. Returns 204.

POST /admin/v1/schemas

Creates an output schema configuration for automatic response JSON validation. Requires enterprise license.

Request

POST /admin/v1/schemas
Content-Type: application/json

{
  "modelPattern": "gpt-4o*",
  "routeId": "gpt-route",
  "schema": {
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "age": { "type": "integer" }
    },
    "required": ["name"]
  },
  "maxRetries": 2,
  "correctionPrompt": "The response must be valid JSON matching the schema. Fix the following errors:",
  "enabled": true
}

Response: 201 Created

GET /admin/v1/schemas

Lists all output schema configurations.

Response: 200 OK — Array of schema config objects.

GET /admin/v1/schemas/{id}

Gets a specific output schema configuration.

Response: 200 OK — Schema config object.

PUT /admin/v1/schemas/{id}

Updates an output schema configuration.

Response: 200 OK — Updated schema config object.

DELETE /admin/v1/schemas/{id}

Deletes an output schema configuration. Returns 204 No Content.

GET /admin/v1/costs/forecast

Returns cost forecasts based on trailing spend trends.

Query Parameters

Parameter	Type	Required	Description
`tenant_id`	string	no	Filter by tenant ID
`trailing_days`	int	no	Trailing window (default: 7)

Response (200 OK)

[
  {
    "tenantId": "acme-corp",
    "model": "gpt-4o",
    "trailingDays": 7,
    "dailyAverage": 12.50,
    "projectedMonthEnd": 375.00,
    "trend": "increasing",
    "computedAt": "2026-03-01T12:00:00Z"
  }
]

GET /admin/v1/costs/anomalies

Returns active cost anomalies where current daily spend rate exceeds the configured threshold relative to the 30-day baseline.

Query Parameters

Parameter	Type	Required	Description
`tenant_id`	string	no	Filter by tenant ID

Response (200 OK)

[
  {
    "tenantId": "acme-corp",
    "model": "gpt-4o",
    "currentDailyRate": 5.00,
    "baselineDailyRate": 1.00,
    "deviationPct": 500.0,
    "thresholdPct": 200.0,
    "detectedAt": "2026-03-01T12:00:00Z"
  }
]
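
One plausible reading of the example is that `deviationPct` expresses the current daily rate as a percentage of the baseline, and that an anomaly is reported when it exceeds `thresholdPct`. The sketch below encodes that reading. It is an assumption based on the example values only, and both function names are hypothetical.

```python
def deviation_pct(current_daily_rate: float, baseline_daily_rate: float) -> float:
    """Current daily spend as a percentage of the baseline (assumed formula)."""
    if baseline_daily_rate <= 0:
        raise ValueError("baseline must be positive")
    return round(current_daily_rate / baseline_daily_rate * 100, 1)


def is_anomaly(current_daily_rate: float, baseline_daily_rate: float,
               threshold_pct: float) -> bool:
    """Flag an anomaly when the deviation exceeds the configured threshold."""
    return deviation_pct(current_daily_rate, baseline_daily_rate) > threshold_pct


print(deviation_pct(5.00, 1.00))      # 500.0, as in the example response
print(is_anomaly(5.00, 1.00, 200.0))  # True
```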

POST /v1/budget/estimate

Pre-request budget estimation endpoint. Returns estimated cost for a request and remaining budget information. Useful for agent orchestrators to check budget before making a request.

Request

POST /v1/budget/estimate
Content-Type: application/json

{
  "model": "gpt-4o",
  "max_tokens": 1000
}

Field	Type	Required	Description
`model`	string	yes	Model name for cost estimation
`max_tokens`	integer	no	Max tokens for cost estimation

Response (200 OK)

{
  "estimated_cost_usd": 0.005,
  "budget_remaining_usd": 70.0,
  "budget_remaining_tokens": 500000,
  "budget_remaining_pct": 70,
  "would_exceed_budget": false
}

Field	Type	Nullable	Description
`estimated_cost_usd`	double	no	Estimated cost of the request in USD
`budget_remaining_usd`	double	yes	Remaining budget in USD (null when no budget configured)
`budget_remaining_tokens`	integer	yes	Estimated tokens remaining in budget (null when no budget)
`budget_remaining_pct`	integer	yes	Percentage of budget remaining (null when no budget)
`would_exceed_budget`	boolean	no	Whether this request would exceed the remaining budget

Error Responses

Status	Error Code	Description
400	`invalid_request_error`	Missing required `model` field
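
An orchestrator might gate a request on the estimate response along these lines. This is a minimal sketch that only interprets an already-received response body; `should_send` and `describe_budget` are hypothetical helper names, and the nullable fields are handled as described in the table above.

```python
import json


def should_send(estimate: dict) -> bool:
    """Proceed only when the estimate does not flag a budget overrun."""
    return not estimate["would_exceed_budget"]


def describe_budget(estimate: dict) -> str:
    """Summarize the remaining budget, handling the no-budget (null) case."""
    if estimate["budget_remaining_usd"] is None:
        return "no budget configured"
    pct = estimate["budget_remaining_pct"]
    usd = estimate["budget_remaining_usd"]
    return f"{pct}% remaining (${usd:.2f})"


# Example response body from this section.
sample = json.loads("""
{
  "estimated_cost_usd": 0.005,
  "budget_remaining_usd": 70.0,
  "budget_remaining_tokens": 500000,
  "budget_remaining_pct": 70,
  "would_exceed_budget": false
}
""")

print(should_send(sample), describe_budget(sample))
```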

GET /admin/v1/audit/events

Lists audit events, optionally filtered by tenant, event type, and date range. Returns newest first.

Request

GET /admin/v1/audit/events?tenant_id=acme-corp&event_type=GATEWAY_RESPONSE&from=2026-01-01T00:00:00Z&to=2026-01-02T00:00:00Z

Query Parameters

Parameter	Type	Required	Description
`tenant_id`	string	no	Filter by tenant ID
`event_type`	string	no	Filter by event type (`GATEWAY_REQUEST`, `GATEWAY_RESPONSE`)
`from`	ISO 8601	no	Start of time range
`to`	ISO 8601	no	End of time range
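
Because every filter is optional, a client can build the query string from whichever filters are set. The sketch below uses only the Python standard library; `audit_events_url` is a hypothetical helper, and the base URL is the one given at the top of this reference.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # base URL from the top of this reference


def audit_events_url(tenant_id: Optional[str] = None,
                     event_type: Optional[str] = None,
                     start: Optional[str] = None,
                     end: Optional[str] = None) -> str:
    """Build the list URL, including only the filters that were provided."""
    params = {"tenant_id": tenant_id, "event_type": event_type,
              "from": start, "to": end}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/admin/v1/audit/events"
    return f"{url}?{query}" if query else url


print(audit_events_url(tenant_id="acme-corp", event_type="GATEWAY_RESPONSE"))
```

Note that `urlencode` percent-encodes the colons in ISO 8601 timestamps (`%3A`), which is equivalent to the unencoded form shown in the request example.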

Response (200 OK)

[
  {
    "eventId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "timestamp": "2026-01-15T10:30:00Z",
    "tenantId": "acme-corp",
    "eventType": "GATEWAY_RESPONSE",
    "payload": {
      "model": "gpt-4o",
      "provider": "openai",
      "status": 200,
      "latency_ms": 342,
      "api_key": "sk-prod-1...",
      "tokens_total": "235"
    }
  }
]

GET /admin/v1/audit/events/export

Downloads audit events as a CSV file. Accepts the same filter parameters as the list endpoint.

Request

GET /admin/v1/audit/events/export?tenant_id=acme-corp

Response (200 OK)

Returns Content-Type: text/csv with Content-Disposition: attachment; filename=audit-events.csv.

event_id,timestamp,tenant_id,event_type,payload
a1b2c3d4-...,2026-01-15T10:30:00Z,acme-corp,GATEWAY_RESPONSE,"{model=gpt-4o, status=200}"
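
The export can be read with a standard CSV parser. The sketch below parses the sample rows above with Python's `csv` module. Note that, in the sample, the payload column is a quoted `key=value` string rather than JSON, so the JSON export may be easier to process when structured payload access is needed.

```python
import csv
import io

# Sample export content copied from the example above.
sample = (
    "event_id,timestamp,tenant_id,event_type,payload\n"
    'a1b2c3d4-...,2026-01-15T10:30:00Z,acme-corp,GATEWAY_RESPONSE,'
    '"{model=gpt-4o, status=200}"\n'
)

# DictReader uses the header row for keys and honors the quoted payload field.
rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    print(row["timestamp"], row["event_type"], row["payload"])
```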

GET /admin/v1/audit/events/export/json

Downloads audit events as a JSON file. Accepts the same filter parameters as the list endpoint.

Request

GET /admin/v1/audit/events/export/json?tenant_id=acme-corp

Response (200 OK)

Returns Content-Type: application/json with Content-Disposition: attachment; filename=audit-events.json.

[
  {
    "eventId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "timestamp": "2026-01-15T10:30:00Z",
    "tenantId": "acme-corp",
    "eventType": "GATEWAY_RESPONSE",
    "payload": {
      "model": "gpt-4o",
      "provider": "openai",
      "status": 200
    }
  }
]

POST /admin/v1/reports

Generates a compliance report (SOC2, HIPAA, or GDPR). Requires enterprise license.

Request

POST /admin/v1/reports
Content-Type: application/json

{
  "type": "SOC2",
  "tenant_id": "acme-corp",
  "from": "2026-02-01T00:00:00Z",
  "to": "2026-03-01T00:00:00Z"
}

Field	Type	Required	Description
`type`	string	yes	Report type: `SOC2`, `HIPAA`, or `GDPR`
`tenant_id`	string	no	Tenant ID to scope report (null = all tenants)
`from`	ISO 8601	yes	Start of reporting period
`to`	ISO 8601	yes	End of reporting period

Response (201 Created)

{
  "id": "rpt-a1b2c3d4",
  "type": "SOC2",
  "tenant_id": null,
  "from": "2026-02-01T00:00:00Z",
  "to": "2026-03-01T00:00:00Z",
  "generated_at": "2026-03-01T12:00:00Z",
  "generated_by": "admin@acme.com",
  "sections": {
    "auditChainIntegrity": { "valid": true, "verifiedCount": 1542 },
    "accessControlSummary": { "authSuccessCount": 980, "authFailureCount": 12 },
    "policyEnforcementSummary": { "policyDeniedCount": 5 },
    "tokenUsageSummary": [ ... ],
    "eventCountByType": { "GATEWAY_RESPONSE": 1200, "GATEWAY_REQUEST": 1200 }
  }
}

Error (400 — unlicensed mode)

{
  "error": {
    "message": "Compliance reports require enterprise license",
    "type": "invalid_request_error",
    "code": "compliance_not_available"
  }
}

GET /admin/v1/reports

Lists all generated compliance reports.

Request

GET /admin/v1/reports
GET /admin/v1/reports?tenant_id=acme-corp

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "rpt-a1b2c3d4",
      "type": "SOC2",
      "tenant_id": null,
      "from": "2026-02-01T00:00:00Z",
      "to": "2026-03-01T00:00:00Z",
      "generated_at": "2026-03-01T12:00:00Z",
      "generated_by": "admin@acme.com"
    }
  ]
}

GET /admin/v1/reports/{id}

Returns a specific compliance report with full section data (without PDF bytes).

Request

GET /admin/v1/reports/rpt-a1b2c3d4

Response (200 OK)

Same structure as the POST response above.

Error (404)

{
  "error": {
    "message": "Report not found: rpt-unknown",
    "type": "not_found_error",
    "code": "report_not_found"
  }
}

GET /admin/v1/reports/{id}/pdf

Downloads the compliance report as a PDF file.

Request

GET /admin/v1/reports/rpt-a1b2c3d4/pdf

Response (200 OK)

Returns Content-Type: application/pdf with Content-Disposition: attachment; filename=soc2-report-rpt-a1b2c3d4.pdf.

If the stored report has no cached PDF bytes, the PDF is re-rendered on the fly.

DELETE /admin/v1/reports/{id}

Deletes a compliance report.

Request

DELETE /admin/v1/reports/rpt-a1b2c3d4

Response (204 No Content)

Empty body.

POST /admin/v1/pii/detokenize

Detokenizes PII tokens in a text string, replacing {{PII_<TYPE>_<hex>}} placeholders with the original values. Requires enterprise license.

Request

POST /admin/v1/pii/detokenize
Content-Type: application/json

{
  "text": "Contact {{PII_EMAIL_a1b2c3d4}} regarding invoice {{PII_CREDIT_CARD_e5f6a7b8}}",
  "tenant_id": "acme-corp"
}

Request Fields

Field	Type	Required	Description
`text`	string	yes	Text containing PII tokens to detokenize
`tenant_id`	string	yes	Tenant ID for token lookup

Response (200 OK)

{
  "text": "Contact user@example.com regarding invoice 4111111111111111"
}

Tokens that cannot be resolved (expired or unknown) are left as-is in the output.
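
Since unresolved tokens are returned unchanged, a caller may want to check the output for leftover placeholders. The sketch below does this with a regular expression. The pattern is an assumption inferred from the examples in this section (uppercase type with underscores, lowercase hex suffix), and `unresolved_tokens` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Assumed token shape, inferred from the examples: {{PII_<TYPE>_<hex>}}
PII_TOKEN = re.compile(r"\{\{PII_([A-Z_]+)_([0-9a-f]+)\}\}")


def unresolved_tokens(text: str) -> List[Tuple[str, str]]:
    """Return (type, hex_id) pairs for every PII placeholder still in the text."""
    return [(m.group(1), m.group(2)) for m in PII_TOKEN.finditer(text)]


before = "Contact {{PII_EMAIL_a1b2c3d4}} regarding invoice {{PII_CREDIT_CARD_e5f6a7b8}}"
after = "Contact user@example.com regarding invoice {{PII_CREDIT_CARD_e5f6a7b8}}"

print(unresolved_tokens(before))
print(unresolved_tokens(after))
```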

RBAC

Requires org-admin or policy-admin role.

DELETE /admin/v1/pii/tokens/{tenantId}

Purges all stored PII tokens for a tenant. This is an irreversible operation — detokenization of previously redacted content will no longer be possible.

Request

DELETE /admin/v1/pii/tokens/acme-corp

Response (200 OK)

{
  "tenant_id": "acme-corp",
  "tokens_removed": 1542
}

RBAC

Requires org-admin role.

POST /admin/v1/webhooks

Creates a new webhook subscription. Webhooks receive HTTP POST callbacks when specified gateway events occur.

Request

POST /admin/v1/webhooks
Content-Type: application/json

{
  "name": "slack-alerts",
  "url": "https://hooks.slack.com/services/...",
  "secret": "whsec_my_signing_secret",
  "event_types": ["POLICY_DENIAL", "PII_DETECTED"],
  "tenant_id": "tenant-1",
  "description": "Slack alerts for governance events"
}

Request Fields

Field	Type	Required	Description
`name`	string	yes	Display name for the webhook
`url`	string	yes	HTTPS endpoint to receive webhook payloads
`secret`	string	yes	Signing secret for HMAC-SHA256 payload verification
`event_types`	array	yes	List of event types to subscribe to
`tenant_id`	string	no	Scope webhook to a specific tenant (null = all tenants)
`description`	string	no	Human-readable description

Supported Event Types

Event Type	Description
`POLICY_DENIAL`	A request was denied by a policy rule
`PII_DETECTED`	PII was detected in a request or response
`IP_ACCESS_DENIED`	A request was blocked by IP access control
`BUDGET_CAP_SOFT`	Soft budget cap threshold reached
`BUDGET_CAP_HARD`	Hard budget cap exceeded
`INJECTION_DETECTED`	Prompt injection detected
`AGENT_LOOP_DETECTED`	Agentic loop detected
`MCP_APPROVAL_REQUESTED`	MCP tool invocation requires human approval
`MCP_APPROVAL_TIMEOUT`	MCP approval request timed out
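
A client can catch common mistakes before calling the endpoint by checking the payload against the two tables above. This is a client-side sketch only. Whether the gateway enforces the same rules, and how it reports violations, is not stated here, and `validate_webhook` is a hypothetical helper.

```python
from typing import List

SUPPORTED_EVENT_TYPES = {
    "POLICY_DENIAL", "PII_DETECTED", "IP_ACCESS_DENIED",
    "BUDGET_CAP_SOFT", "BUDGET_CAP_HARD", "INJECTION_DETECTED",
    "AGENT_LOOP_DETECTED", "MCP_APPROVAL_REQUESTED", "MCP_APPROVAL_TIMEOUT",
}
REQUIRED_FIELDS = ("name", "url", "secret", "event_types")


def validate_webhook(payload: dict) -> List[str]:
    """Return a list of problems found in a webhook creation payload."""
    problems = [f"missing required field: {field}"
                for field in REQUIRED_FIELDS if not payload.get(field)]
    url = payload.get("url") or ""
    if url and not url.startswith("https://"):
        problems.append("url must be an HTTPS endpoint")
    for event_type in payload.get("event_types") or []:
        if event_type not in SUPPORTED_EVENT_TYPES:
            problems.append(f"unsupported event type: {event_type}")
    return problems


example = {
    "name": "slack-alerts",
    "url": "https://hooks.slack.com/services/...",
    "secret": "whsec_my_signing_secret",
    "event_types": ["POLICY_DENIAL", "PII_DETECTED"],
}
print(validate_webhook(example))  # []
```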

Response (201 Created)

The secret field is masked in all responses after creation.

{
  "id": "wh-abc123",
  "tenant_id": "tenant-1",
  "name": "slack-alerts",
  "description": "Slack alerts for governance events",
  "url": "https://hooks.slack.com/services/...",
  "secret": "whs***ret",
  "event_types": ["POLICY_DENIAL", "PII_DETECTED"],
  "status": "ACTIVE",
  "version": 1,
  "created_at": "2026-01-15T10:30:00Z",
  "updated_at": "2026-01-15T10:30:00Z"
}

GET /admin/v1/webhooks

Lists all webhook subscriptions, optionally filtered by tenant.

Request

GET /admin/v1/webhooks
GET /admin/v1/webhooks?tenant_id=tenant-1

Query Parameters

Parameter	Type	Required	Description
`tenant_id`	string	no	Filter webhooks by tenant ID

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "wh-abc123",
      "tenant_id": "tenant-1",
      "name": "slack-alerts",
      "description": "Slack alerts for governance events",
      "url": "https://hooks.slack.com/services/...",
      "secret": "whs***ret",
      "event_types": ["POLICY_DENIAL", "PII_DETECTED"],
      "status": "ACTIVE",
      "version": 1,
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ]
}

GET /admin/v1/webhooks/{id}

Returns a specific webhook subscription.

Request

GET /admin/v1/webhooks/wh-abc123

Response (200 OK)

Same structure as the webhook object in the list response.

Response (404 Not Found)

{
  "error": {
    "message": "Webhook not found: wh-unknown",
    "type": "not_found_error",
    "code": "webhook_not_found"
  }
}

PUT /admin/v1/webhooks/{id}

Updates a webhook subscription. Each update increments the version number.

Request

PUT /admin/v1/webhooks/wh-abc123
Content-Type: application/json

{
  "name": "slack-alerts-v2",
  "event_types": ["POLICY_DENIAL", "PII_DETECTED", "BUDGET_CAP_HARD"]
}

All fields are optional -- only provided fields are updated.

Response (200 OK)

Returns the updated webhook with incremented version.

Response (404 Not Found)

Returned when the webhook ID does not exist.

DELETE /admin/v1/webhooks/{id}

Deletes a webhook subscription.

Request

DELETE /admin/v1/webhooks/wh-abc123

Response (204 No Content)

Empty body.

Response (404 Not Found)

Returned when the webhook ID does not exist.

POST /admin/v1/webhooks/{id}/test

Sends a test event to the webhook endpoint to verify connectivity and configuration.

Request

POST /admin/v1/webhooks/wh-abc123/test
Content-Type: application/json

{
  "event_type": "POLICY_DENIAL"
}

Request Fields

Field	Type	Required	Description
`event_type`	string	yes	Event type to simulate for the test

Response (200 OK)

{
  "success": true,
  "http_status": 200,
  "error_message": null
}

If the target endpoint is unreachable or returns a non-2xx status:

{
  "success": false,
  "http_status": 503,
  "error_message": "Connection refused"
}

Response (404 Not Found)

Returned when the webhook ID does not exist.

GET /admin/v1/webhooks/{id}/deliveries

Returns delivery logs for a webhook, showing the history of event dispatches and their outcomes.

Request

GET /admin/v1/webhooks/wh-abc123/deliveries

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "dl-xyz789",
      "webhook_id": "wh-abc123",
      "event_id": "evt-456",
      "event_type": "POLICY_DENIAL",
      "status": "SUCCESS",
      "http_status": 200,
      "attempt_count": 1,
      "error_message": null,
      "created_at": "2026-01-15T10:31:00Z",
      "last_attempt_at": "2026-01-15T10:31:00Z"
    }
  ]
}

Delivery Status Values

Status	Description
`SUCCESS`	Payload delivered and target returned 2xx
`FAILED`	All delivery attempts exhausted without success
`PENDING`	Delivery is queued or being retried

Response (404 Not Found)

Returned when the webhook ID does not exist.

POST /v1/webhooks/actions/{action}

Processes an approval action for MCP tool invocations requiring human approval. This endpoint is unauthenticated -- the token query parameter is self-authenticating via HMAC signature.

Request

POST /v1/webhooks/actions/approve?token=<signed_token>
POST /v1/webhooks/actions/deny?token=<signed_token>

Path Parameters

Parameter	Type	Required	Description
`action`	string	yes	Action to take: `approve` or `deny`

Query Parameters

Parameter	Type	Required	Description
`token`	string	yes	HMAC-signed token encoding the approval context

Response (200 OK)

Returns a confirmation of the action taken.

Response (400 Bad Request)

Returned when the token is invalid, expired, or the action is not recognized.

Webhook Delivery Payload

When a subscribed event occurs, the gateway sends an HTTP POST to the webhook URL with the following structure.

Delivery Headers

Header	Description
`Content-Type`	`application/json`
`X-Gateway-Signature`	`sha256=<hex_hmac>` -- HMAC-SHA256 of the payload body using the webhook secret
`X-Gateway-Event`	The audit event type (e.g., `POLICY_DENIAL`)
`X-Gateway-Delivery`	Unique delivery ID for idempotency tracking

Payload Structure

{
  "id": "<delivery-uuid>",
  "webhook_id": "<webhook-id>",
  "timestamp": "<ISO-8601>",
  "type": "<event_type>",
  "tenant_id": "<tenant-id-or-null>",
  "data": { ... }
}

The data field contains the event-specific payload from the audit event.

For MCP_APPROVAL_REQUESTED events, the payload includes two additional fields:

{
  "id": "<delivery-uuid>",
  "webhook_id": "<webhook-id>",
  "timestamp": "<ISO-8601>",
  "type": "MCP_APPROVAL_REQUESTED",
  "tenant_id": "<tenant-id>",
  "data": { ... },
  "approve_url": "https://gateway.example.com/v1/webhooks/actions/approve?token=<signed_token>",
  "deny_url": "https://gateway.example.com/v1/webhooks/actions/deny?token=<signed_token>"
}
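
Because each delivery carries a unique ID in `X-Gateway-Delivery`, a receiver can ignore repeated deliveries of the same event. The sketch below keeps seen IDs in memory. A production receiver would more likely use a persistent store with expiry, and `DeliveryDeduplicator` is a hypothetical name rather than part of the gateway.

```python
import json
from typing import Optional


class DeliveryDeduplicator:
    """Processes each delivery ID at most once (in-memory sketch)."""

    def __init__(self) -> None:
        self._seen = set()

    def handle(self, delivery_id: str, raw_body: bytes) -> Optional[str]:
        """Return the event type for a new delivery, or None for a repeat."""
        if delivery_id in self._seen:
            return None  # already processed; acknowledge without side effects
        self._seen.add(delivery_id)
        event = json.loads(raw_body)
        return event["type"]


receiver = DeliveryDeduplicator()
body = b'{"id": "d-1", "webhook_id": "wh-abc123", "type": "POLICY_DENIAL", "data": {}}'
first = receiver.handle("d-1", body)   # new delivery
second = receiver.handle("d-1", body)  # same delivery ID again
print(first, second)
```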

Verifying Webhook Signatures

To verify the authenticity of a webhook delivery, compute the HMAC-SHA256 of the raw request body using your webhook secret and compare it to the value in the X-Gateway-Signature header:

expected = "sha256=" + hex(hmac_sha256(webhook_secret, request_body))
actual   = request.headers["X-Gateway-Signature"]
secure_compare(expected, actual)

Always use constant-time comparison to prevent timing attacks.
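
In Python, the pseudocode above could look like the following sketch, which uses the standard `hmac` and `hashlib` modules. It assumes the secret is UTF-8 encoded and the hex digest is lowercase, neither of which is stated explicitly here. `sign_payload` and `verify_signature` are hypothetical helper names.

```python
import hashlib
import hmac


def sign_payload(secret: str, raw_body: bytes) -> str:
    """Compute the expected X-Gateway-Signature value for a raw request body."""
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_payload(secret, raw_body)
    # Compare as bytes so that non-ASCII header values cannot raise a TypeError.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature_header.encode("utf-8"))


body = b'{"type": "POLICY_DENIAL"}'
header = sign_payload("whsec_my_signing_secret", body)
print(verify_signature("whsec_my_signing_secret", body, header))         # True
print(verify_signature("whsec_my_signing_secret", body + b" ", header))  # False
```

Verify against the raw bytes of the request body as received. Re-serializing parsed JSON can change whitespace or key order and invalidate the signature.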

POST /admin/v1/mcp/servers

Registers a new MCP server in the gateway registry.

Request

POST /admin/v1/mcp/servers
Content-Type: application/json

{
  "server_id": "code-search",
  "tenant_id": "tenant-1",
  "transport": "HTTP_SSE",
  "url": "https://mcp.example.com/sse",
  "credential_ref": "vault://secret/mcp/code-search",
  "tags": ["search", "code"]
}

Request Fields

Field	Type	Required	Description
`server_id`	string	Yes	Human-readable slug, unique per tenant
`tenant_id`	string	Yes	Tenant that owns this server
`transport`	string	Yes	Transport protocol: `HTTP_SSE`
`url`	string	Yes	MCP server endpoint URL
`credential_ref`	string	No	Vault path for server credentials
`tags`	string[]	No	Tags for categorization

Response (201 Created)

Returns the created MCP server with a Location header pointing to the new resource.

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "server_id": "code-search",
  "tenant_id": "tenant-1",
  "transport": "HTTP_SSE",
  "url": "https://mcp.example.com/sse",
  "credential_ref": "vault://secret/mcp/code-search",
  "tags": ["search", "code"],
  "status": "ACTIVE",
  "tool_catalog": [],
  "version": 1,
  "created_at": "2026-01-15T10:00:00Z",
  "updated_at": "2026-01-15T10:00:00Z"
}

Response (409 Conflict)

Returned when server_id already exists for the given tenant.

GET /admin/v1/mcp/servers

Lists all registered MCP servers. Supports optional ?tenant_id= query parameter to filter by tenant.

Request

GET /admin/v1/mcp/servers
GET /admin/v1/mcp/servers?tenant_id=tenant-1

Response (200 OK)

{
  "object": "list",
  "data": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "server_id": "code-search",
      "tenant_id": "tenant-1",
      "transport": "HTTP_SSE",
      "url": "https://mcp.example.com/sse",
      "status": "ACTIVE",
      "tool_catalog": [],
      "version": 1,
      "created_at": "2026-01-15T10:00:00Z",
      "updated_at": "2026-01-15T10:00:00Z"
    }
  ]
}

GET /admin/v1/mcp/servers/{id}

Returns a specific MCP server by its internal ID.

Request

GET /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890

Response (200 OK)

Same format as the MCP server object in the list response.

Response (404 Not Found)

Returned when the MCP server ID does not exist.

PUT /admin/v1/mcp/servers/{id}

Updates an MCP server. Only provided fields are updated. Each update increments the version number.

Request

PUT /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890
Content-Type: application/json

{
  "url": "https://mcp.new-url.com/sse",
  "status": "SUSPENDED"
}

Response (200 OK)

Returns the updated MCP server with incremented version.

Response (404 Not Found)

Returned when the MCP server ID does not exist.

DELETE /admin/v1/mcp/servers/{id}

Deletes an MCP server registration.

Request

DELETE /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890

Response (204 No Content)

Response (404 Not Found)

Returned when the MCP server ID does not exist.

GET /admin/v1/mcp/servers/{id}/health

Checks live connectivity to the MCP server.

Request

GET /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890/health

Response (200 OK)

{
  "reachable": true,
  "http_status": 200,
  "latency_ms": 42,
  "error_message": null,
  "checked_at": "2026-01-15T10:05:00Z"
}

Note: Without an enterprise license, health checks always return reachable: false with an error message indicating that the feature requires an enterprise license.

POST /admin/v1/mcp/servers/{id}/tools/sync

Fetches the tool catalog from the MCP server and caches it locally.

Request

POST /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890/tools/sync

Response (200 OK)

{
  "tools_count": 2,
  "synced_at": "2026-01-15T10:05:00Z",
  "tools": [
    {
      "name": "search",
      "description": "Search code repositories"
    },
    {
      "name": "read_file",
      "description": "Read a file from the repository"
    }
  ]
}

Response (400 Bad Request)

Without a license, tool sync returns a 400 with error code mcp_not_available.

POST /admin/v1/mcp/servers/{id}/credentials/rotate

Rotates the credential reference for an MCP server. Evicts the old credential from cache, optionally sets a new credential reference, and validates the new credential is resolvable. Requires enterprise license and org-admin role.

Request

POST /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890/credentials/rotate
Content-Type: application/json

{
  "new_credential_ref": "vault://secret/mcp/new-path"
}

The new_credential_ref field is optional. If omitted, the existing credential reference is kept but the cache is evicted (forcing a fresh fetch from vault on the next request).

Response (200 OK)

{
  "success": true,
  "message": "Credential rotated successfully",
  "old_credential_ref": "vault://secret/mcp/old-path",
  "new_credential_ref": "vault://secret/mcp/new-path"
}

Response (400 Bad Request)

Without a license, returns 400 with error code mcp_not_available.

POST /admin/v1/mcp/servers/{id}/credentials/invalidate

Immediately evicts the cached credential for an MCP server. The next request will trigger a fresh fetch from the vault. Requires enterprise license and org-admin role.

Request

POST /admin/v1/mcp/servers/a1b2c3d4-e5f6-7890-abcd-ef1234567890/credentials/invalidate

Response (204 No Content)

No response body.

Response (404 Not Found)

Returns 404 with error code mcp_server_not_found if the server ID does not exist.

MCP Proxy Endpoints (port 8070)

The MCP Proxy is a standalone Spring Boot application (mcp-proxy-server) that runs on port 8070. It proxies requests from AI agents to registered MCP servers, injecting credentials from vault and applying a governance filter chain.

Enterprise license required

All MCP proxy endpoints require a valid signed JWT license key via GATEWAY_ENTERPRISE_LICENSE_KEY.

POST /mcp/{serverId}/tools/call

Invokes a tool on the specified MCP server. The proxy resolves the server from the registry, injects credentials, and forwards the request.

Request

POST /mcp/code-search/tools/call
Authorization: Bearer gw_<api-key>
Content-Type: application/json
X-Trace-Id: abc123
X-Session-Id: session-456

{
  "name": "search_code",
  "arguments": {
    "query": "authentication handler",
    "language": "java"
  }
}

Headers

Header	Required	Description
`Authorization`	Yes	Bearer token with a valid API key (same keys as LLM Gateway)
`X-Trace-Id`	No	Trace ID for distributed tracing; auto-generated if absent
`X-Session-Id`	No	Session ID for agent chain tracing; stored in audit context

Response (200 OK)

Returns the upstream MCP server response body as-is:

{
  "content": [
    {
      "type": "text",
      "text": "Found 3 matching files..."
    }
  ]
}

The response includes the X-Trace-Id header (echoed from the request, or auto-generated if absent).

Error Responses

Status	Code	Description
401	`mcp_auth_required`	Missing Bearer token
401	`mcp_auth_invalid`	Invalid API key
401	`mcp_auth_revoked`	API key has been revoked
403	`mcp_policy_denied`	Request denied by policy (server/tool/arg rule)
404	`mcp_server_not_found`	Server ID not found for this tenant
503	`mcp_server_unavailable`	Server is suspended or disabled
502	`mcp_upstream_error`	Upstream MCP server returned an error
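
As a sketch of how a client might assemble this call, the example below builds the request with Python's standard `urllib` without sending it. The host `localhost` is an assumption (only the port, 8070, is documented), and `build_tool_call` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Optional


def build_tool_call(server_id: str, tool_name: str, arguments: dict, api_key: str,
                    base_url: str = "http://localhost:8070",  # assumed host
                    session_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a tools/call request for the MCP proxy."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if session_id:
        headers["X-Session-Id"] = session_id
    body = json.dumps({"name": tool_name, "arguments": arguments}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/mcp/{server_id}/tools/call",
        data=body, headers=headers, method="POST",
    )


req = build_tool_call(
    "code-search", "search_code",
    {"query": "authentication handler", "language": "java"},
    api_key="gw_example",  # placeholder key for illustration
    session_id="session-456",
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it. Non-2xx responses raise `urllib.error.HTTPError`, whose body carries the error envelope.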

POST /mcp/{serverId}/tools/list

Lists tools available on the specified MCP server by forwarding to the server's tools/list endpoint.

Request

POST /mcp/code-search/tools/list
Authorization: Bearer gw_<api-key>
Content-Type: application/json

{}

Response (200 OK)

Returns the upstream MCP server's tool listing:

{
  "tools": [
    {
      "name": "search_code",
      "description": "Search code repositories",
      "inputSchema": { "type": "object", "properties": { "query": { "type": "string" } } }
    }
  ]
}

POST /mcp/{serverId}/resources/{path}

Accesses a resource on the specified MCP server.

Request

POST /mcp/file-server/resources/read
Authorization: Bearer gw_<api-key>
Content-Type: application/json

{
  "uri": "file:///data/report.csv"
}

POST /mcp/{serverId}/prompts/{path}

Accesses a prompt template on the specified MCP server.

Request

POST /mcp/prompt-server/prompts/summarize
Authorization: Bearer gw_<api-key>
Content-Type: application/json

{
  "arguments": {
    "text": "Long document text..."
  }
}

MCP Proxy Error Envelope

All MCP proxy errors use a unified error envelope with the mcp_error type:

{
  "error": {
    "message": "MCP server not found: code-search",
    "type": "mcp_error",
    "code": "mcp_server_not_found",
    "trace_id": "abc123"
  }
}

All error codes are lowercased and prefixed with mcp_.
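
A client can turn this envelope into a typed exception. The sketch below parses the documented fields. The `retryable` flag is a client-side policy assumption (treating upstream and availability errors as transient), not something the proxy specifies, and `McpProxyError` and `parse_mcp_error` are hypothetical names.

```python
import json

# Assumption: which codes are worth retrying is a client-side choice.
RETRYABLE_CODES = {"mcp_upstream_error", "mcp_server_unavailable"}


class McpProxyError(Exception):
    """Client-side representation of the MCP proxy error envelope."""

    def __init__(self, status: int, code: str, message: str, trace_id=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.trace_id = trace_id
        self.retryable = code in RETRYABLE_CODES


def parse_mcp_error(status: int, body: str) -> McpProxyError:
    """Build an McpProxyError from an HTTP status and a response body."""
    err = json.loads(body)["error"]
    return McpProxyError(status, err["code"], err["message"], err.get("trace_id"))


sample_body = """
{
  "error": {
    "message": "MCP server not found: code-search",
    "type": "mcp_error",
    "code": "mcp_server_not_found",
    "trace_id": "abc123"
  }
}
"""
error = parse_mcp_error(404, sample_body)
print(error, error.trace_id, error.retryable)
```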

MCP Proxy Metrics

The MCP proxy exposes Prometheus metrics at GET /actuator/prometheus:

Metric	Type	Labels
`mcp_requests_total`	Counter	tenant, server_id, operation, status
`mcp_latency_seconds`	Timer (P50/P95/P99)	tenant, server_id, operation
`mcp_errors_total`	Counter	server_id, error_code

Credential Encryption

Provider API keys can be encrypted at rest using AES-256-GCM. To use encrypted credentials:

1. Set the master password via environment variable: GATEWAY_ENCRYPTION_MASTER_PASSWORD=<password>
2. Encrypt your API key using the AesEncryptor utility.
3. Set the encrypted value with an ENC: prefix: OPENAI_API_KEY=ENC:<base64-ciphertext>

The gateway transparently decrypts ENC:-prefixed values at runtime using the configured master password. Enterprise deployments can replace the built-in SecretProvider with vault-backed implementations (HashiCorp Vault, AWS Secrets Manager, etc.) for zero-plaintext credential handling.

POST /mcp/{serverId}/resources/{path}
- Request
POST /mcp/{serverId}/prompts/{path}
- Request
MCP Proxy Error Envelope
MCP Proxy Metrics
Credential Encryption

OpenAPI Specification

Common Headers

POST /v1/chat/completions

Request

Request Fields

Message Object

Response (200 OK)

Streaming Response

POST /v1/embeddings

Request

Request Fields

Response (200 OK)

GET /v1/models

Request

Response (200 OK)

Capability Fields

POST /v1/completions (Legacy)

Request

Request Fields

Response (200 OK)

GET /

GET /try

GET /status

GET /actuator/gateway-status

POST /admin/v1/tenants

Request

Request Fields

Response (201 Created)

GET /admin/v1/tenants

Request

Response (200 OK)

GET /admin/v1/tenants/{id}

Request

Response (200 OK)

Response (404 Not Found)

PUT /admin/v1/tenants/{id}

Request

Request Fields

Response (200 OK)

Response (404 Not Found)

DELETE /admin/v1/tenants/{id}

Request

Response (204 No Content)

Response (404 Not Found)

POST /admin/v1/tenants/{tenantId}/keys

Request

Request Fields

Response (201 Created)

Response (404 Not Found)

GET /admin/v1/tenants/{tenantId}/keys

Request

Response (200 OK)

GET /admin/v1/tenants/{tenantId}/keys/{keyId}

Request

Response (200 OK)

Response (404 Not Found)

DELETE /admin/v1/tenants/{tenantId}/keys/{keyId}

Request

Response (204 No Content)

Response (404 Not Found)

POST /admin/v1/tenants/{tenantId}/keys/{keyId}/rotate

Request

Response (201 Created)

Response (404 Not Found)

GET /admin/v1/providers/{id}/capabilities

Request

Response (200 OK)

Response (404 Not Found)

POST /admin/v1/routes

Request

Request Fields

Response (201 Created)

GET /admin/v1/routes

Request

Response (200 OK)

GET /admin/v1/routes/{id}

Request

Response (200 OK)

Response (404 Not Found)

PUT /admin/v1/routes/{id}