Agentic AI Governance

DVARA provides a comprehensive governance layer for multi-agent workflows — covering tool call visibility, session tracking, loop detection, and human-in-the-loop approval gates. These controls run in the MCP filter chain, are configurable per tenant, and produce forensic-grade audit events.

How It Works

Agentic governance runs as a chain of stages in the MCP Proxy. Every agent tool call passes through authentication, rate limiting, tenant resolution, registry lookup, budget enforcement, policy evaluation, loop detection, approval gating, PII scanning, injection scanning, and audit recording before reaching the upstream MCP server.

1. MCP Tool Call Visibility

Every MCP tool invocation is logged as a tool-call record with full context: actor, tool name, server, latency, HTTP status, PII flags, and policy decision. Records are queryable via admin API and visible in the DVARA Flightdeck.

Admin API

# List tool calls (filterable); URLs are quoted so the shell does not glob on "?"
curl "http://localhost:8090/v1/admin/mcp/tool-calls?tenant_id=acme"

# Filter by server or tool
curl "http://localhost:8090/v1/admin/mcp/tool-calls?server_id=code-search"
curl "http://localhost:8090/v1/admin/mcp/tool-calls?tool_name=search"

# Aggregated summary by server + tool
curl "http://localhost:8090/v1/admin/mcp/tool-calls/summary?tenant_id=acme"

Tool Call Record Fields

| Field | Description |
| --- | --- |
| id | Unique record identifier |
| tenant_id | Tenant that owns the request |
| session_id | Agent session identifier |
| trace_id | Distributed trace identifier |
| server_id | MCP server that handled the call |
| tool_name | Tool that was invoked |
| operation | MCP operation (e.g. tools/call) |
| user_id | Authenticated user (if available) |
| policy_decision | Policy engine result (ALLOW / DENY) |
| http_status | Upstream HTTP response status |
| latency_ms | End-to-end latency |
| response_bytes | Response payload size |
| is_error | Whether the call resulted in an error |
| error_code | Error code (if error) |
| pii_in_args | PII detected in request arguments |
| pii_in_response | PII detected in response |
| timestamp | When the call occurred |

Summary Response

The /summary endpoint aggregates tool calls by server and tool name:

{
  "data": [
    {
      "server_id": "code-search",
      "tool_name": "search",
      "total_calls": 142,
      "error_count": 3,
      "avg_latency_ms": 45.2
    }
  ]
}
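
As a rough illustration of consuming this response, the sketch below ranks server/tool pairs by error rate. It assumes only the payload shape shown above; the function name `error_rates` and the `sample` variable are hypothetical and not part of DVARA.

```python
def error_rates(summary: dict) -> list[tuple[str, str, float]]:
    """Return (server_id, tool_name, error_rate) tuples, highest error rate first."""
    rows = []
    for entry in summary.get("data", []):
        total = entry["total_calls"]
        # Guard against division by zero for pairs with no recorded calls.
        rate = entry["error_count"] / total if total else 0.0
        rows.append((entry["server_id"], entry["tool_name"], rate))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Sample payload copied from the documented summary response.
sample = {
    "data": [
        {
            "server_id": "code-search",
            "tool_name": "search",
            "total_calls": 142,
            "error_count": 3,
            "avg_latency_ms": 45.2,
        }
    ]
}

for server_id, tool_name, rate in error_rates(sample):
    print(f"{server_id}::{tool_name} error rate {rate:.1%}")
```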

Prometheus Metrics

| Metric | Type | Labels | Notes |
| --- | --- | --- | --- |
| mcp_tool_calls_total | Counter | tenant, server_id, tool_name, status | Incremented on every MCP tool call. |
| mcp_tool_call_latency_seconds | Histogram | server_id, tool_name | Tool call latency histogram. |
| mcp_agent_loop_detected_total | Counter | tenant, loop_type | Incremented when loop detection fires (repetition, cycle, or rate). |
| mcp_agent_sessions_killed_total | Counter | tenant | Incremented when an agent session is killed. |

2. Multi-Agent Chain Tracing

Sessions track the complete lifecycle of an agent's interaction — from first tool call to last. Each session aggregates tool call counts, error counts, latency, and the set of distinct servers and tools used.

How Sessions Work

  1. After every successful tool call, the gateway records the call against a session keyed by the X-Session-Id header.
  2. Sessions are marked "active" based on a configurable TTL (default: 60 minutes since last activity).
  3. Killed sessions are immediately blocked — future tool calls for that session return 403.
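
The three rules above can be sketched as a small in-memory tracker. This is an illustration only, not the gateway's implementation: the class and method names are hypothetical, and treating a session as active while the idle time is less than or equal to the TTL is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    first_seen: float
    last_seen: float
    tool_call_count: int = 0
    killed: bool = False


class SessionTracker:
    """Hypothetical tracker keyed by the X-Session-Id header value."""

    def __init__(self, ttl_seconds: float = 60 * 60):
        self.ttl_seconds = ttl_seconds
        self._sessions: dict[str, Session] = {}

    def admit(self, session_id: str) -> int:
        """Status decided before forwarding: 403 for a killed session, else 200."""
        session = self._sessions.get(session_id)
        return 403 if session is not None and session.killed else 200

    def record(self, session_id: str, now: Optional[float] = None) -> None:
        """Record one tool call against the session and refresh last_seen."""
        now = time.time() if now is None else now
        session = self._sessions.setdefault(session_id, Session(now, now))
        session.last_seen = now
        session.tool_call_count += 1

    def is_active(self, session_id: str, now: Optional[float] = None) -> bool:
        """Active means: known, not killed, and seen within the TTL."""
        now = time.time() if now is None else now
        session = self._sessions.get(session_id)
        if session is None or session.killed:
            return False
        return now - session.last_seen <= self.ttl_seconds

    def kill(self, session_id: str, now: Optional[float] = None) -> None:
        """Mark the session killed so later calls are rejected."""
        now = time.time() if now is None else now
        session = self._sessions.setdefault(session_id, Session(now, now))
        session.killed = True
```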

Admin API

# List all sessions
curl http://localhost:8090/v1/admin/sessions

# Filter by tenant or active status
curl "http://localhost:8090/v1/admin/sessions?tenant_id=acme"
curl "http://localhost:8090/v1/admin/sessions?active=true"

# Get session detail
curl http://localhost:8090/v1/admin/sessions/sess-abc123

# Get session timeline (tool call history)
curl http://localhost:8090/v1/admin/sessions/sess-abc123/timeline

# Kill a session (blocks future tool calls)
curl -X POST http://localhost:8090/v1/admin/sessions/sess-abc123/kill

Session Response

{
  "session_id": "sess-abc123",
  "tenant_id": "acme",
  "first_seen": "2026-03-05T10:00:00Z",
  "last_seen": "2026-03-05T10:05:30Z",
  "tool_call_count": 15,
  "error_count": 1,
  "total_latency_ms": 2340,
  "distinct_servers": ["code-search", "database"],
  "distinct_tools": ["search", "query", "read_file"],
  "active": true
}

Kill Switch

When a session is killed via POST /v1/admin/sessions/{id}/kill:

  • An AGENT_SESSION_KILLED audit event is written.
  • All subsequent MCP requests for that session ID receive 403 Forbidden with error code mcp_agent_session_killed in the response body.
  • The session is marked inactive.

The response body follows the standard MCP error envelope: {"error": {"code": "mcp_agent_session_killed", "type": "mcp_error", "message": "…", "trace_id": "…"}}.

Configuration

dvara:
  mcp-gateway:
    agentic:
      enabled: true
      session-ttl-minutes: 60 # Active session TTL
      session-max-capacity: 10000 # Max tracked sessions

Audit Events

| Event Type | When |
| --- | --- |
| AGENT_SESSION_KILLED | Session terminated via kill API |

3. Agent Loop Detection & Kill Switch

The loop detector monitors agent behavior per session and triggers when it detects repetitive, cyclical, or excessive tool call patterns. This prevents runaway agents from consuming unlimited tokens or causing unintended side effects.

Detection Patterns

| Pattern | Description | Default Threshold |
| --- | --- | --- |
| Repetition | Same tool called N times consecutively | 5 consecutive calls |
| Cycle | A→B→A→B repeating pattern detected | Pattern length 2–4, repeated 3+ times |
| Rate | Exceeds maximum calls per minute in session | 60 calls/minute |

How It Works

  1. The loop detector evaluates every MCP request against per-session history.
  2. Per-session history is maintained in a circular buffer (default: 100 entries).
  3. Three detection algorithms run on each call:
    • Repetition: counts consecutive identical tool keys (serverId::toolName).
    • Cycle: scans for repeating patterns of length 2–4 in the call history.
    • Rate: counts calls within a 60-second sliding window.
  4. On detection, the gateway writes an AGENT_LOOP_DETECTED audit event, dispatches a webhook, optionally kills the session (when auto-kill is enabled), and returns 429 Too Many Requests.
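
To make the three checks concrete, here is a minimal sketch of a per-session detector using the documented defaults. It is an approximation under stated assumptions, not the gateway's code: the class name, the order in which the checks run, and the exact cycle-matching rule (the tail of the history must equal a pattern of length 2 to 4 repeated back to back) are all assumptions.

```python
from collections import deque
from typing import Optional


class LoopDetector:
    """Hypothetical detector holding the history for a single session."""

    def __init__(self, repetition_threshold=5, cycle_max_length=4,
                 cycle_repetitions=3, max_calls_per_minute=60, history_size=100):
        self.repetition_threshold = repetition_threshold
        self.cycle_max_length = cycle_max_length
        self.cycle_repetitions = cycle_repetitions
        self.max_calls_per_minute = max_calls_per_minute
        # Bounded buffer: the oldest entries fall off once history_size is reached.
        self.history = deque(maxlen=history_size)

    def observe(self, server_id: str, tool_name: str, now: float) -> Optional[str]:
        """Record one call and return the loop type that fired, or None."""
        self.history.append((now, f"{server_id}::{tool_name}"))
        keys = [key for _, key in self.history]
        if self._repetition(keys):
            return "repetition"
        if self._cycle(keys):
            return "cycle"
        if self._rate(now):
            return "rate"
        return None

    def _repetition(self, keys) -> bool:
        # The last N tool keys are all identical.
        n = self.repetition_threshold
        return len(keys) >= n and len(set(keys[-n:])) == 1

    def _cycle(self, keys) -> bool:
        # The tail equals a multi-tool pattern of length 2..max repeated R times.
        for length in range(2, self.cycle_max_length + 1):
            span = length * self.cycle_repetitions
            if len(keys) < span:
                continue
            tail = keys[-span:]
            pattern = tail[:length]
            if len(set(pattern)) > 1 and tail == pattern * self.cycle_repetitions:
                return True
        return False

    def _rate(self, now: float) -> bool:
        # Count calls inside the trailing 60-second window.
        recent = sum(1 for ts, _ in self.history if now - ts < 60)
        return recent > self.max_calls_per_minute
```

With the defaults, the fifth consecutive call to the same tool triggers "repetition", and the sixth call of an A, B, A, B, A, B sequence triggers "cycle".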

Configuration

dvara:
  mcp-gateway:
    agentic:
      loop-detection:
        enabled: true
        repetition-threshold: 5 # Consecutive same-tool threshold
        cycle-max-length: 4 # Max cycle pattern length to check
        cycle-repetitions: 3 # Required repetitions to trigger
        max-calls-per-minute: 60 # Rate limit per session
        auto-kill: false # Auto-kill session on detection
        history-size: 100 # Per-session history buffer

Per-Tenant Configuration

Override global settings via tenant metadata:

| Metadata Key | Type | Description |
| --- | --- | --- |
| agentic.loop-detection.enabled | boolean | Enable/disable for tenant |
| agentic.loop-detection.repetition-threshold | int | Override repetition threshold |
| agentic.loop-detection.max-calls-per-minute | int | Override rate limit |
| agentic.loop-detection.auto-kill | boolean | Override auto-kill |
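
One way to picture the override behavior is an overlay of tenant metadata on top of the global defaults. The sketch below assumes that only the four keys listed above are overridable and that metadata values already carry the right types; `effective_loop_config` is a hypothetical helper, not a DVARA API.

```python
# Global defaults taken from the configuration block above.
GLOBAL_DEFAULTS = {
    "enabled": True,
    "repetition-threshold": 5,
    "max-calls-per-minute": 60,
    "auto-kill": False,
}
PREFIX = "agentic.loop-detection."


def effective_loop_config(tenant_metadata: dict) -> dict:
    """Overlay recognized tenant metadata keys on the global defaults."""
    config = dict(GLOBAL_DEFAULTS)
    for key, value in tenant_metadata.items():
        setting = key[len(PREFIX):] if key.startswith(PREFIX) else None
        # Ignore metadata that is unrelated to loop detection.
        if setting in config:
            config[setting] = value
    return config


cfg = effective_loop_config({
    "agentic.loop-detection.repetition-threshold": 10,
    "approval.timeout-seconds": 600,
})
print(cfg)
```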

Error Response

When a loop is detected, the response follows the standard MCP error envelope:

{
  "error": {
    "code": "mcp_agent_loop_detected",
    "type": "mcp_error",
    "message": "Tool 'code-search::search' called 5 times consecutively",
    "trace_id": "abc123..."
  }
}

HTTP Status: 429 Too Many Requests. All agentic MCP error responses (loop, session-killed, approval-denied, approval-timeout) follow the same envelope shape — lowercase code, type: "mcp_error", and trace_id included.
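
Because the four agentic errors share one envelope, an agent client can branch on error.code alone. The sketch below shows one possible mapping; the error codes come from this page, while the action labels and the function name are hypothetical client-side choices.

```python
# Error codes are documented on this page; the action labels are illustrative.
ACTIONS = {
    "mcp_agent_loop_detected": "back_off",
    "mcp_agent_session_killed": "stop_session",
    "mcp_approval_denied": "abandon_call",
    "mcp_approval_timeout": "request_again_later",
}


def classify_mcp_error(body: dict) -> str:
    """Map an MCP error envelope to a client-side action label."""
    error = body.get("error", {})
    if error.get("type") != "mcp_error":
        # Other envelopes (for example rate_limit_error) are handled elsewhere.
        return "unknown"
    return ACTIONS.get(error.get("code", ""), "unknown")


loop_body = {
    "error": {
        "code": "mcp_agent_loop_detected",
        "type": "mcp_error",
        "message": "Tool 'code-search::search' called 5 times consecutively",
        "trace_id": "abc123...",
    }
}
print(classify_mcp_error(loop_body))
```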

Auto-Kill

When auto-kill: true is configured, the loop detector automatically kills the offending session upon detection. This means:

  • The current request returns 429 with AGENT_LOOP_DETECTED (the only audit event written at the trigger point)
  • The session is marked killed in the session tracker
  • All future requests for that session return 403 mcp_agent_session_killed — but no further audit event is written from those blocked requests; the AGENT_LOOP_DETECTED event already captured the trigger

The AGENT_SESSION_KILLED audit event is written only when an operator explicitly kills a session via the admin API (POST /v1/admin/sessions/{id}/kill) — not on the auto-kill path. To correlate auto-kills in audit logs, filter for AGENT_LOOP_DETECTED with auto-kill: true configured.

Audit Events

Event TypePayload Fields
AGENT_LOOP_DETECTEDloop_type, session_id, server_id, tool_name, trace_id, message

Prometheus Metrics

Dedicated counters are available for loop detection and session kills:

sum(rate(mcp_agent_loop_detected_total[5m]))
sum(rate(mcp_agent_sessions_killed_total[5m]))

Use the loop_type label on mcp_agent_loop_detected_total to distinguish repetition, cycle, and rate-limit triggers. Audit events (AGENT_LOOP_DETECTED, AGENT_SESSION_KILLED) carry the full forensic detail for incident reconstruction.


4. Human-in-the-Loop Approval Gates

Approval gates allow you to require human sign-off before high-risk tool calls are executed. When a tool call matches an approval rule, the request blocks until a human approves or denies it, or until a timeout expires.

How It Works

  1. The approval gate evaluates every MCP request against tenant-specific approval rules.
  2. Rules match on tool name patterns (glob) and/or server IDs.
  3. When matched:
    • MCP_APPROVAL_REQUESTED audit event is written.
    • A webhook is dispatched with approve/deny URLs carrying HMAC-signed tokens.
    • The request blocks until a decision arrives or the timeout fires.
  4. The webhook recipient (Slack bot, approval UI, etc.) calls the approve/deny URL.
  5. The gateway validates the HMAC token and records the decision.
  6. The blocked request resumes with the decision.
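
Steps 3 to 5 rely on HMAC-signed tokens. The token layout is not specified on this page, so the sketch below invents one purely for illustration: a base64url-encoded JSON payload, a dot, and a hex HMAC-SHA256 signature. The field names (`event_id`, `action`, `expires_at`) and both function names are assumptions.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def sign_token(secret: bytes, payload: dict) -> str:
    """Build '<base64url(json)>.<hex hmac>' (hypothetical token layout)."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{signature}"


def verify_token(secret: bytes, token: str, now: float) -> Optional[dict]:
    """Return the payload if the signature is valid and unexpired, else None."""
    try:
        body, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # no separator, so not a token in this layout
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload.get("expires_at", 0) < now:
        return None
    return payload


secret = b"example-shared-secret"
token = sign_token(secret, {"event_id": "e1", "action": "approve", "expires_at": 100})
print(verify_token(secret, token, now=50))
```

Verifying the signature before decoding the payload means a tampered token is rejected without its contents ever being parsed.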

Approval Flow

Agent → MCP Proxy → approval gate
  ├─ rules match → approval required
  ├─ audit(MCP_APPROVAL_REQUESTED)
  ├─ webhook dispatch (approve/deny URLs)
  └─ wait for decision
       ├─ approve → forward to upstream → 200
       ├─ deny → 403 mcp_approval_denied
       └─ timeout → 408 mcp_approval_timeout

Configuration

dvara:
  mcp-gateway:
    agentic:
      approval:
        enabled: true
        timeout-seconds: 300 # Wait timeout (5 minutes)
        default-action: deny # Action on timeout: "deny" or "approve"
        max-pending-approvals: 1000 # Max concurrent pending approvals

Per-Tenant Configuration

Approval rules are configured via tenant metadata:

| Metadata Key | Type | Description |
| --- | --- | --- |
| approval.required-tools | string | Comma-separated glob patterns (e.g. "database_*,file_write") |
| approval.required-servers | string | Comma-separated server IDs |
| approval.timeout-seconds | int | Override global timeout |
| approval.default-action | string | Override timeout action ("deny" or "approve") |

Example: Require Approval for Database Writes

Set the following tenant metadata:

{
  "approval.required-tools": "db_write,db_delete,database_*",
  "approval.required-servers": "production-db",
  "approval.timeout-seconds": 600,
  "approval.default-action": "deny"
}

Any tool call matching db_write, db_delete, or database_* patterns, or targeting the production-db server, will require human approval.
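
The matching rule described here (any tool pattern matches, or the server is listed) can be approximated with Python's `fnmatch` module. This is a sketch that assumes the gateway's glob syntax behaves like shell-style wildcards; the helper names and the `tool:` / `server:` labels in the return value are illustrative.

```python
from fnmatch import fnmatchcase


def split_csv(value: str) -> list[str]:
    """Split a comma-separated metadata value, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def matching_rules(metadata: dict, server_id: str, tool_name: str) -> list[str]:
    """Return the rules that match; a non-empty list means approval is required."""
    rules = []
    for pattern in split_csv(metadata.get("approval.required-tools", "")):
        # Case-sensitive shell-style glob match on the tool name.
        if fnmatchcase(tool_name, pattern):
            rules.append(f"tool:{pattern}")
    if server_id in split_csv(metadata.get("approval.required-servers", "")):
        rules.append(f"server:{server_id}")
    return rules


tenant_metadata = {
    "approval.required-tools": "db_write,db_delete,database_*",
    "approval.required-servers": "production-db",
}
print(matching_rules(tenant_metadata, "production-db", "database_query"))
print(matching_rules(tenant_metadata, "staging", "read_file"))
```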

Webhook Payload

When approval is required, a webhook is dispatched with approve/deny action URLs. The payload structure (verbatim from WebhookPayloadBuilder):

{
  "id": "<delivery-uuid>",
  "webhook_id": "<webhook id>",
  "timestamp": "2026-03-05T10:00:00Z",
  "type": "MCP_APPROVAL_REQUESTED",
  "tenant_id": "acme",
  "data": {
    "event_id": "<event-uuid>",
    "server_id": "production-db",
    "tool_name": "db_delete",
    "session_id": "<session-uuid>",
    "trace_id": "<trace-id>",
    "user_id": "<user-id-if-available>",
    "matched_rules": "[tool:database_*, server:production-db]"
  },
  "approve_url": "https://gateway.example.com/v1/webhooks/actions/approve?token=<hmac-signed>",
  "deny_url": "https://gateway.example.com/v1/webhooks/actions/deny?token=<hmac-signed>"
}

Note on the wire format:

  • The id field at the top level is the delivery UUID (one per webhook delivery attempt), distinct from data.event_id (one per audit event — the event id is stable across retries of the same delivery).
  • tenant_id lives at the top level, not inside data.
  • approve_url / deny_url are at the top level (not nested under an actions object). They are only included when the event type is MCP_APPROVAL_REQUESTED and dvara.llm-gateway.webhooks.approval-base-url is configured.
  • matched_rules is serialized as a string in Java's List.toString() form ("[tool:..., server:...]"), not as a JSON array. Receivers that want to parse the rule list need to strip the brackets and split on ", " (a comma followed by a space).
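
A receiver could recover the rule list from the matched_rules string along these lines. The function name is hypothetical, and the sketch assumes no rule value itself contains a comma followed by a space, which would make the split ambiguous.

```python
def parse_matched_rules(raw: str) -> list[tuple[str, str]]:
    """Parse "[tool:a, server:b]" into [("tool", "a"), ("server", "b")]."""
    inner = raw.strip()
    # Strip the surrounding brackets produced by List.toString().
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    if not inner:
        return []
    rules = []
    for item in inner.split(", "):
        # Split on the first colon only, so values may contain colons.
        kind, _, value = item.partition(":")
        rules.append((kind, value))
    return rules


print(parse_matched_rules("[tool:database_*, server:production-db]"))
```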

Error Responses

| Scenario | HTTP Status | Error Code |
| --- | --- | --- |
| Approval denied | 403 | mcp_approval_denied |
| Approval timed out | 408 | mcp_approval_timeout |
| Max pending approvals reached | 403 (deny) | |

Audit Events

| Event Type | When |
| --- | --- |
| MCP_APPROVAL_REQUESTED | Tool call blocked pending approval |
| MCP_APPROVAL_GRANTED | Approval received |
| MCP_APPROVAL_DENIED | Denial received |
| MCP_APPROVAL_TIMEOUT | Timeout expired |

Prometheus Metrics

| Metric | Type | Labels |
| --- | --- | --- |
| mcp_approval_requests_total | Counter | tenant, server_id, tool_name |
| mcp_approval_granted_total | Counter | tenant |
| mcp_approval_denied_total | Counter | tenant |
| mcp_approval_timeout_total | Counter | tenant |

DVARA Flightdeck

Tool Calls Page (/mcp/tool-calls)

The tool calls page provides real-time visibility into all MCP tool call activity:

  • Filters: tenant, server, tool name
  • Table: timestamp, server, tool, tenant, session, HTTP status, latency, PII flags
  • Click to expand: full tool call detail (trace ID, operation, policy decision, etc.)
  • Auto-refresh: live polling every 5 seconds

Sessions Page (/mcp/sessions)

The sessions page tracks all agent sessions:

  • Filters: tenant, active-only toggle
  • Table: session ID, tenant, status, tool calls, errors, servers, latency, last seen
  • Detail view: session info + tool call timeline (3-second live polling)
  • Kill button: terminate session with confirmation dialog

Tool calls, sessions, and the approval queue are accessible from the Agents section of the DVARA Flightdeck sidebar.


RBAC

EndpointMethodownerpolicy-adminbilling-admindeveloperviewer
/v1/admin/mcp/tool-calls, /summaryGETYYYY
/v1/admin/sessions, /{id}, /{id}/timelineGETYYYY
/v1/admin/sessions/{id}/killPOSTYY

Capacity and concurrency

The agentic governance layer runs at the full throughput of the DVARA MCP Proxy. Session tracking, loop detection, and approval-gate bookkeeping add negligible overhead to the request path. Approval gates pause the request while waiting for a human decision, so a pending approval consumes no CPU or memory beyond what the request itself holds.

| Limit | Default | Behavior when full |
| --- | --- | --- |
| Active sessions | 10000 | Oldest inactive sessions are evicted |
| Loop detection history per session | 100 entries | Circular buffer; oldest entries are dropped |
| Pending approvals | 1000 | Excess requests are auto-denied |
| Tool call records | Unbounded (PostgreSQL) | Retained until you archive or delete them |

All limits are configurable through the properties reference. Tool call records are persisted durably to PostgreSQL, so they survive restarts and can feed compliance reports months after the fact.

Enterprise-only

Agentic governance is an enterprise feature. Without an enterprise license, the session tracker, loop detector, and approval gate are all no-ops and tool call records are not persisted. With a license, the full governance layer activates automatically and the DVARA Flightdeck exposes the sessions, tool calls, approval queue, and analytics pages.


5. Multi-Tenancy Isolation

The MCP Proxy enforces strict tenant isolation at multiple layers:

  1. Tenant context — Any MCP request without a tenant id is rejected with 403 TENANT_REQUIRED (the McpTenantContextFilter builds the response body directly with the uppercase code, bypassing the lowercase+mcp_-prefix transformation other MCP error codes get). The tenant id is also placed in the structured-logging context for every downstream log line.
  2. Registry isolation — Server lookups are scoped to (tenantId, serverId). Tenant A cannot discover, list, or call servers registered by tenant B, even if serverId strings collide.
  3. Tool-call record isolation — Tool-call queries return only records belonging to the calling tenant.
  4. Session isolation — Session queries return only sessions belonging to the calling tenant.
  5. Shared-server pattern — One physical MCP server can be registered under multiple tenants with separate (tenantId, serverId) entries, each with independent policy surfaces and no shared state.
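
Registry isolation and the shared-server pattern both follow from keying every entry on the (tenantId, serverId) pair. A toy model of that idea, with hypothetical class and method names and made-up URLs, might look like this:

```python
from typing import Optional


class ServerRegistry:
    """Hypothetical registry keyed on (tenant_id, server_id)."""

    def __init__(self):
        self._servers: dict[tuple[str, str], str] = {}

    def register(self, tenant_id: str, server_id: str, url: str) -> None:
        self._servers[(tenant_id, server_id)] = url

    def lookup(self, tenant_id: str, server_id: str) -> Optional[str]:
        # A lookup never crosses tenants, even when server_id strings collide.
        return self._servers.get((tenant_id, server_id))

    def list_for(self, tenant_id: str) -> list[str]:
        return sorted(sid for tid, sid in self._servers if tid == tenant_id)


registry = ServerRegistry()
registry.register("tenant-a", "code-search", "https://a.example.internal/mcp")
registry.register("tenant-b", "code-search", "https://b.example.internal/mcp")
registry.register("tenant-b", "database", "https://db.example.internal/mcp")
print(registry.list_for("tenant-a"))
```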

6. Credential Hot-Swap

MCP server credentials can be rotated or invalidated at runtime without gateway downtime.

API Endpoints

# Rotate credentials (optional new credential reference)
curl -X POST http://localhost:8090/v1/admin/mcp/servers/{id}/credentials/rotate \
-H "Content-Type: application/json" \
-d '{"new_credential_ref": "vault://secret/mcp/new-path"}'

# Invalidate cached credentials immediately
curl -X POST http://localhost:8090/v1/admin/mcp/servers/{id}/credentials/invalidate

Rotation Response

{
  "success": true,
  "message": "Credential rotated successfully",
  "old_credential_ref": "vault://secret/mcp/old-path",
  "new_credential_ref": "vault://secret/mcp/new-path"
}

How It Works

  1. Rotate: Evicts the old credential from the gateway's secret cache, updates the server's credentialRef, validates the new credential is resolvable, and writes a CREDENTIAL_ROTATED audit event.
  2. Invalidate: Evicts the credential from cache immediately without setting a new one. The next request triggers a fresh fetch from the vault. Returns 204 No Content.

RBAC

Credential rotation and invalidation require the owner role.

| Endpoint | Method | owner | policy-admin | developer | viewer |
| --- | --- | --- | --- | --- | --- |
| /{id}/credentials/rotate | POST | Y | | | |
| /{id}/credentials/invalidate | POST | Y | | | |

7. Rich Rate Limit Errors

When rate limits are enforced on MCP tool calls, the 429 response includes actionable detail for intelligent retry/reroute decisions.

Response Format

The rate-limit response nests detail inside the error object. Unlike the other MCP error codes (which get lowercased + mcp_-prefixed by McpExceptionHandler), the rate-limit filter constructs the body directly, so the wire code is the raw uppercase form:

{
  "error": {
    "type": "rate_limit_error",
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded",
    "rate_limit": {
      "limited_resource": "github-api",
      "server_id": "github-api",
      "limit_type": "requests_per_minute",
      "retry_after_seconds": 47,
      "alternative_servers": ["github-api-secondary", "github-api-backup"]
    }
  }
}

alternative_servers only appears when at least one candidate was found; an empty list is omitted from the response rather than serialized as []. Agent orchestrators should read the retry hint from error.rate_limit.retry_after_seconds in the body — the MCP Proxy does not set Retry-After or X-RateLimit-* HTTP headers on rate-limit responses (the value is body-only).

Alternative Servers

alternative_servers lists other ACTIVE MCP servers for the same tenant that share matching tags with the rate-limited server. Up to 5 alternatives are returned. This enables agent orchestrators to automatically reroute to available alternatives instead of blind waiting.
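
An orchestrator might act on this body roughly as follows: reroute when an alternative is listed, otherwise wait for the advertised interval. Preferring the first alternative is an arbitrary choice made for this sketch, and the function name and return labels are hypothetical.

```python
def next_step(body: dict) -> tuple[str, object]:
    """Return ("reroute", server_id) or ("wait", seconds) for a rate-limit body."""
    detail = body.get("error", {}).get("rate_limit", {})
    # alternative_servers is omitted entirely when no candidate was found.
    alternatives = detail.get("alternative_servers") or []
    if alternatives:
        return ("reroute", alternatives[0])
    return ("wait", detail.get("retry_after_seconds"))


limited = {
    "error": {
        "type": "rate_limit_error",
        "code": "RATE_LIMIT_EXCEEDED",
        "message": "Rate limit exceeded",
        "rate_limit": {
            "limited_resource": "github-api",
            "server_id": "github-api",
            "limit_type": "requests_per_minute",
            "retry_after_seconds": 47,
            "alternative_servers": ["github-api-secondary", "github-api-backup"],
        },
    }
}
print(next_step(limited))
```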


8. Approval Gate Metrics

The approval gate records Prometheus metrics at each decision point:

| Metric | Type | Labels | When |
| --- | --- | --- | --- |
| mcp_approval_requests_total | Counter | tenant, server_id, tool_name | Approval requested |
| mcp_approval_granted_total | Counter | tenant | Approval granted |
| mcp_approval_denied_total | Counter | tenant | Approval denied |
| mcp_approval_timeout_total | Counter | tenant | Approval timed out |

All MCP Proxy metrics are exposed at /actuator/prometheus. The MCP Proxy follows the same authentication model as the LLM Gateway — the endpoint requires Authorization: Bearer $DVARA_ACTUATOR_METRICS_API_KEY (the same shared secret as the LLM Gateway's metrics scrape; set it once for the install and both apps validate against it). The MCP Proxy's /actuator/health and /actuator/health/{liveness,readiness} are anonymous and safe for k8s probes; its dangerous endpoints (/env, /heapdump, /threaddump, etc.) are excluded from the actuator registry and return 404 regardless of auth. See Observability → Health Endpoints for the full auth model.