Agentic AI Governance
Dvara Enterprise provides a comprehensive governance layer for multi-agent workflows — covering tool call visibility, session tracking, loop detection, and human-in-the-loop approval gates. These controls run in the MCP filter chain, are configurable per tenant, and produce forensic-grade audit events.
Requires: Enterprise license (signed JWT via GATEWAY_ENTERPRISE_LICENSE_KEY). See license-generator module for key generation.
How It Works
Agentic governance runs as a set of MCP filters in the proxy pipeline. Every agent tool call passes through detection, approval, and recording stages before reaching the upstream MCP server.
MCP Request Flow:
→ McpAuthFilter (100)
→ McpRateLimitFilter (200)
→ McpTenantContextFilter (300)
→ McpRegistryFilter (400)
→ McpBudgetFilter (450)
→ McpPolicyFilter (500)
→ McpLoopDetectionFilter (525) ← Loop detection
→ McpApprovalGateFilter (550) ← Approval gate
→ McpPiiArgsFilter (600)
→ McpInjectionFilter (650)
→ McpAuditPreFilter (700) ← Tool call recording
→ McpSessionRecordingFilter (750) ← Session tracking
→ McpRoutingFilter (800)
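The ordering above can be illustrated with a minimal sketch. It assumes, as the 403 and 429 responses described later suggest, that a filter can short-circuit the chain by returning a response. The `run_chain`, `tracing`, and `require_tenant` helpers are hypothetical names for illustration, not part of the gateway.

```python
def run_chain(filters, request):
    """Run (order, name, handler) filters in ascending order.

    A handler returns None to continue, or a response dict to stop the chain.
    """
    for _order, _name, handler in sorted(filters, key=lambda f: f[0]):
        response = handler(request)
        if response is not None:
            return response
    return {"status": 200}  # stand-in for the upstream MCP server's reply


def require_tenant(request):
    # Hypothetical stand-in for McpTenantContextFilter (order 300).
    if not request.get("tenant_id"):
        return {"status": 403, "code": "TENANT_REQUIRED"}
    return None


visited = []


def tracing(name, handler=None):
    # Wrap a handler so that each invocation is recorded in `visited`.
    def wrapped(request):
        visited.append(name)
        return handler(request) if handler else None
    return wrapped


filters = [
    (525, "McpLoopDetectionFilter", tracing("McpLoopDetectionFilter")),
    (300, "McpTenantContextFilter", tracing("McpTenantContextFilter", require_tenant)),
    (100, "McpAuthFilter", tracing("McpAuthFilter")),
]

ok = run_chain(filters, {"tenant_id": "acme"})
rejected = run_chain(filters, {})  # stops at order 300, never reaches 525
```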
1. MCP Tool Call Visibility (E8-S1)
Every MCP tool invocation is logged as a McpToolCallRecord with full context: actor, tool name, server, latency, HTTP status, PII flags, and policy decision. Records are queryable via admin API and visible in the admin UI.
Admin API
# List tool calls (filterable)
curl "http://localhost:8080/admin/v1/mcp/tool-calls?tenant_id=acme"
# Filter by server or tool
curl "http://localhost:8080/admin/v1/mcp/tool-calls?server_id=code-search"
curl "http://localhost:8080/admin/v1/mcp/tool-calls?tool_name=search"
# Aggregated summary by server + tool
curl "http://localhost:8080/admin/v1/mcp/tool-calls/summary?tenant_id=acme"
Tool Call Record Fields
| Field | Description |
|---|---|
| id | Unique record identifier |
| tenant_id | Tenant that owns the request |
| session_id | Agent session identifier |
| trace_id | Distributed trace identifier |
| server_id | MCP server that handled the call |
| tool_name | Tool that was invoked |
| operation | MCP operation (e.g. tools/call) |
| user_id | Authenticated user (if available) |
| policy_decision | Policy engine result (ALLOW / DENY) |
| http_status | Upstream HTTP response status |
| latency_ms | End-to-end latency in milliseconds |
| response_bytes | Response payload size in bytes |
| is_error | Whether the call resulted in an error |
| error_code | Error code (if error) |
| pii_in_args | PII detected in request arguments |
| pii_in_response | PII detected in response |
| timestamp | When the call occurred |
Summary Response
The /summary endpoint aggregates tool calls by server and tool name:
{
  "data": [
    {
      "server_id": "code-search",
      "tool_name": "search",
      "total_calls": 142,
      "error_count": 3,
      "avg_latency_ms": 45.2
    }
  ]
}
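As a rough illustration of what this aggregation involves, the sketch below groups records by (server_id, tool_name) and derives the same three figures. The `summarize` function is a hypothetical name, and rounding the average to one decimal place is an assumption; the gateway's actual implementation is not shown in this document.

```python
from collections import defaultdict


def summarize(records):
    """Aggregate tool call records by (server_id, tool_name)."""
    groups = defaultdict(lambda: {"total_calls": 0, "error_count": 0, "latency_sum": 0})
    for r in records:
        g = groups[(r["server_id"], r["tool_name"])]
        g["total_calls"] += 1
        g["error_count"] += 1 if r["is_error"] else 0
        g["latency_sum"] += r["latency_ms"]
    return [
        {
            "server_id": server_id,
            "tool_name": tool_name,
            "total_calls": g["total_calls"],
            "error_count": g["error_count"],
            # Rounding to one decimal is an assumption for readability.
            "avg_latency_ms": round(g["latency_sum"] / g["total_calls"], 1),
        }
        for (server_id, tool_name), g in sorted(groups.items())
    ]


records = [
    {"server_id": "code-search", "tool_name": "search", "latency_ms": 40, "is_error": False},
    {"server_id": "code-search", "tool_name": "search", "latency_ms": 50, "is_error": True},
    {"server_id": "database", "tool_name": "query", "latency_ms": 120, "is_error": False},
]
summary = summarize(records)
```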
Prometheus Metrics
| Metric | Type | Labels |
|---|---|---|
| mcp_tool_calls_total | Counter | tenant, server_id, tool_name, status |
| mcp_tool_call_latency_seconds | Histogram | server_id, tool_name |
2. Multi-Agent Chain Tracing (E8-S2)
Sessions track the complete lifecycle of an agent's interaction — from first tool call to last. Each session aggregates tool call counts, error counts, latency, and the set of distinct servers and tools used.
How Sessions Work
- The McpSessionRecordingFilter (order 750) runs after every successful tool call
- It creates or updates a McpSession keyed by the X-Session-Id header
- Sessions are marked "active" based on a configurable TTL (default: 60 minutes since last activity)
- Killed sessions are immediately blocked; future tool calls return 403
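The session lifecycle above can be sketched as follows. This is an illustrative model only: the `Session` and `SessionTracker` classes and their method names are hypothetical, timestamps are passed in explicitly as seconds to keep the example deterministic, and concurrency control is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    tenant_id: str
    first_seen: float
    last_seen: float
    tool_call_count: int = 0
    error_count: int = 0
    total_latency_ms: int = 0
    distinct_servers: set = field(default_factory=set)
    distinct_tools: set = field(default_factory=set)
    killed: bool = False


class SessionTracker:
    def __init__(self, ttl_minutes=60):
        self.ttl_seconds = ttl_minutes * 60
        self.sessions = {}

    def check(self, session_id):
        """Return (status, error_code); killed sessions are rejected."""
        s = self.sessions.get(session_id)
        if s is not None and s.killed:
            return 403, "SESSION_KILLED"
        return 200, None

    def record(self, session_id, tenant_id, server_id, tool_name, latency_ms, is_error, now):
        """Create the session on first sight, then update its aggregates."""
        s = self.sessions.get(session_id)
        if s is None:
            s = Session(session_id, tenant_id, first_seen=now, last_seen=now)
            self.sessions[session_id] = s
        s.last_seen = now
        s.tool_call_count += 1
        s.error_count += int(is_error)
        s.total_latency_ms += latency_ms
        s.distinct_servers.add(server_id)
        s.distinct_tools.add(tool_name)
        return s

    def is_active(self, session_id, now):
        """Active means: known, not killed, and seen within the TTL."""
        s = self.sessions.get(session_id)
        return s is not None and not s.killed and now - s.last_seen <= self.ttl_seconds

    def kill(self, session_id):
        s = self.sessions.get(session_id)
        if s is not None:
            s.killed = True


tracker = SessionTracker(ttl_minutes=60)
tracker.record("sess-abc123", "acme", "code-search", "search", 40, False, now=0.0)
tracker.record("sess-abc123", "acme", "database", "query", 120, True, now=30.0)
```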
Admin API
# List all sessions
curl http://localhost:8080/admin/v1/sessions
# Filter by tenant or active status
curl "http://localhost:8080/admin/v1/sessions?tenant_id=acme"
curl "http://localhost:8080/admin/v1/sessions?active=true"
# Get session detail
curl http://localhost:8080/admin/v1/sessions/sess-abc123
# Get session timeline (tool call history)
curl http://localhost:8080/admin/v1/sessions/sess-abc123/timeline
# Kill a session (blocks future tool calls)
curl -X POST http://localhost:8080/admin/v1/sessions/sess-abc123/kill
Session Response
{
  "session_id": "sess-abc123",
  "tenant_id": "acme",
  "first_seen": "2026-03-05T10:00:00Z",
  "last_seen": "2026-03-05T10:05:30Z",
  "tool_call_count": 15,
  "error_count": 1,
  "total_latency_ms": 2340,
  "distinct_servers": ["code-search", "database"],
  "distinct_tools": ["search", "query", "read_file"],
  "active": true
}
Kill Switch
When a session is killed via POST /admin/v1/sessions/{id}/kill:
- An AGENT_SESSION_KILLED audit event is written
- All subsequent MCP requests for that session ID receive 403 Forbidden with error code SESSION_KILLED
- The session is marked inactive
Configuration
gateway:
  agentic:
    enabled: true
    session-ttl-minutes: 60       # Active session TTL
    session-max-capacity: 10000   # Max tracked sessions
Audit Events
| Event Type | When |
|---|---|
| AGENT_SESSION_KILLED | Session terminated via kill API |
3. Agent Loop Detection & Kill Switch (E8-S3)
The loop detector monitors agent behavior per session and triggers when it detects repetitive, cyclical, or excessive tool call patterns. This prevents runaway agents from consuming unlimited tokens or causing unintended side effects.
Detection Patterns
| Pattern | Description | Default Threshold |
|---|---|---|
| Repetition | Same tool called N times consecutively | 5 consecutive calls |
| Cycle | A→B→A→B repeating pattern detected | Pattern length 2–4, repeated 3+ times |
| Rate | Exceeds maximum calls per minute in session | 60 calls/minute |
How It Works
- McpLoopDetectionFilter (order 525) evaluates every MCP request
- Per-session history is maintained in a circular buffer (default: 100 entries)
- Three detection algorithms run on each call:
  - Repetition: counts consecutive identical tool keys (serverId::toolName)
  - Cycle: scans for repeating patterns of length 2–4 in the call history
  - Rate: counts calls within a 60-second sliding window
- On detection: audit event + webhook + optional auto-kill + return 429 Too Many Requests
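The three checks can be sketched for a single session as shown below. This is a simplified model under stated assumptions: the `LoopDetector` class, the order in which checks run, and the returned labels ("REPETITION", "CYCLE", "RATE") are illustrative and not confirmed by the source; timestamps are passed in as seconds.

```python
from collections import deque


class LoopDetector:
    """Per-session loop detection over a bounded call history."""

    def __init__(self, repetition_threshold=5, cycle_max_length=4,
                 cycle_repetitions=3, max_calls_per_minute=60, history_size=100):
        self.repetition_threshold = repetition_threshold
        self.cycle_max_length = cycle_max_length
        self.cycle_repetitions = cycle_repetitions
        self.max_calls_per_minute = max_calls_per_minute
        # deque(maxlen=...) behaves as a circular buffer: old entries drop off.
        self.history = deque(maxlen=history_size)

    def record(self, server_id, tool_name, now):
        """Record a call and return a loop label, or None if nothing is detected."""
        self.history.append((now, f"{server_id}::{tool_name}"))
        keys = [key for _, key in self.history]
        return self._repetition(keys) or self._cycle(keys) or self._rate(now)

    def _repetition(self, keys):
        n = self.repetition_threshold
        if len(keys) >= n and len(set(keys[-n:])) == 1:
            return "REPETITION"
        return None

    def _cycle(self, keys):
        for length in range(2, self.cycle_max_length + 1):
            span = length * self.cycle_repetitions
            if len(keys) < span:
                continue
            tail = keys[-span:]
            pattern = tail[:length]
            # Require at least two distinct tools so plain repetition is not a "cycle".
            if len(set(pattern)) > 1 and tail == pattern * self.cycle_repetitions:
                return "CYCLE"
        return None

    def _rate(self, now):
        # Only calls still held in the buffer are counted, so this sketch
        # assumes history_size is larger than max_calls_per_minute.
        recent = sum(1 for t, _ in self.history if now - t < 60)
        return "RATE" if recent > self.max_calls_per_minute else None


detector = LoopDetector()
results = [detector.record("code-search", "search", now=i * 10) for i in range(5)]
```

With the defaults, the fifth consecutive call to the same tool is the first one flagged.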
Configuration
gateway:
  agentic:
    loop-detection:
      enabled: true
      repetition-threshold: 5     # Consecutive same-tool threshold
      cycle-max-length: 4         # Max cycle pattern length to check
      cycle-repetitions: 3        # Required repetitions to trigger
      max-calls-per-minute: 60    # Rate limit per session
      auto-kill: false            # Auto-kill session on detection
      history-size: 100           # Per-session history buffer
Per-Tenant Configuration
Override global settings via tenant metadata:
| Metadata Key | Type | Description |
|---|---|---|
| agentic.loop-detection.enabled | boolean | Enable/disable for tenant |
| agentic.loop-detection.repetition-threshold | int | Override repetition threshold |
| agentic.loop-detection.max-calls-per-minute | int | Override rate limit |
| agentic.loop-detection.auto-kill | boolean | Override auto-kill |
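One way to picture the override behavior is a merge of tenant metadata over the global defaults. The sketch below assumes metadata values may arrive as strings and that unset keys fall back to the global value; `effective_config` and the parsing rules are hypothetical.

```python
GLOBAL_DEFAULTS = {
    "enabled": True,
    "repetition-threshold": 5,
    "max-calls-per-minute": 60,
    "auto-kill": False,
}
PREFIX = "agentic.loop-detection."


def _parse_bool(value):
    # Accepts real booleans as well as "true"/"false" strings (an assumption).
    return str(value).strip().lower() == "true"


PARSERS = {
    "enabled": _parse_bool,
    "repetition-threshold": int,
    "max-calls-per-minute": int,
    "auto-kill": _parse_bool,
}


def effective_config(tenant_metadata):
    """Overlay the four overridable keys from tenant metadata onto the defaults."""
    config = dict(GLOBAL_DEFAULTS)
    for name, parse in PARSERS.items():
        raw = tenant_metadata.get(PREFIX + name)
        if raw is not None:
            config[name] = parse(raw)
    return config


acme = effective_config({
    "agentic.loop-detection.repetition-threshold": "3",
    "agentic.loop-detection.auto-kill": "true",
})
```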
Error Response
When a loop is detected:
{
  "error": {
    "type": "loop_detection_error",
    "code": "AGENT_LOOP_DETECTED",
    "message": "Tool 'code-search::search' called 5 times consecutively"
  }
}
HTTP Status: 429 Too Many Requests
Auto-Kill
When auto-kill: true is configured, the loop detector automatically kills the offending session upon detection. This means:
- The current request returns 429
- All future requests for that session return 403
- An AGENT_SESSION_KILLED audit event is written
Audit Events
| Event Type | Payload Fields |
|---|---|
| AGENT_LOOP_DETECTED | loop_type, session_id, tool_name, server_id, message |
Prometheus Metrics
| Metric | Type | Labels |
|---|---|---|
| mcp_agent_loop_detected_total | Counter | tenant, loop_type |
| mcp_agent_sessions_killed_total | Counter | tenant |
4. Human-in-the-Loop Approval Gates (E8-S4)
Approval gates allow you to require human sign-off before high-risk tool calls are executed. When a tool call matches an approval rule, the request blocks (safely on virtual threads) until a human approves or denies it, or until a timeout expires.
How It Works
- McpApprovalGateFilter (order 550) evaluates every MCP request against tenant-specific approval rules
- Rules match on tool name patterns (glob) and/or server IDs
- When matched:
  - An MCP_APPROVAL_REQUESTED audit event is written
  - A webhook is dispatched with approve/deny URLs (HMAC-signed tokens)
  - The request blocks using CompletableFuture.get(timeout), which is safe on virtual threads
- The webhook recipient (Slack bot, approval UI, etc.) calls the approve/deny URL
- The WebhookApprovalController validates the HMAC token and publishes a WebhookApprovalEvent
- The ApprovalEventListener bridges the event to ApprovalGate.recordDecision()
- The blocked request resumes with the decision
Approval Flow
Agent → MCP Proxy → McpApprovalGateFilter
├─ evaluate() → rules match → APPROVAL_REQUIRED
├─ audit(MCP_APPROVAL_REQUESTED)
├─ webhook dispatch (with approve/deny URLs)
└─ awaitDecision(timeout)
├─ APPROVE → chain.doFilter() → 200
├─ DENY → 403 MCP_APPROVAL_DENIED
└─ TIMEOUT → 408 MCP_APPROVAL_TIMEOUT
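The blocking step in this flow can be modeled with a future that the waiting request parks on until a decision arrives. The sketch below uses Python's concurrent.futures.Future as a stand-in for the CompletableFuture mentioned above. The `ApprovalGate` class here is a hypothetical model, not the gateway's class, and it omits the webhook dispatch, audit events, and default-action handling.

```python
import threading
from concurrent.futures import Future, TimeoutError as FutureTimeout


class ApprovalGate:
    def __init__(self, timeout_seconds=300, max_pending=1000):
        self.timeout_seconds = timeout_seconds
        self.max_pending = max_pending
        self._pending = {}
        self._lock = threading.Lock()

    def await_decision(self, event_id):
        """Block until a decision is recorded; return (http_status, error_code)."""
        future = Future()
        with self._lock:
            if len(self._pending) >= self.max_pending:
                return 403, None  # too many pending approvals: auto-deny
            self._pending[event_id] = future
        try:
            approved = future.result(timeout=self.timeout_seconds)
        except FutureTimeout:
            return 408, "MCP_APPROVAL_TIMEOUT"
        finally:
            with self._lock:
                self._pending.pop(event_id, None)
        return (200, None) if approved else (403, "MCP_APPROVAL_DENIED")

    def record_decision(self, event_id, approved):
        """Called from the approve/deny callback; wakes the blocked request."""
        with self._lock:
            future = self._pending.get(event_id)
            if future is None or future.done():
                return False
            future.set_result(approved)
            return True


gate = ApprovalGate(timeout_seconds=2)
# Simulate a human approving shortly after the request starts waiting.
threading.Timer(0.1, gate.record_decision, args=("evt-1", True)).start()
status, code = gate.await_decision("evt-1")
```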
Configuration
gateway:
  agentic:
    approval:
      enabled: true
      timeout-seconds: 300          # Wait timeout (5 minutes)
      default-action: deny          # Action on timeout: "deny" or "approve"
      max-pending-approvals: 1000   # Max concurrent pending approvals
Per-Tenant Configuration
Approval rules are configured via tenant metadata:
| Metadata Key | Type | Description |
|---|---|---|
| approval.required-tools | string | Comma-separated glob patterns (e.g. "database_*,file_write") |
| approval.required-servers | string | Comma-separated server IDs |
| approval.timeout-seconds | int | Override global timeout |
| approval.default-action | string | Override timeout action ("deny" or "approve") |
Example: Require Approval for Database Writes
Set the following tenant metadata:
{
  "approval.required-tools": "db_write,db_delete,database_*",
  "approval.required-servers": "production-db",
  "approval.timeout-seconds": 600,
  "approval.default-action": "deny"
}
Any tool call matching db_write, db_delete, or database_* patterns, or targeting the production-db server, will require human approval.
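The matching logic for this example can be sketched with shell-style globbing. It assumes the gateway's glob semantics resemble Python's fnmatch and that server IDs are compared exactly; neither is confirmed by the source. The `tool:` and `server:` prefixes mirror the matched_rules labels used in the webhook payload, and `matched_rules` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def parse_list(value):
    """Split a comma-separated metadata value, dropping blanks."""
    return [item.strip() for item in value.split(",") if item.strip()]


def matched_rules(metadata, server_id, tool_name):
    """Return the approval rules a call matches; an empty list means no approval needed."""
    rules = []
    for pattern in parse_list(metadata.get("approval.required-tools", "")):
        if fnmatchcase(tool_name, pattern):
            rules.append(f"tool:{pattern}")
    for required in parse_list(metadata.get("approval.required-servers", "")):
        if server_id == required:
            rules.append(f"server:{required}")
    return rules


metadata = {
    "approval.required-tools": "db_write,db_delete,database_*",
    "approval.required-servers": "production-db",
}
rules = matched_rules(metadata, "production-db", "db_delete")
```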
Webhook Payload
When approval is required, a webhook is dispatched with approve/deny action URLs:
{
  "delivery_id": "d-uuid",
  "event_type": "MCP_APPROVAL_REQUESTED",
  "data": {
    "event_id": "evt-uuid",
    "server_id": "production-db",
    "tool_name": "db_delete",
    "tenant_id": "acme",
    "matched_rules": ["tool:db_delete", "server:production-db"]
  },
  "actions": {
    "approve_url": "https://gateway.example.com/v1/webhooks/actions/approve?token=<hmac-signed>",
    "deny_url": "https://gateway.example.com/v1/webhooks/actions/deny?token=<hmac-signed>"
  }
}
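The token format behind the approve and deny URLs is not specified in this document. As one plausible shape, the sketch below signs a small JSON payload with HMAC-SHA256 and verifies it with a constant-time comparison. The payload fields, the `payload.signature` layout, and both function names are assumptions for illustration only.

```python
import base64
import hashlib
import hmac
import json


def _b64encode(data):
    return base64.urlsafe_b64encode(data).decode().rstrip("=")


def _b64decode(part):
    # Restore the padding stripped by _b64encode before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def sign_token(secret, event_id, action, expires_at):
    """Build a URL-safe token of the form <payload>.<signature>."""
    payload = json.dumps(
        {"event_id": event_id, "action": action, "exp": expires_at},
        separators=(",", ":"), sort_keys=True,
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return _b64encode(payload) + "." + _b64encode(signature)


def verify_token(secret, token, now):
    """Return the claims if the token is authentic and unexpired, else None."""
    try:
        payload_part, signature_part = token.split(".")
        payload = _b64decode(payload_part)
        signature = _b64decode(signature_part)
    except ValueError:
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    if now >= claims["exp"]:
        return None
    return claims


token = sign_token(b"demo-secret", "evt-uuid", "approve", expires_at=1000)
claims = verify_token(b"demo-secret", token, now=500)
```

Binding the action ("approve" or "deny") into the signed payload means a deny token cannot be replayed against the approve endpoint, provided the endpoint checks it.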
Error Responses
| Scenario | HTTP Status | Error Code |
|---|---|---|
| Approval denied | 403 | MCP_APPROVAL_DENIED |
| Approval timed out | 408 | MCP_APPROVAL_TIMEOUT |
| Max pending approvals reached | 403 (deny) | — |
Audit Events
| Event Type | When |
|---|---|
| MCP_APPROVAL_REQUESTED | Tool call blocked pending approval |
| MCP_APPROVAL_GRANTED | Approval received |
| MCP_APPROVAL_DENIED | Denial received |
| MCP_APPROVAL_TIMEOUT | Timeout expired |
Prometheus Metrics
| Metric | Type | Labels |
|---|---|---|
| mcp_approval_requests_total | Counter | tenant, server_id, tool_name |
| mcp_approval_granted_total | Counter | tenant |
| mcp_approval_denied_total | Counter | tenant |
| mcp_approval_timeout_total | Counter | tenant |
Admin UI
Tool Calls Page (/mcp/tool-calls)
The tool calls page provides real-time visibility into all MCP tool call activity:
- Filters: tenant, server, tool name
- Table: timestamp, server, tool, tenant, session, HTTP status, latency, PII flags
- Click to expand: full tool call detail (trace ID, operation, policy decision, etc.)
- Auto-refresh: HTMX polling every 5 seconds
Sessions Page (/mcp/sessions)
The sessions page tracks all agent sessions:
- Filters: tenant, active-only toggle
- Table: session ID, tenant, status, tool calls, errors, servers, latency, last seen
- Detail view: session info + tool call timeline (3-second HTMX polling)
- Kill button: terminate session with confirmation dialog
Navigation
Tool calls and sessions are accessible under the Agents dropdown in the navigation bar (visible when enterprise license is active).
RBAC
| Endpoint | Method | org-admin | policy-admin | billing-admin | developer | viewer |
|---|---|---|---|---|---|---|
| /admin/v1/mcp/tool-calls, /summary | GET | Y | Y | Y | Y | |
| /admin/v1/sessions, /{id}, /{id}/timeline | GET | Y | Y | Y | Y | |
| /admin/v1/sessions/{id}/kill | POST | Y | Y | | | |
Architecture
Thread Safety
- Session tracking: ConcurrentHashMap with AtomicInteger/AtomicLong counters for lock-free updates
- Loop detection: Per-session SessionHistory synchronized on access; ConcurrentHashMap for session isolation
- Approval gates: CompletableFuture.get(timeout) parks the virtual thread, consuming minimal resources; safe with spring.threads.virtual.enabled=true
Capacity Management
- Sessions: Configurable max capacity (default 10,000). When full, oldest inactive sessions are evicted.
- Loop history: Fixed-size circular buffer (default 100) per session. Old entries are dropped.
- Pending approvals: Configurable max (default 1,000). Excess requests are auto-denied.
- Tool call records: In-memory repository with 10,000 record cap (FIFO eviction).
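The FIFO cap on tool call records can be pictured with a bounded deque, which discards the oldest entry whenever a new one would exceed the capacity. `InMemoryToolCallStore` is a hypothetical name for this sketch and does not reflect the gateway's actual repository code.

```python
from collections import deque


class InMemoryToolCallStore:
    """Bounded record store with first-in, first-out eviction."""

    def __init__(self, capacity=10_000):
        # A deque with maxlen silently drops the oldest item on overflow.
        self._records = deque(maxlen=capacity)

    def save(self, record):
        self._records.append(record)

    def all(self):
        return list(self._records)


store = InMemoryToolCallStore(capacity=3)
for i in range(5):
    store.save({"id": i})
kept_ids = [r["id"] for r in store.all()]
```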
Enterprise Override Pattern
All governance interfaces follow the default/Enterprise pattern:
| Interface | Default | Licensed Override |
|---|---|---|
| McpSessionTracker | NoOpMcpSessionTracker | InMemoryMcpSessionTracker |
| LoopDetector | NoOpLoopDetector | EnterpriseLoopDetector |
| ApprovalGate | NoOpApprovalGate | EnterpriseApprovalGate |
| McpToolCallRepository | InMemoryMcpToolCallRepository | None (same in-memory implementation is used) |
Default implementations return empty/false/NOT_REQUIRED — zero overhead when enterprise features are not licensed.
5. Multi-Tenancy Isolation (E13-S11)
The MCP proxy enforces strict tenant isolation at multiple layers:
- Tenant Context Filter: McpTenantContextFilter (order 300) rejects any request without a tenantId with 403 TENANT_REQUIRED. Sets SLF4J MDC tenant_id for structured logging with try-finally cleanup.
- Registry Isolation: Server lookups are scoped to (tenantId, serverId). Tenant A cannot discover, list, or call servers registered by tenant B, even if serverId strings collide.
- Tool Call Record Isolation: McpToolCallRepository queries (findByTenantId()) return only records belonging to the specified tenant.
- Session Isolation: McpSessionTracker.getSessionsByTenantId() returns only sessions belonging to the queried tenant.
- Shared-Server Pattern: One physical MCP server can be registered under multiple tenants with separate (tenantId, serverId) entries, each with independent policy surfaces and no shared state.
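Registry isolation and the shared-server pattern both follow from keying entries on the (tenantId, serverId) pair. The sketch below shows that idea with a plain dictionary; the `McpServerRegistry` class shown here, its methods, and the example URLs are hypothetical.

```python
class McpServerRegistry:
    """Server entries keyed by (tenant_id, server_id), never by server_id alone."""

    def __init__(self):
        self._servers = {}

    def register(self, tenant_id, server_id, url):
        self._servers[(tenant_id, server_id)] = {"url": url}

    def lookup(self, tenant_id, server_id):
        return self._servers.get((tenant_id, server_id))

    def list_for_tenant(self, tenant_id):
        return sorted(sid for (tid, sid) in self._servers if tid == tenant_id)


registry = McpServerRegistry()
# Same server_id in two tenants: the entries stay independent.
registry.register("tenant-a", "code-search", "https://a.internal.example/mcp")
registry.register("tenant-b", "code-search", "https://b.internal.example/mcp")
registry.register("tenant-b", "database", "https://db.internal.example/mcp")
```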
Integration Tests
The McpMultiTenantIsolationTest suite validates all isolation guarantees:
- Tenant A cannot see tenant B's servers
- Same serverId in different tenants resolves independently
- Tool call records are isolated by tenant
- Session tracker isolates by tenant
- Tenant context filter rejects requests without tenant
- Shared-server pattern works correctly
6. Credential Hot-Swap (E13-S9)
MCP server credentials can be rotated or invalidated at runtime without gateway downtime.
API Endpoints
# Rotate credentials (optional new credential reference)
curl -X POST http://localhost:8080/admin/v1/mcp/servers/{id}/credentials/rotate \
-H "Content-Type: application/json" \
-d '{"new_credential_ref": "vault://secret/mcp/new-path"}'
# Invalidate cached credentials immediately
curl -X POST http://localhost:8080/admin/v1/mcp/servers/{id}/credentials/invalidate
Rotation Response
{
  "success": true,
  "message": "Credential rotated successfully",
  "old_credential_ref": "vault://secret/mcp/old-path",
  "new_credential_ref": "vault://secret/mcp/new-path"
}
How It Works
- Rotate: Evicts the old credential from the SecretProvider cache via evictCached(), updates the server's credentialRef, validates the new credential is resolvable, and writes a CREDENTIAL_ROTATED audit event.
- Invalidate: Evicts the credential from cache immediately without setting a new one. The next request triggers a fresh fetch from the vault. Returns 204 No Content.
RBAC
Credential rotation and invalidation require the org-admin role.
| Endpoint | Method | org-admin | policy-admin | developer | viewer |
|---|---|---|---|---|---|
| /{id}/credentials/rotate | POST | Y | | | |
| /{id}/credentials/invalidate | POST | Y | | | |
7. Rich Rate Limit Errors (E13-S10)
When rate limits are enforced on MCP tool calls, the 429 response includes actionable detail for intelligent retry/reroute decisions.
Response Format
{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Rate limit exceeded"
  },
  "rate_limit": {
    "limited_resource": "mcp_server",
    "server_id": "github-api",
    "limit_type": "requests_per_minute",
    "limit": 60,
    "remaining": 0,
    "retry_after_seconds": 47,
    "alternative_servers": ["github-api-secondary", "github-api-backup"]
  }
}
Alternative Servers
alternative_servers lists other ACTIVE MCP servers for the same tenant that share matching tags with the rate-limited server. This lets agent orchestrators reroute automatically to an available alternative instead of waiting blindly.
The McpServerAlternativeFinder queries the McpServerRepository for matching servers (up to 5 alternatives).
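The selection rule can be sketched as a filter over the tenant's servers. The example assumes that "matching tags" means at least one tag in common and that results keep registry order; the source states neither. `find_alternatives` and the record shape are hypothetical.

```python
def find_alternatives(servers, tenant_id, limited_server_id, max_results=5):
    """Return IDs of other ACTIVE servers in the tenant that share a tag."""
    limited = next(
        (s for s in servers
         if s["tenant_id"] == tenant_id and s["server_id"] == limited_server_id),
        None,
    )
    if limited is None:
        return []
    tags = set(limited["tags"])
    candidates = [
        s["server_id"]
        for s in servers
        if s["tenant_id"] == tenant_id
        and s["server_id"] != limited_server_id
        and s["status"] == "ACTIVE"
        and tags & set(s["tags"])  # assumption: one shared tag is enough
    ]
    return candidates[:max_results]


servers = [
    {"tenant_id": "acme", "server_id": "github-api", "status": "ACTIVE", "tags": ["github"]},
    {"tenant_id": "acme", "server_id": "github-api-secondary", "status": "ACTIVE", "tags": ["github"]},
    {"tenant_id": "acme", "server_id": "github-api-backup", "status": "ACTIVE", "tags": ["github", "backup"]},
    {"tenant_id": "acme", "server_id": "github-api-old", "status": "DISABLED", "tags": ["github"]},
    {"tenant_id": "acme", "server_id": "database", "status": "ACTIVE", "tags": ["sql"]},
    {"tenant_id": "other", "server_id": "github-mirror", "status": "ACTIVE", "tags": ["github"]},
]
alternatives = find_alternatives(servers, "acme", "github-api")
```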
Response Headers
Rate limit responses also include standard headers for backward compatibility:
| Header | Description |
|---|---|
| Retry-After | Seconds until the rate limit resets |
| X-RateLimit-Retry-After-Seconds | Same as Retry-After (for programmatic access) |
| X-RateLimit-Reset | ISO-8601 timestamp when the limit resets |
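On the client side, an agent orchestrator might use the response body to choose between rerouting, waiting, and stopping. The sketch below is one possible policy, not a prescribed one: `next_action` is a hypothetical helper, and treating AGENT_LOOP_DETECTED as non-retryable is an assumption based on the loop detection section.

```python
def next_action(status, body):
    """Map a gateway response to ("proceed" | "reroute" | "wait" | "stop", detail)."""
    if status != 429:
        return ("proceed", None)
    if body.get("error", {}).get("code") == "AGENT_LOOP_DETECTED":
        # Loop detection also returns 429; retrying would repeat the loop.
        return ("stop", None)
    info = body.get("rate_limit", {})
    alternatives = info.get("alternative_servers") or []
    if alternatives:
        return ("reroute", alternatives[0])
    # Fall back to a 1-second wait if the field is missing (an assumption).
    return ("wait", info.get("retry_after_seconds", 1))


rate_limited = {
    "error": {"type": "rate_limit_error", "code": "rate_limited", "message": "Rate limit exceeded"},
    "rate_limit": {
        "server_id": "github-api",
        "retry_after_seconds": 47,
        "alternative_servers": ["github-api-secondary", "github-api-backup"],
    },
}
action = next_action(429, rate_limited)
```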
8. Approval Gate Metrics (E13-S7)
The approval gate now records Prometheus metrics at each decision point via the McpApprovalMetricsListener callback interface:
| Metric | Type | Labels | When |
|---|---|---|---|
| mcp_approval_requests_total | Counter | tenant, server_id, tool_name | Approval requested |
| mcp_approval_granted_total | Counter | tenant | Approval granted |
| mcp_approval_denied_total | Counter | tenant | Approval denied |
| mcp_approval_timeout_total | Counter | tenant | Approval timed out |
These metrics are recorded by McpMetrics in mcp-proxy-server and exposed at /actuator/prometheus.