Agentic AI Governance

Dvara Enterprise provides a comprehensive governance layer for multi-agent workflows — covering tool call visibility, session tracking, loop detection, and human-in-the-loop approval gates. These controls run in the MCP filter chain, are configurable per tenant, and produce forensic-grade audit events.

Requires: Enterprise license (signed JWT via GATEWAY_ENTERPRISE_LICENSE_KEY). See license-generator module for key generation.

How It Works

Agentic governance runs as a set of MCP filters in the proxy pipeline. Every agent tool call passes through detection, approval, and recording stages before reaching the upstream MCP server.

MCP Request Flow:
→ McpAuthFilter (100)
→ McpRateLimitFilter (200)
→ McpTenantContextFilter (300)
→ McpRegistryFilter (400)
→ McpBudgetFilter (450)
→ McpPolicyFilter (500)
→ McpLoopDetectionFilter (525) ← Loop detection
→ McpApprovalGateFilter (550) ← Approval gate
→ McpPiiArgsFilter (600)
→ McpInjectionFilter (650)
→ McpAuditPreFilter (700) ← Tool call recording
→ McpSessionRecordingFilter (750) ← Session tracking
→ McpRoutingFilter (800)
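
The ordering above can be pictured as a sorted chain in which any filter may short-circuit the request. The following is a minimal sketch only: the `Filter` and `run_chain` names and the dictionary-based request are hypothetical and do not reflect the gateway's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical model: a filter returns None to continue, or an HTTP status to stop.
@dataclass(frozen=True)
class Filter:
    name: str
    order: int
    handle: Callable[[dict], Optional[int]]


def run_chain(filters: list[Filter], request: dict) -> tuple[int, list[str]]:
    """Run filters in ascending order and stop at the first one that rejects."""
    visited = []
    for f in sorted(filters, key=lambda f: f.order):
        visited.append(f.name)
        status = f.handle(request)
        if status is not None:
            return status, visited
    return 200, visited


def loop_detection(request: dict) -> Optional[int]:
    # Stand-in for real detection: reject when the request is flagged as looping.
    return 429 if request.get("looping") else None


# Filters are registered out of order on purpose; run_chain sorts them.
chain = [
    Filter("McpRoutingFilter", 800, lambda r: None),
    Filter("McpApprovalGateFilter", 550, lambda r: None),
    Filter("McpLoopDetectionFilter", 525, loop_detection),
    Filter("McpAuthFilter", 100, lambda r: None),
]

status, visited = run_chain(chain, {"looping": True})
print(status, visited)
```

Because loop detection (525) runs before the approval gate (550), a request rejected for looping never reaches the approval stage in this model.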

1. MCP Tool Call Visibility (E8-S1)

Every MCP tool invocation is logged as a McpToolCallRecord with full context: actor, tool name, server, latency, HTTP status, PII flags, and policy decision. Records are queryable via admin API and visible in the admin UI.

Admin API

# List tool calls (filterable)
curl "http://localhost:8080/admin/v1/mcp/tool-calls?tenant_id=acme"

# Filter by server or tool
curl "http://localhost:8080/admin/v1/mcp/tool-calls?server_id=code-search"
curl "http://localhost:8080/admin/v1/mcp/tool-calls?tool_name=search"

# Aggregated summary by server + tool
curl "http://localhost:8080/admin/v1/mcp/tool-calls/summary?tenant_id=acme"

Tool Call Record Fields

| Field | Description |
| --- | --- |
| id | Unique record identifier |
| tenant_id | Tenant that owns the request |
| session_id | Agent session identifier |
| trace_id | Distributed trace identifier |
| server_id | MCP server that handled the call |
| tool_name | Tool that was invoked |
| operation | MCP operation (e.g. tools/call) |
| user_id | Authenticated user (if available) |
| policy_decision | Policy engine result (ALLOW / DENY) |
| http_status | Upstream HTTP response status |
| latency_ms | End-to-end latency in milliseconds |
| response_bytes | Response payload size in bytes |
| is_error | Whether the call resulted in an error |
| error_code | Error code (if error) |
| pii_in_args | PII detected in request arguments |
| pii_in_response | PII detected in response |
| timestamp | When the call occurred |

Summary Response

The /summary endpoint aggregates tool calls by server and tool name:

{
  "data": [
    {
      "server_id": "code-search",
      "tool_name": "search",
      "total_calls": 142,
      "error_count": 3,
      "avg_latency_ms": 45.2
    }
  ]
}
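
To make the aggregation concrete, here is a rough sketch of how records could be grouped into that response shape. The `summarize` function is hypothetical, and rounding the average to one decimal place is an assumption based on the sample response rather than documented behavior.

```python
from collections import defaultdict


def summarize(records):
    """Group tool call records by (server_id, tool_name) and aggregate them."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["server_id"], r["tool_name"])].append(r)

    data = []
    for (server_id, tool_name), calls in sorted(groups.items()):
        data.append({
            "server_id": server_id,
            "tool_name": tool_name,
            "total_calls": len(calls),
            "error_count": sum(1 for c in calls if c["is_error"]),
            # Assumption: the average is rounded to one decimal place.
            "avg_latency_ms": round(sum(c["latency_ms"] for c in calls) / len(calls), 1),
        })
    return {"data": data}


# Records use a subset of the fields listed in the record table.
records = [
    {"server_id": "code-search", "tool_name": "search", "latency_ms": 40, "is_error": False},
    {"server_id": "code-search", "tool_name": "search", "latency_ms": 50, "is_error": True},
    {"server_id": "database", "tool_name": "query", "latency_ms": 120, "is_error": False},
]
summary = summarize(records)
print(summary)
```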

Prometheus Metrics

| Metric | Type | Labels |
| --- | --- | --- |
| mcp_tool_calls_total | Counter | tenant, server_id, tool_name, status |
| mcp_tool_call_latency_seconds | Histogram | server_id, tool_name |

2. Multi-Agent Chain Tracing (E8-S2)

Sessions track the complete lifecycle of an agent's interaction — from first tool call to last. Each session aggregates tool call counts, error counts, latency, and the set of distinct servers and tools used.

How Sessions Work

  1. The McpSessionRecordingFilter (order 750) runs after every successful tool call
  2. It creates or updates a McpSession keyed by the X-Session-Id header
  3. Sessions are marked "active" based on a configurable TTL (default: 60 minutes since last activity)
  4. Killed sessions are immediately blocked — future tool calls return 403
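
The lifecycle described in these steps can be sketched as follows. This is an illustrative model, not the gateway's implementation: the `Session` and `SessionTracker` classes are hypothetical, and time is passed in explicitly as `now` (in seconds) so the behavior is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    tenant_id: str
    first_seen: float
    last_seen: float
    tool_call_count: int = 0
    error_count: int = 0
    killed: bool = False
    distinct_tools: set = field(default_factory=set)


class SessionTracker:
    def __init__(self, ttl_minutes: int = 60):
        self.ttl_seconds = ttl_minutes * 60
        self.sessions: dict[str, Session] = {}

    def record(self, session_id, tenant_id, tool_name, is_error, now):
        """Create the session on first sight, then update its counters."""
        s = self.sessions.get(session_id)
        if s is None:
            s = Session(session_id, tenant_id, first_seen=now, last_seen=now)
            self.sessions[session_id] = s
        s.last_seen = now
        s.tool_call_count += 1
        s.error_count += int(is_error)
        s.distinct_tools.add(tool_name)

    def kill(self, session_id):
        self.sessions[session_id].killed = True

    def is_active(self, session_id, now):
        """Active means not killed and seen within the TTL window."""
        s = self.sessions[session_id]
        return not s.killed and (now - s.last_seen) <= self.ttl_seconds

    def check(self, session_id):
        """Return the HTTP status a new tool call for this session would get."""
        s = self.sessions.get(session_id)
        return 403 if s is not None and s.killed else 200


tracker = SessionTracker(ttl_minutes=60)
tracker.record("sess-abc123", "acme", "search", is_error=False, now=0.0)
tracker.record("sess-abc123", "acme", "query", is_error=True, now=30.0)
print(tracker.is_active("sess-abc123", now=90.0), tracker.check("sess-abc123"))
```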

Admin API

# List all sessions
curl http://localhost:8080/admin/v1/sessions

# Filter by tenant or active status
curl "http://localhost:8080/admin/v1/sessions?tenant_id=acme"
curl "http://localhost:8080/admin/v1/sessions?active=true"

# Get session detail
curl http://localhost:8080/admin/v1/sessions/sess-abc123

# Get session timeline (tool call history)
curl http://localhost:8080/admin/v1/sessions/sess-abc123/timeline

# Kill a session (blocks future tool calls)
curl -X POST http://localhost:8080/admin/v1/sessions/sess-abc123/kill

Session Response

{
  "session_id": "sess-abc123",
  "tenant_id": "acme",
  "first_seen": "2026-03-05T10:00:00Z",
  "last_seen": "2026-03-05T10:05:30Z",
  "tool_call_count": 15,
  "error_count": 1,
  "total_latency_ms": 2340,
  "distinct_servers": ["code-search", "database"],
  "distinct_tools": ["search", "query", "read_file"],
  "active": true
}

Kill Switch

When a session is killed via POST /admin/v1/sessions/{id}/kill:

  • An AGENT_SESSION_KILLED audit event is written
  • All subsequent MCP requests for that session ID receive 403 Forbidden with error code SESSION_KILLED
  • The session is marked inactive

Configuration

gateway:
  agentic:
    enabled: true
    session-ttl-minutes: 60 # Active session TTL
    session-max-capacity: 10000 # Max tracked sessions

Audit Events

| Event Type | When |
| --- | --- |
| AGENT_SESSION_KILLED | Session terminated via kill API |

3. Agent Loop Detection & Kill Switch (E8-S3)

The loop detector monitors agent behavior per session and triggers when it detects repetitive, cyclical, or excessive tool call patterns. This prevents runaway agents from consuming unlimited tokens or causing unintended side effects.

Detection Patterns

| Pattern | Description | Default Threshold |
| --- | --- | --- |
| Repetition | Same tool called N times consecutively | 5 consecutive calls |
| Cycle | A→B→A→B repeating pattern detected | Pattern length 2–4, repeated 3+ times |
| Rate | Exceeds maximum calls per minute in session | 60 calls/minute |

How It Works

  1. McpLoopDetectionFilter (order 525) evaluates every MCP request
  2. Per-session history is maintained in a circular buffer (default: 100 entries)
  3. Three detection algorithms run on each call:
    • Repetition: counts consecutive identical tool keys (serverId::toolName)
    • Cycle: scans for repeating patterns of length 2–4 in the call history
    • Rate: counts calls within a 60-second sliding window
  4. On detection: audit event + webhook + optional auto-kill + return 429 Too Many Requests
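
The three checks can be sketched roughly as below. This is a simplified model under stated assumptions, not the gateway's actual detector: the `LoopDetector` class, the loop type strings (`REPETITION`, `CYCLE`, `RATE`), and the explicit `now` argument (seconds) are all hypothetical.

```python
from collections import deque


class LoopDetector:
    def __init__(self, repetition_threshold=5, cycle_max_length=4,
                 cycle_repetitions=3, max_calls_per_minute=60, history_size=100):
        self.repetition_threshold = repetition_threshold
        self.cycle_max_length = cycle_max_length
        self.cycle_repetitions = cycle_repetitions
        self.max_calls_per_minute = max_calls_per_minute
        # Bounded buffer of (timestamp, tool_key); oldest entries drop off.
        self.history = deque(maxlen=history_size)

    def record(self, server_id, tool_name, now):
        """Record one call and return the detected loop type, or None."""
        self.history.append((now, f"{server_id}::{tool_name}"))
        keys = [key for _, key in self.history]
        if self._repetition(keys):
            return "REPETITION"
        if self._cycle(keys):
            return "CYCLE"
        if self._rate(now):
            return "RATE"
        return None

    def _repetition(self, keys):
        # The last N tool keys are all identical.
        n = self.repetition_threshold
        return len(keys) >= n and len(set(keys[-n:])) == 1

    def _cycle(self, keys):
        # A pattern of length 2..max repeats back-to-back at the end of history.
        for length in range(2, self.cycle_max_length + 1):
            span = length * self.cycle_repetitions
            if len(keys) < span:
                continue
            tail = keys[-span:]
            pattern = tail[:length]
            if len(set(pattern)) > 1 and tail == pattern * self.cycle_repetitions:
                return True
        return False

    def _rate(self, now):
        # Count calls inside the trailing 60-second window.
        recent = sum(1 for t, _ in self.history if now - t < 60)
        return recent > self.max_calls_per_minute


detector = LoopDetector()
results = [detector.record("code-search", "search", now=i) for i in range(5)]
print(results)
```

Note that in this sketch the rate check can only fire when `history_size` exceeds `max_calls_per_minute`, since the buffer bounds how many calls are remembered.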

Configuration

gateway:
  agentic:
    loop-detection:
      enabled: true
      repetition-threshold: 5 # Consecutive same-tool threshold
      cycle-max-length: 4 # Max cycle pattern length to check
      cycle-repetitions: 3 # Required repetitions to trigger
      max-calls-per-minute: 60 # Rate limit per session
      auto-kill: false # Auto-kill session on detection
      history-size: 100 # Per-session history buffer

Per-Tenant Configuration

Override global settings via tenant metadata:

| Metadata Key | Type | Description |
| --- | --- | --- |
| agentic.loop-detection.enabled | boolean | Enable/disable for tenant |
| agentic.loop-detection.repetition-threshold | int | Override repetition threshold |
| agentic.loop-detection.max-calls-per-minute | int | Override rate limit |
| agentic.loop-detection.auto-kill | boolean | Override auto-kill |
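
One way to picture how these overrides combine with the global settings is the sketch below. The `effective_config` helper is hypothetical, and it assumes metadata values may arrive as strings and that keys absent from the tenant metadata fall back to the global value.

```python
# Global defaults taken from the configuration shown earlier in this section.
GLOBAL_DEFAULTS = {
    "enabled": True,
    "repetition-threshold": 5,
    "max-calls-per-minute": 60,
    "auto-kill": False,
}
PREFIX = "agentic.loop-detection."


def _parse(value, default):
    """Coerce a metadata value to the type of the matching global default."""
    if isinstance(default, bool):
        return str(value).strip().lower() == "true"
    return int(value)


def effective_config(tenant_metadata):
    """Start from the global settings and apply any tenant overrides."""
    config = dict(GLOBAL_DEFAULTS)
    for key, default in GLOBAL_DEFAULTS.items():
        raw = tenant_metadata.get(PREFIX + key)
        if raw is not None:
            config[key] = _parse(raw, default)
    return config


acme = {
    "agentic.loop-detection.repetition-threshold": "10",
    "agentic.loop-detection.auto-kill": True,
}
print(effective_config(acme))
```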

Error Response

When a loop is detected:

{
  "error": {
    "type": "loop_detection_error",
    "code": "AGENT_LOOP_DETECTED",
    "message": "Tool 'code-search::search' called 5 times consecutively"
  }
}

HTTP Status: 429 Too Many Requests

Auto-Kill

When auto-kill: true is configured, the loop detector automatically kills the offending session upon detection. This means:

  • The current request returns 429
  • All future requests for that session return 403
  • AGENT_SESSION_KILLED audit event is written

Audit Events

| Event Type | Payload Fields |
| --- | --- |
| AGENT_LOOP_DETECTED | loop_type, session_id, tool_name, server_id, message |

Prometheus Metrics

| Metric | Type | Labels |
| --- | --- | --- |
| mcp_agent_loop_detected_total | Counter | tenant, loop_type |
| mcp_agent_sessions_killed_total | Counter | tenant |

4. Human-in-the-Loop Approval Gates (E8-S4)

Approval gates allow you to require human sign-off before high-risk tool calls are executed. When a tool call matches an approval rule, the request blocks (safely on virtual threads) until a human approves or denies it, or until a timeout expires.

How It Works

  1. McpApprovalGateFilter (order 550) evaluates every MCP request against tenant-specific approval rules
  2. Rules match on tool name patterns (glob) and/or server IDs
  3. When matched:
    • MCP_APPROVAL_REQUESTED audit event is written
    • A webhook is dispatched with approve/deny URLs (HMAC-signed tokens)
    • The request blocks using CompletableFuture.get(timeout) — safe on virtual threads
  4. The webhook recipient (Slack bot, approval UI, etc.) calls the approve/deny URL
  5. The WebhookApprovalController validates the HMAC token and publishes a WebhookApprovalEvent
  6. The ApprovalEventListener bridges the event to ApprovalGate.recordDecision()
  7. The blocked request resumes with the decision

Approval Flow

Agent → MCP Proxy → McpApprovalGateFilter
  ├─ evaluate() → rules match → APPROVAL_REQUIRED
  ├─ audit(MCP_APPROVAL_REQUESTED)
  ├─ webhook dispatch (with approve/deny URLs)
  └─ awaitDecision(timeout)
       ├─ APPROVE → chain.doFilter() → 200
       ├─ DENY → 403 MCP_APPROVAL_DENIED
       └─ TIMEOUT → 408 MCP_APPROVAL_TIMEOUT
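
The blocking wait in this flow can be modelled with a future that a second thread completes. The sketch below uses Python's `concurrent.futures.Future` as an analogue of the `CompletableFuture` mentioned above. The `ApprovalGate` class and its method names are hypothetical, and the configurable timeout action is not modelled (a timeout always maps to 408 here).

```python
import threading
from concurrent.futures import Future, TimeoutError as FutureTimeout


class ApprovalGate:
    def __init__(self):
        self._pending: dict[str, Future] = {}
        self._lock = threading.Lock()

    def await_decision(self, event_id: str, timeout_seconds: float) -> tuple[int, str]:
        """Block until a decision is recorded or the timeout expires."""
        future: Future = Future()
        with self._lock:
            self._pending[event_id] = future
        try:
            decision = future.result(timeout=timeout_seconds)
        except FutureTimeout:
            return 408, "MCP_APPROVAL_TIMEOUT"
        finally:
            with self._lock:
                self._pending.pop(event_id, None)
        if decision == "APPROVE":
            return 200, "OK"
        return 403, "MCP_APPROVAL_DENIED"

    def record_decision(self, event_id: str, decision: str) -> bool:
        """Complete the pending request; returns False if none is waiting."""
        with self._lock:
            future = self._pending.pop(event_id, None)
        if future is None:
            return False
        future.set_result(decision)
        return True


gate = ApprovalGate()

# A timer thread stands in for the human approver calling the approve URL.
threading.Timer(0.05, gate.record_decision, args=("evt-1", "APPROVE")).start()
approved = gate.await_decision("evt-1", timeout_seconds=2)

# Nobody answers this one, so it times out.
timed_out = gate.await_decision("evt-2", timeout_seconds=0.05)
print(approved, timed_out)
```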

Configuration

gateway:
  agentic:
    approval:
      enabled: true
      timeout-seconds: 300 # Wait timeout (5 minutes)
      default-action: deny # Action on timeout: "deny" or "approve"
      max-pending-approvals: 1000 # Max concurrent pending approvals

Per-Tenant Configuration

Approval rules are configured via tenant metadata:

| Metadata Key | Type | Description |
| --- | --- | --- |
| approval.required-tools | string | Comma-separated glob patterns (e.g. "database_*,file_write") |
| approval.required-servers | string | Comma-separated server IDs |
| approval.timeout-seconds | int | Override global timeout |
| approval.default-action | string | Override timeout action ("deny" or "approve") |

Example: Require Approval for Database Writes

Set the following tenant metadata:

{
  "approval.required-tools": "db_write,db_delete,database_*",
  "approval.required-servers": "production-db",
  "approval.timeout-seconds": 600,
  "approval.default-action": "deny"
}

Any tool call matching db_write, db_delete, or database_* patterns, or targeting the production-db server, will require human approval.
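
As a rough illustration of this matching, the sketch below evaluates a call against the example metadata. The `matched_rules` helper is hypothetical, and it assumes shell-style glob semantics as provided by Python's `fnmatch`. The gateway's exact glob dialect may differ.

```python
from fnmatch import fnmatchcase


def _split(value):
    """Split a comma-separated metadata string into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def matched_rules(metadata, server_id, tool_name):
    """Return the rules a call matches; an empty list means no approval needed."""
    tools = _split(metadata.get("approval.required-tools", ""))
    servers = _split(metadata.get("approval.required-servers", ""))
    rules = [f"tool:{pattern}" for pattern in tools if fnmatchcase(tool_name, pattern)]
    rules += [f"server:{s}" for s in servers if s == server_id]
    return rules


metadata = {
    "approval.required-tools": "db_write,db_delete,database_*",
    "approval.required-servers": "production-db",
}

print(matched_rules(metadata, "production-db", "db_delete"))
print(matched_rules(metadata, "staging-db", "database_migrate"))
print(matched_rules(metadata, "staging-db", "search"))
```

A call needs approval when either list produces a match, which mirrors the "or" in the description above.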

Webhook Payload

When approval is required, a webhook is dispatched with approve/deny action URLs:

{
  "delivery_id": "d-uuid",
  "event_type": "MCP_APPROVAL_REQUESTED",
  "data": {
    "event_id": "evt-uuid",
    "server_id": "production-db",
    "tool_name": "db_delete",
    "tenant_id": "acme",
    "matched_rules": ["tool:db_delete", "server:production-db"]
  },
  "actions": {
    "approve_url": "https://gateway.example.com/v1/webhooks/actions/approve?token=<hmac-signed>",
    "deny_url": "https://gateway.example.com/v1/webhooks/actions/deny?token=<hmac-signed>"
  }
}
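
The token format is not specified here, so the following is only a generic sketch of how an HMAC-signed action token can be issued and validated. The payload fields, the `body.signature` layout, and the `sign_token` and `verify_token` helpers are all assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json


def sign_token(secret: bytes, event_id: str, action: str, expires_at: int) -> str:
    """Encode the payload and append an HMAC-SHA256 signature over it."""
    payload = json.dumps(
        {"event_id": event_id, "action": action, "exp": expires_at},
        sort_keys=True,
    ).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    signature = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_token(secret: bytes, token: str, now: int):
    """Return the payload if the token is authentic and unexpired, else None."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    return payload if payload["exp"] > now else None


SECRET = b"demo-secret"  # hypothetical shared secret
token = sign_token(SECRET, "evt-uuid", "approve", expires_at=1_700_000_300)
payload = verify_token(SECRET, token, now=1_700_000_000)
print(payload)
```

Binding the action into the signed payload means an approve token cannot be replayed against the deny endpoint, or the reverse, without failing validation.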

Error Responses

| Scenario | HTTP Status | Error Code |
| --- | --- | --- |
| Approval denied | 403 | MCP_APPROVAL_DENIED |
| Approval timed out | 408 | MCP_APPROVAL_TIMEOUT |
| Max pending approvals reached | 403 (deny) | |

Audit Events

| Event Type | When |
| --- | --- |
| MCP_APPROVAL_REQUESTED | Tool call blocked pending approval |
| MCP_APPROVAL_GRANTED | Approval received |
| MCP_APPROVAL_DENIED | Denial received |
| MCP_APPROVAL_TIMEOUT | Timeout expired |

Prometheus Metrics

| Metric | Type | Labels |
| --- | --- | --- |
| mcp_approval_requests_total | Counter | tenant, server_id, tool_name |
| mcp_approval_granted_total | Counter | tenant |
| mcp_approval_denied_total | Counter | tenant |
| mcp_approval_timeout_total | Counter | tenant |

Admin UI

Tool Calls Page (/mcp/tool-calls)

The tool calls page provides real-time visibility into all MCP tool call activity:

  • Filters: tenant, server, tool name
  • Table: timestamp, server, tool, tenant, session, HTTP status, latency, PII flags
  • Click to expand: full tool call detail (trace ID, operation, policy decision, etc.)
  • Auto-refresh: HTMX polling every 5 seconds

Sessions Page (/mcp/sessions)

The sessions page tracks all agent sessions:

  • Filters: tenant, active-only toggle
  • Table: session ID, tenant, status, tool calls, errors, servers, latency, last seen
  • Detail view: session info + tool call timeline (3-second HTMX polling)
  • Kill button: terminate session with confirmation dialog

Tool calls and sessions are accessible under the Agents dropdown in the navigation bar (visible when enterprise license is active).


RBAC

EndpointMethodorg-adminpolicy-adminbilling-admindeveloperviewer
/admin/v1/mcp/tool-calls, /summaryGETYYYY
/admin/v1/sessions, /{id}, /{id}/timelineGETYYYY
/admin/v1/sessions/{id}/killPOSTYY

Architecture

Thread Safety

  • Session tracking: ConcurrentHashMap with AtomicInteger/AtomicLong counters for lock-free updates
  • Loop detection: Per-session SessionHistory synchronized on access; ConcurrentHashMap for session isolation
  • Approval gates: CompletableFuture.get(timeout) parks the virtual thread, consuming minimal resources; safe with spring.threads.virtual.enabled=true

Capacity Management

  • Sessions: Configurable max capacity (default 10,000). When full, oldest inactive sessions are evicted.
  • Loop history: Fixed-size circular buffer (default 100) per session. Old entries are dropped.
  • Pending approvals: Configurable max (default 1,000). Excess requests are auto-denied.
  • Tool call records: In-memory repository with 10,000 record cap (FIFO eviction).

Enterprise Override Pattern

All governance interfaces follow the default/Enterprise pattern:

| Interface | Default | Licensed Override |
| --- | --- | --- |
| McpSessionTracker | NoOpMcpSessionTracker | InMemoryMcpSessionTracker |
| LoopDetector | NoOpLoopDetector | EnterpriseLoopDetector |
| ApprovalGate | NoOpApprovalGate | EnterpriseApprovalGate |
| McpToolCallRepository | InMemoryMcpToolCallRepository | None (same in-memory implementation is used) |

Default implementations return empty/false/NOT_REQUIRED — zero overhead when enterprise features are not licensed.


5. Multi-Tenancy Isolation (E13-S11)

The MCP proxy enforces strict tenant isolation at multiple layers:

  1. Tenant Context Filter: McpTenantContextFilter (order 300) rejects any request without a tenantId with 403 TENANT_REQUIRED. It also sets the SLF4J MDC tenant_id for structured logging, with try-finally cleanup.

  2. Registry Isolation: Server lookups are scoped to (tenantId, serverId). Tenant A cannot discover, list, or call servers registered by tenant B, even if serverId strings collide.

  3. Tool Call Record Isolation: McpToolCallRepository queries (findByTenantId()) return only records belonging to the specified tenant.

  4. Session Isolation: McpSessionTracker.getSessionsByTenantId() returns only sessions belonging to the queried tenant.

  5. Shared-Server Pattern: One physical MCP server can be registered under multiple tenants with separate (tenantId, serverId) entries, each with independent policy surfaces and no shared state.
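
The registry scoping in particular is easy to picture as a map keyed by the (tenantId, serverId) pair. The sketch below is illustrative only. The `TenantScopedRegistry` class is hypothetical, and it raises `PermissionError` where the gateway would return 403 TENANT_REQUIRED.

```python
class TenantScopedRegistry:
    def __init__(self):
        # Composite key: the same server_id under two tenants is two entries.
        self._servers = {}

    def register(self, tenant_id, server_id, url):
        self._servers[(tenant_id, server_id)] = {"url": url}

    def lookup(self, tenant_id, server_id):
        """Resolve a server strictly within the caller's tenant."""
        if not tenant_id:
            raise PermissionError("TENANT_REQUIRED")
        return self._servers.get((tenant_id, server_id))

    def list_servers(self, tenant_id):
        return sorted(sid for (tid, sid) in self._servers if tid == tenant_id)


registry = TenantScopedRegistry()
registry.register("tenant-a", "code-search", "https://a.example.com/mcp")
registry.register("tenant-b", "code-search", "https://b.example.com/mcp")
registry.register("tenant-b", "database", "https://b.example.com/db")

print(registry.lookup("tenant-a", "code-search"))
print(registry.list_servers("tenant-a"))
```

Because the tenant is part of the key rather than a filter applied afterwards, a colliding `server_id` can never resolve to another tenant's entry in this model.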

Integration Tests

The McpMultiTenantIsolationTest suite validates all isolation guarantees:

  • Tenant A cannot see tenant B's servers
  • Same serverId in different tenants resolves independently
  • Tool call records are isolated by tenant
  • Session tracker isolates by tenant
  • Tenant context filter rejects requests without tenant
  • Shared-server pattern works correctly

6. Credential Hot-Swap (E13-S9)

MCP server credentials can be rotated or invalidated at runtime without gateway downtime.

API Endpoints

# Rotate credentials (optional new credential reference)
curl -X POST http://localhost:8080/admin/v1/mcp/servers/{id}/credentials/rotate \
-H "Content-Type: application/json" \
-d '{"new_credential_ref": "vault://secret/mcp/new-path"}'

# Invalidate cached credentials immediately
curl -X POST http://localhost:8080/admin/v1/mcp/servers/{id}/credentials/invalidate

Rotation Response

{
  "success": true,
  "message": "Credential rotated successfully",
  "old_credential_ref": "vault://secret/mcp/old-path",
  "new_credential_ref": "vault://secret/mcp/new-path"
}

How It Works

  1. Rotate: Evicts the old credential from the SecretProvider cache via evictCached(), updates the server's credentialRef, validates the new credential is resolvable, and writes a CREDENTIAL_ROTATED audit event.
  2. Invalidate: Evicts the credential from cache immediately without setting a new one. The next request triggers a fresh fetch from the vault. Returns 204 No Content.

RBAC

Credential rotation and invalidation require the org-admin role.

| Endpoint | Method | org-admin | policy-admin | developer | viewer |
| --- | --- | --- | --- | --- | --- |
| /{id}/credentials/rotate | POST | Y | | | |
| /{id}/credentials/invalidate | POST | Y | | | |

7. Rich Rate Limit Errors (E13-S10)

When rate limits are enforced on MCP tool calls, the 429 response includes actionable detail for intelligent retry/reroute decisions.

Response Format

{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Rate limit exceeded"
  },
  "rate_limit": {
    "limited_resource": "mcp_server",
    "server_id": "github-api",
    "limit_type": "requests_per_minute",
    "limit": 60,
    "remaining": 0,
    "retry_after_seconds": 47,
    "alternative_servers": ["github-api-secondary", "github-api-backup"]
  }
}

Alternative Servers

alternative_servers lists other ACTIVE MCP servers for the same tenant that share matching tags with the rate-limited server. This enables agent orchestrators to automatically reroute to available alternatives instead of blind waiting.

The McpServerAlternativeFinder queries the McpServerRepository for matching servers (up to 5 alternatives).
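
A simplified version of that lookup might look like the sketch below. The `find_alternatives` function and the dictionary-based server records are hypothetical, and "matching tags" is interpreted here as sharing at least one tag, which is an assumption.

```python
def find_alternatives(servers, tenant_id, limited_server_id, max_results=5):
    """Return up to max_results ACTIVE same-tenant servers that share a tag."""
    limited = next(
        s for s in servers
        if s["tenant_id"] == tenant_id and s["server_id"] == limited_server_id
    )
    tags = set(limited["tags"])
    candidates = [
        s["server_id"]
        for s in servers
        if s["tenant_id"] == tenant_id
        and s["server_id"] != limited_server_id
        and s["status"] == "ACTIVE"
        and tags & set(s["tags"])  # assumption: one shared tag is enough
    ]
    return candidates[:max_results]


# "INACTIVE" is a placeholder for any non-ACTIVE status.
servers = [
    {"tenant_id": "acme", "server_id": "github-api", "status": "ACTIVE", "tags": ["github"]},
    {"tenant_id": "acme", "server_id": "github-api-secondary", "status": "ACTIVE", "tags": ["github", "scm"]},
    {"tenant_id": "acme", "server_id": "github-api-backup", "status": "ACTIVE", "tags": ["github"]},
    {"tenant_id": "acme", "server_id": "github-old", "status": "INACTIVE", "tags": ["github"]},
    {"tenant_id": "acme", "server_id": "jira-api", "status": "ACTIVE", "tags": ["tickets"]},
    {"tenant_id": "globex", "server_id": "github-mirror", "status": "ACTIVE", "tags": ["github"]},
]

print(find_alternatives(servers, "acme", "github-api"))
```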

Response Headers

Rate limit responses also include standard headers for backward compatibility:

| Header | Description |
| --- | --- |
| Retry-After | Seconds until the rate limit resets |
| X-RateLimit-Retry-After-Seconds | Same as Retry-After (for programmatic access) |
| X-RateLimit-Reset | ISO-8601 timestamp when the limit resets |
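
On the client side, an agent orchestrator could combine the response body and headers along these lines. The `plan_retry` function and its return values are hypothetical. The sketch prefers rerouting when alternatives are listed and otherwise waits, falling back to the `Retry-After` header and then to an assumed one-second default when the body carries no `rate_limit` detail.

```python
def plan_retry(status, body, headers):
    """Decide what to do after a tool call: proceed, reroute, or wait."""
    if status != 429:
        return ("proceed", None)

    info = body.get("rate_limit", {})
    alternatives = info.get("alternative_servers") or []
    if alternatives:
        # Prefer an available alternative over waiting.
        return ("reroute", alternatives[0])

    wait = info.get("retry_after_seconds")
    if wait is None:
        # Older-style response: rely on the standard header instead.
        wait = int(headers.get("Retry-After", "1"))
    return ("wait", wait)


body = {
    "error": {"type": "rate_limit_error", "code": "rate_limited"},
    "rate_limit": {
        "server_id": "github-api",
        "retry_after_seconds": 47,
        "alternative_servers": ["github-api-secondary", "github-api-backup"],
    },
}

print(plan_retry(429, body, {"Retry-After": "47"}))
```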

8. Approval Gate Metrics (E13-S7)

The approval gate records Prometheus metrics at each decision point via the McpApprovalMetricsListener callback interface:

| Metric | Type | Labels | When |
| --- | --- | --- | --- |
| mcp_approval_requests_total | Counter | tenant, server_id, tool_name | Approval requested |
| mcp_approval_granted_total | Counter | tenant | Approval granted |
| mcp_approval_denied_total | Counter | tenant | Approval denied |
| mcp_approval_timeout_total | Counter | tenant | Approval timed out |

These metrics are recorded by McpMetrics in mcp-proxy-server and exposed at /actuator/prometheus.