
End-to-End Examples

Complete examples for every endpoint and common use cases.

Chat with OpenAI GPT-4o

export OPENAI_API_KEY=sk-...
./mvnw -pl gateway-server spring-boot:run &

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "Explain virtual threads in one sentence."}
    ],
    "max_tokens": 80
  }'

Chat with Anthropic Claude

export ANTHROPIC_API_KEY=sk-ant-...

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "What is 2 + 2?"}
    ]
  }'

Chat with Google Gemini

export GEMINI_API_KEY=AIza...

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.0-flash",
    "messages": [
      {"role": "user", "content": "What is the capital of Japan?"}
    ],
    "max_tokens": 256
  }'

Chat with AWS Bedrock

Any model hosted on Bedrock can be used. Prefix the Bedrock model ID with bedrock/:

export BEDROCK_ENABLED=true
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    "messages": [
      {"role": "user", "content": "What is 2 + 2?"}
    ]
  }'

Other Bedrock model examples:

bedrock/amazon.titan-text-express-v1
bedrock/meta.llama3-70b-instruct-v1:0
bedrock/mistral.mistral-7b-instruct-v0:2
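The provider prefix is what steers a request: everything before the first slash names the provider, the rest is the provider-native model ID. As an illustrative sketch of that convention (not the gateway's actual implementation, and the un-prefixed default is an assumption):

```python
def split_model(model: str) -> tuple[str, str]:
    """Split 'provider/native-id' into its two parts (illustrative only)."""
    if "/" in model:
        provider, _, native_id = model.partition("/")  # split on first '/' only
        return provider, native_id
    return "openai", model  # assumption: un-prefixed models route to OpenAI

print(split_model("bedrock/meta.llama3-70b-instruct-v1:0"))
# → ('bedrock', 'meta.llama3-70b-instruct-v1:0')
```

Note that only the first slash is significant, so Bedrock IDs containing colons and dots pass through untouched.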

Local Ollama (llama3.2)

# Start Ollama and pull a model first:
# ollama serve
# ollama pull llama3.2

OLLAMA_ENABLED=true ./mvnw -pl gateway-server spring-boot:run &

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ollama/llama3.2",
    "messages": [
      {"role": "user", "content": "Hello from the gateway!"}
    ]
  }'

Mock Provider (Testing / CI)

MOCK_PROVIDER_ENABLED=true ./mvnw -pl gateway-server spring-boot:run &

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mock/test-model",
    "messages": [
      {"role": "user", "content": "Hello from mock!"}
    ]
  }'

Generate an Embedding

curl -s -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "Dvara AI Gateway"
  }' | python3 -c "
import sys, json
d = json.load(sys.stdin)
vec = d['data'][0]['embedding']
print(f'Dimensions: {len(vec)}')
print(f'First 5 values: {vec[:5]}')
"

Streaming Chat (SSE)

curl -s -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Count to five slowly."}
    ],
    "stream": true
  }'
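The stream arrives as Server-Sent Events: each event is a `data: {json chunk}` line in the OpenAI streaming format, and the stream ends with `data: [DONE]`. As a sketch of how a client reassembles the text (the sample chunks below are hand-written, not real API output):

```python
import json

def assemble_sse(lines: list[str]) -> str:
    """Reassemble assistant text from OpenAI-style SSE 'data:' lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))  # role-only deltas have no content
    return "".join(parts)

# Hand-written sample stream, abbreviated:
stream = [
    'data: {"choices": [{"delta": {"content": "One, "}}]}',
    'data: {"choices": [{"delta": {"content": "two."}}]}',
    'data: [DONE]',
]
print(assemble_sse(stream))  # → One, two.
```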

Structured Output — JSON Object

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "List 3 programming languages as a JSON array."}
    ],
    "response_format": {"type": "json_object"}
  }'

Structured Output — JSON Schema

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Extract: John is 30 years old and lives in Paris."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "city": {"type": "string"}
          },
          "required": ["name", "age", "city"]
        },
        "strict": true
      }
    }
  }'
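Even with a schema, the structured answer arrives as a JSON string inside choices[0].message.content, so clients still run it through a JSON parser. A minimal sketch (the response body below is a hand-written stand-in, not real API output):

```python
import json

# Hand-written stand-in for a /v1/chat/completions response body:
response = {
    "choices": [
        {"message": {"content": '{"name": "John", "age": 30, "city": "Paris"}'}}
    ]
}

# The schema guarantees shape, but the content field is still a string:
person = json.loads(response["choices"][0]["message"]["content"])
print(person["name"], person["age"], person["city"])  # → John 30 Paris
```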

Structured Output with Anthropic

The same response_format works transparently — the gateway translates to Anthropic's tool-use API:

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "Extract: Jane is 25 and works as an engineer."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "occupation": {"type": "string"}
          },
          "required": ["name", "age", "occupation"]
        },
        "strict": true
      }
    }
  }'

Check for the X-Gateway-Strict-Downgraded: true header in the response — Anthropic does not natively enforce strict schemas.

Python — OpenAI SDK (Drop-In Replacement)

from openai import OpenAI

client = OpenAI(
    api_key="sk-my-app-key",
    base_url="http://localhost:8080/v1"
)

# Switch providers by changing only the model
for model in ["gpt-4o", "claude-sonnet-4-5", "gemini-2.0-flash"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello from Dvara!"}]
    )
    print(f"{model}: {response.choices[0].message.content}")

Node.js — OpenAI SDK (Drop-In Replacement)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-my-app-key",
  baseURL: "http://localhost:8080/v1",
});

// Switch providers by changing only the model
for (const model of ["gpt-4o", "claude-sonnet-4-5", "gemini-2.0-flash"]) {
  const response = await client.chat.completions.create({
    model,
    messages: [{ role: "user", content: "Hello from Dvara!" }],
  });
  console.log(`${model}: ${response.choices[0].message.content}`);
}

Java — RestClient

RestClient client = RestClient.create();

String response = client.post()
    .uri("http://localhost:8080/v1/chat/completions")
    .contentType(MediaType.APPLICATION_JSON)
    .body("""
        {
          "model": "gpt-4o",
          "messages": [
            {"role": "user", "content": "Hello from Java!"}
          ]
        }
        """)
    .retrieve()
    .body(String.class);

System.out.println(response);

Multi-Provider Failover Demo

Configure a round-robin route with a fallback provider, then simulate a failure:

# application.yml
gateway:
  routes:
    - id: failover-demo
      model-pattern: "gpt*"
      strategy: round-robin
      providers:
        - provider: openai
        - provider: mock   # fallback to mock if OpenAI fails
  providers:
    mock:
      enabled: true

# Start with both providers
OPENAI_API_KEY=sk-... MOCK_PROVIDER_ENABLED=true ./mvnw -pl gateway-server spring-boot:run &

# Normal request (goes to OpenAI)
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

# If OpenAI is down, the gateway automatically fails over to mock
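Conceptually the failover loop is simple: try each provider in the route's list and return the first success. An illustrative sketch of that logic with toy providers (not the gateway's actual implementation):

```python
def complete_with_failover(providers, request):
    """Try providers in order; return the first successful response."""
    last_error = None
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:  # a real gateway would catch narrower errors
            last_error = exc
    raise RuntimeError("all providers failed") from last_error

# Toy providers: the first always fails, the second answers.
def openai_down(req):
    raise ConnectionError("OpenAI unreachable")

def mock_provider(req):
    return {"provider": "mock", "content": "Hello!"}

print(complete_with_failover([openai_down, mock_provider], {"model": "gpt-4o"}))
# → {'provider': 'mock', 'content': 'Hello!'}
```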

Request Tracing

Pass a custom trace ID to correlate requests across your system:

curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Trace-ID: my-request-12345" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }' -i 2>/dev/null | grep X-Trace-ID
# → X-Trace-ID: my-request-12345