Model routing
How OpenToken dispatches a request to a provider by its model id, and how pre-commit failover works.
OpenToken routes every request by its model id. The id uses the {provider}/{model} form, so the prefix selects which adapter handles the call and the suffix names the upstream model. There is no separate routing config — the model field is the route.
How dispatch works
When a request arrives, OpenToken parses the model id, looks up the registered adapter for the provider, and forwards the normalized request to that provider. Unknown ids never reach an upstream.
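import os

from openai import OpenAI

# Point the OpenAI SDK at the OpenToken gateway; the key is read from the environment.
client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

# The "google/" prefix selects the adapter; the suffix names the upstream model.
resp = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Route this to Gemini 3 Flash."}],
)
print(resp.choices[0].message.content)

import OpenAI from "openai";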
const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});
const resp = await client.chat.completions.create({
  model: "google/gemini-2.5-pro",
  messages: [{ role: "user", content: "Route this to Gemini 2.5 Pro." }],
});
console.log(resp.choices[0].message.content);

curl https://api.opentoken.kr/v1/chat/completions \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-pro",
    "messages": [{"role": "user", "content": "Route this to Gemini 3.1 Pro."}]
  }'

To target a different model, change the id. OpenToken currently routes to Google Gemini and Anthropic Claude models, for example google/gemini-2.5-pro, google/gemini-3-flash, google/gemini-3.1-pro, google/gemini-3.1-flash-lite, anthropic/claude-opus-4-7, anthropic/claude-sonnet-4-5, and anthropic/claude-haiku-4-5, plus the embeddings model google/text-embedding-004. Call GET /v1/models for the authoritative live list, or see the models page for details.
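As a rough illustration of reading the live list, the sketch below fetches GET /v1/models and extracts the ids. It assumes the response follows the OpenAI-style list shape, {"data": [{"id": ...}]}, which this page does not confirm; the helper names extract_model_ids and fetch_model_ids are hypothetical.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.opentoken.kr/v1"


def extract_model_ids(payload: dict) -> list[str]:
    """Return the sorted model ids from a models-list payload.

    Assumed shape (not confirmed by this page):
    {"data": [{"id": "google/gemini-3-flash"}, ...]}
    """
    return sorted(item["id"] for item in payload.get("data", []))


def fetch_model_ids() -> list[str]:
    """Call GET /v1/models with the API key from the environment."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {os.environ['OPENTOKEN_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return extract_model_ids(json.load(resp))
```

Checking an id against fetch_model_ids() before sending a request is one way to avoid a model_not_found response for a mistyped id.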
An unregistered id such as openai/gpt-4o does not route anywhere — it returns a model_not_found error with status 400. Only ids in the live routing catalog (see GET /v1/models) resolve to an adapter.
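The dispatch rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the gateway's actual code: the CATALOG set, the Route type, and the ModelNotFound exception are hypothetical names, and the catalog here is a small hard-coded sample rather than the live list.

```python
from dataclasses import dataclass


class ModelNotFound(Exception):
    """Hypothetical error mirroring the documented 400 model_not_found."""

    status = 400
    code = "model_not_found"


@dataclass(frozen=True)
class Route:
    provider: str  # prefix: selects the adapter
    model: str     # suffix: names the upstream model


# Hypothetical sample; the authoritative list comes from GET /v1/models.
CATALOG = {
    "google/gemini-2.5-pro",
    "google/gemini-3-flash",
    "anthropic/claude-sonnet-4-5",
}


def resolve(model_id: str) -> Route:
    """Split a {provider}/{model} id and check it against the catalog."""
    provider, sep, model = model_id.partition("/")
    # Malformed ids and ids outside the catalog are rejected before any upstream call.
    if not sep or not provider or not model or model_id not in CATALOG:
        raise ModelNotFound(model_id)
    return Route(provider, model)
```

With this sketch, resolve("google/gemini-3-flash") yields a Route for the google adapter, while resolve("openai/gpt-4o") raises ModelNotFound without contacting any provider.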
Failover
OpenToken can retry a request before any response bytes are committed to your connection. If an upstream returns a retryable error, the gateway transparently fails over and you still receive a single clean response.
- Pre-commit retry. Failover happens only before the first byte of the response is sent. For a streaming request this means before the first SSE chunk is emitted.
- Retryable conditions. Genuine transient upstream failures are eligible for failover: an upstream_error with status >= 500 or 429, a 504 upstream_timeout, or a missing_provider_key for the selected supplier.
- Terminal errors pass through. Errors that a retry cannot fix are returned as-is: 400 model_not_found for an unknown id, 400 invalid_request_error for a validation failure (including 402 with code insufficient_credit for an exhausted balance), 401 authentication_error, and 503 upstream_error with code no_supplier when supplier selection is exhausted. These are never retried.
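The retryable and terminal rules above can be expressed as a small classifier plus a retry loop. This is a minimal sketch, not the gateway's implementation: UpstreamError, is_retryable, and call_with_failover are hypothetical names, and because the text does not say whether names such as upstream_timeout are error types or codes in the real envelope, the sketch carries them in a single kind field.

```python
from typing import Callable, Optional, Sequence


class UpstreamError(Exception):
    """Hypothetical error carrying the fields the list above mentions."""

    def __init__(self, status: int, kind: str, code: Optional[str] = None):
        super().__init__(f"{status} {kind} {code or ''}".strip())
        self.status, self.kind, self.code = status, kind, code


def is_retryable(err: UpstreamError) -> bool:
    # no_supplier means selection is exhausted: terminal even though it is a 503.
    if err.code == "no_supplier":
        return False
    if err.kind == "upstream_error":
        return err.status >= 500 or err.status == 429
    if err.kind == "upstream_timeout":
        return err.status == 504
    return err.kind == "missing_provider_key"


def call_with_failover(suppliers: Sequence[Callable[[], str]]) -> str:
    """Try suppliers in order; nothing is returned until one succeeds."""
    last: Optional[UpstreamError] = None
    for attempt in suppliers:
        try:
            return attempt()  # first success becomes the single response
        except UpstreamError as err:
            if not is_retryable(err):
                raise  # terminal errors pass through as-is
            last = err
    # Every supplier failed with a retryable error: selection is exhausted.
    raise UpstreamError(503, "upstream_error", "no_supplier") from last
```

Each callable in suppliers stands in for one complete upstream attempt, which keeps the loop pre-commit: a later attempt only starts if the earlier one produced no output at all.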
Because failover is pre-commit, a streaming response that has already started cannot be silently re-routed. After a 200, an upstream failure surfaces over SSE as one more data: frame carrying the error envelope (followed by data: [DONE]) rather than a new attempt. See error handling for the full envelope and code list.
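On the client side, a stream reader therefore has to watch for an error frame after the 200. The sketch below parses raw SSE lines and raises when one arrives. It assumes the error envelope is a JSON object with a top-level "error" key, which this page does not spell out; StreamError and iter_chunks are hypothetical names.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Raised when an error envelope arrives after the stream has started."""


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed data: frames, raise on an error envelope, stop at [DONE]."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        frame = json.loads(payload)
        if "error" in frame:  # assumed envelope shape: {"error": {...}}
            raise StreamError(frame["error"])
        yield frame
```

Content received before the StreamError is still valid partial output; whether to keep it or retry the whole request is an application decision, since the gateway will not re-route a stream that has already started.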