OpenToken Docs

Models

Browse the models OpenToken routes to and learn the {provider}/{model} id format.

OpenToken exposes every model through a single OpenAI-compatible gateway. You select a model with the model field on a chat completion, using the {provider}/{model} id format — for example google/gemini-2.5-pro. See model ids for the full naming rules.

List models

GET /v1/models returns the registered models as an OpenAI-compatible list.

curl https://api.opentoken.kr/v1/models \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY"
from openai import OpenAI

client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

print(client.models.list())
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

console.log(await client.models.list());

Registered models

Model idProviderInputOutputCache readCache write
anthropic/claude-haiku-4-5Anthropic$1.00$5.00$0.10$1.25
anthropic/claude-opus-4-6Anthropic$5.00$25.00$0.50$6.25
anthropic/claude-opus-4-7Anthropic$5.00$25.00$0.50$6.25
anthropic/claude-sonnet-4-5Anthropic$3.00$15.00$0.30$3.75
anthropic/claude-sonnet-4-6Anthropic$3.00$15.00$0.30$3.75
google/gemini-2.5-proGoogle$1.25$10.00$0.125$1.25
google/gemini-3-flashGoogle$0.50$3.00$0.05$0.50
google/gemini-3.1-flash-liteGoogle$0.25$1.50$0.025$0.25
google/gemini-3.1-proGoogle$2.00$12.00$0.20$2.00

Prices are USD per 1M tokens. Cache read / write apply when a request carries a cache_control breakpoint; a blank cell means the model has no distinct cache rate. The authoritative live rates are always GET /v1/models.

Embeddings

Model idProviderInput
google/text-embedding-004Google$0.025

Embeddings are billed on input tokens only (no completion tokens) via POST /v1/embeddings.

All chat models support streaming, reasoning, and prompt caching. Tool calling is supported on Google models (translated to native function calling) but rejected by Anthropic models, which return a 400 unsupported_parameter. Passing an unregistered id returns a 400 invalid_request_error with code model_not_found — for example openai/gpt-4o, a model OpenToken does not currently route.

This table is generated from the gateway's routing catalog — the same source as GET /v1/models — so it stays in sync with what is actually routable. Every billed request also writes an immutable usage record and a USD credit-ledger debit.

Capabilities

  • Streaming — set stream: true to receive Server-Sent Events. See streaming.
  • Tools — pass tools and tool_choice on a chat completion (Google models only; Anthropic models reject tools with 400 unsupported_parameter).
  • Reasoning — control thinking with reasoning: { effort, max_tokens, exclude }. See reasoning.
  • Caching — add an OpenRouter-style cache_control breakpoint on the system message to cache a large prefix. See prompt caching.
  • Embeddings — the embeddings model is served on POST /v1/embeddings (input-only billing).

Available endpoints: POST /v1/chat/completions, POST /v1/embeddings, and GET /v1/models.

Next steps