Models

Browse the models OpenToken routes to and learn the {provider}/{model} id format.

OpenToken exposes every model through a single OpenAI-compatible gateway. You select a model with the model field on a chat completion, using the {provider}/{model} id format — for example google/gemini-2.5-pro. See model ids for the full naming rules.

List models

GET /v1/models returns the registered models as an OpenAI-compatible list.

curl https://api.opentoken.kr/v1/models \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY"

from openai import OpenAI

client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

print(client.models.list())

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

console.log(await client.models.list());

Registered models

Model id	Provider	Input	Output	Cache read	Cache write
`anthropic/claude-haiku-4-5`	Anthropic	$1.00	$5.00	$0.10	$1.25
`anthropic/claude-opus-4-6`	Anthropic	$5.00	$25.00	$0.50	$6.25
`anthropic/claude-opus-4-7`	Anthropic	$5.00	$25.00	$0.50	$6.25
`anthropic/claude-sonnet-4-5`	Anthropic	$3.00	$15.00	$0.30	$3.75
`anthropic/claude-sonnet-4-6`	Anthropic	$3.00	$15.00	$0.30	$3.75
`google/gemini-2.5-pro`	Google	$1.25	$10.00	$0.125	$1.25
`google/gemini-3-flash`	Google	$0.50	$3.00	$0.05	$0.50
`google/gemini-3.1-flash-lite`	Google	$0.25	$1.50	$0.025	$0.25
`google/gemini-3.1-pro`	Google	$2.00	$12.00	$0.20	$2.00

Prices are USD per 1M tokens. Cache read / write apply when a request carries a cache_control breakpoint; a blank cell means the model has no distinct cache rate. The authoritative live rates are always GET /v1/models.

Embeddings

Model id	Provider	Input
`google/text-embedding-004`	Google	$0.025

Embeddings are billed on input tokens only (no completion tokens) via POST /v1/embeddings.

All chat models support streaming, reasoning, and prompt caching. Tool calling is supported on Google models (translated to native function calling) but rejected by Anthropic models, which return a 400 unsupported_parameter. Passing an unregistered id returns a 400 invalid_request_error with code model_not_found — for example openai/gpt-4o, a model OpenToken does not currently route.

This table is generated from the gateway's routing catalog — the same source as GET /v1/models — so it stays in sync with what is actually routable. Every billed request also writes an immutable usage record and a USD credit-ledger debit.

Capabilities

Streaming — set stream: true to receive Server-Sent Events. See streaming.
Tools — pass tools and tool_choice on a chat completion (Google models only; Anthropic models reject tools with 400 unsupported_parameter).
Reasoning — control thinking with reasoning: { effort, max_tokens, exclude }. See reasoning.
Caching — add an OpenRouter-style cache_control breakpoint on the system message to cache a large prefix. See prompt caching.
Embeddings — the embeddings model is served on POST /v1/embeddings (input-only billing).

Available endpoints: POST /v1/chat/completions, POST /v1/embeddings, and GET /v1/models.

List models

Registered models

Embeddings

Capabilities

Next steps

Model ids

Create chat completion

Reasoning

On this page