OpenToken Docs

OpenAI compatibility

OpenToken speaks the OpenAI Chat Completions wire format. If your code already talks to OpenAI, point the OpenAI SDK at OpenToken by changing two things, the base URL and the API key, and leave the rest of your call sites unchanged.

Swap the base URL and key

Set the base URL to https://api.opentoken.kr/v1 and pass an OpenToken key (these start with sk-optk-).

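# Python
import os

from openai import OpenAI

# Read the OpenToken key (sk-optk-...) from the environment rather than
# passing a literal placeholder string.
client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

resp = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)

// JavaScript (ES module, uses top-level await)
import OpenAI from "openai";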

const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "google/gemini-3-flash",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(resp.choices[0].message.content);
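
The same swap applies when you skip the SDK and use a plain HTTP client. The sketch below builds the request with Python's standard library only. It assumes OpenToken accepts the key as an Authorization: Bearer header, which is what the OpenAI SDK sends; build_chat_request and send are hypothetical helper names, not part of any OpenToken library.

```python
import json
import urllib.request

BASE_URL = "https://api.opentoken.kr/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: same Bearer scheme the OpenAI SDK uses.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> str:
    """Send the request and return the first choice's message content."""
    with urllib.request.urlopen(req) as http_resp:
        body = json.load(http_resp)
    return body["choices"][0]["message"]["content"]
```

Calling send(build_chat_request(key, "google/gemini-3-flash", "Hello")) with a real key performs the same call as the SDK examples above.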

What works

  • POST /v1/chat/completions with the OpenAI shape — system, user, assistant, and tool message roles.
  • Non-streaming responses (chat.completion) and streaming with stream: true, which returns Server-Sent Events terminated by data: [DONE].
  • Sampling parameters: temperature, top_p, top_k, max_tokens (alias max_completion_tokens), stop, presence_penalty, frequency_penalty, seed, and others. Support is per-model: each adapter forwards only the parameters its upstream accepts and silently drops the rest (for example, Claude Opus 4.7+ drops temperature, top_p, and top_k; Gemini 3.x drops presence_penalty, frequency_penalty, and seed). See /docs/guides/sampling-parameters.
  • Tool calling (tools) on Google (google/*) models, translated to the provider's native function calling. Anthropic models reject tools with an unsupported_parameter (400) error; use response_format for structured output there.
  • response_format (json_object or json_schema) across providers. Anthropic maps json_schema to constrained decoding; json_object is best-effort.
  • GET /v1/models for listing the registered models, and POST /v1/embeddings for the google/text-embedding-004 embeddings model.
  • The official OpenAI SDK in any language, plus any OpenAI-compatible HTTP client.
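
If you consume the streaming response without the SDK, the Server-Sent Events framing described above can be handled roughly as follows. This is a minimal sketch: it assumes each data: line carries an OpenAI-style chat.completion.chunk object with text in choices[].delta.content, and iter_stream_text is a hypothetical helper name. The sample lines are hand-written for illustration, not captured from OpenToken.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from SSE lines until the data: [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end of stream
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample stream (illustrative only).
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello"
```

With the OpenAI SDK you would not need this; passing stream=True to client.chat.completions.create returns an iterator of already-parsed chunks.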

What differs

  • Model ids are {provider}/{model}. Use the namespaced id (for example google/gemini-2.5-pro), not a bare OpenAI model name. See /docs/models.
  • Google and Anthropic models are registered today — for example google/gemini-2.5-pro, google/gemini-3-flash, google/gemini-3.1-pro, google/gemini-3.1-flash-lite, anthropic/claude-opus-4-7, anthropic/claude-sonnet-4-5, and anthropic/claude-haiku-4-5. OpenAI (openai/*) ids are not registered and return a model_not_found error. See /docs/models for the full list.
  • Embeddings are supported via POST /v1/embeddings for the google/text-embedding-004 model. Calling /v1/embeddings with a model whose provider does not support embeddings returns an embeddings_unsupported (400) error.
  • Some superset parameters are silently dropped per provider. OpenToken accepts the OpenRouter sampling superset, but each adapter forwards only what its upstream supports. For example, google/gemini-2.5-pro accepts frequency_penalty, presence_penalty, and seed, while the upstream for google/gemini-3.1-pro does not, so OpenToken drops them for that model rather than returning an error.
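
Because dropped parameters produce no error, it can help to flag them on the client before sending a request. The sketch below is a hypothetical client-side check, not an OpenToken feature. Its table is built only from the examples on this page and is deliberately incomplete; the prefix matching (for instance treating every google/gemini-3 id as "Gemini 3.x") is an assumption.

```python
# Hypothetical, incomplete table derived only from the examples on this page.
# Keys are model-id prefixes; values are parameters the adapter drops.
DROPPED_BY_PREFIX = {
    "anthropic/claude-opus-4-7": {"temperature", "top_p", "top_k"},
    "google/gemini-3": {"presence_penalty", "frequency_penalty", "seed"},
}


def params_likely_dropped(model: str, params: dict) -> set[str]:
    """Return the request parameters this page says are dropped for `model`."""
    dropped: set[str] = set()
    for prefix, names in DROPPED_BY_PREFIX.items():
        if model.startswith(prefix):
            dropped |= names & set(params)
    return dropped


request_params = {"seed": 1, "temperature": 0.2}
print(params_likely_dropped("google/gemini-3.1-pro", request_params))  # prints {'seed'}
print(params_likely_dropped("google/gemini-2.5-pro", request_params))  # prints set()
```

Treat /docs/guides/sampling-parameters as the source of truth; a table like this goes stale as models are added.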

Errors use the OpenAI-compatible envelope { "error": { "message", "type", "code" } }, so existing OpenAI error handling continues to work.
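
As an illustration of that envelope, here is a minimal sketch of parsing an error body by hand. ApiError and parse_error are hypothetical names. The sample body is hand-written: only the code value model_not_found comes from this page, while the message and type strings are made up for the example.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    message: str
    type: Optional[str]
    code: Optional[str]


def parse_error(body: str) -> Optional[ApiError]:
    """Parse an {"error": {...}} envelope; return None if the body is not one."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return None
    err = doc.get("error") if isinstance(doc, dict) else None
    if not isinstance(err, dict):
        return None
    return ApiError(
        message=str(err.get("message", "")),
        type=err.get("type"),
        code=err.get("code"),
    )


# Illustrative body; message and type are invented for this example.
sample_body = (
    '{"error": {"message": "Unknown model", '
    '"type": "invalid_request_error", "code": "model_not_found"}}'
)
err = parse_error(sample_body)
print(err.code)  # prints model_not_found
```

Branching on code (for example model_not_found, unsupported_parameter, or embeddings_unsupported) is usually more robust than matching on the message text.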

Next steps