OpenToken Docs

Request parameters

The full parameter superset accepted by POST /v1/chat/completions.

POST /v1/chat/completions accepts the OpenAI Chat Completions body plus the OpenRouter sampling superset. The fields below are the most commonly used. Additional OpenRouter/OpenAI fields (top_a, min_p, repetition_penalty, logit_bias, logprobs, top_logprobs, verbosity, user) are parsed and accepted, but dropped on the current Google and Anthropic models. See /docs/guides/sampling-parameters for the full set.

Each adapter forwards only the parameters its upstream supports and silently drops the rest. For example, Gemini 2.5 accepts frequency_penalty, presence_penalty, and seed, while Gemini 3 rejects them, so the adapter drops them before forwarding. The exception is tools/tool_choice: they are translated natively on Google models and rejected with a 400 (unsupported_parameter) on Anthropic models rather than silently dropped. See /docs/guides/structured-outputs.
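
The forwarding rules above can be modelled as a small filter. The following is a hypothetical sketch, not OpenToken's adapter code: the function name forwarded_params, the UnsupportedParameter class, and the "anthropic/" and "google/gemini-3" model-id prefixes are assumptions made for illustration. It encodes only the rules stated on this page and passes every other parameter through unchanged, which is also an assumption.

```python
# Hypothetical model of the drop/reject rules described above.
# None of these names come from OpenToken itself.


class UnsupportedParameter(ValueError):
    """Stands in for the 400 (unsupported_parameter) response."""


# Parsed but dropped on the current Google and Anthropic models.
DROPPED_EVERYWHERE = {
    "top_a", "min_p", "repetition_penalty", "logit_bias",
    "logprobs", "top_logprobs", "verbosity", "user",
}

# Accepted by Gemini 2.5, dropped for Gemini 3.
DROPPED_ON_GEMINI_3 = {"frequency_penalty", "presence_penalty", "seed"}


def forwarded_params(model: str, params: dict) -> dict:
    """Return the subset of params this sketch would forward upstream."""
    # Assumed prefix for Anthropic model ids: tools are rejected, not dropped.
    if model.startswith("anthropic/"):
        rejected = [k for k in ("tools", "tool_choice") if k in params]
        if rejected:
            raise UnsupportedParameter(
                "unsupported_parameter: " + ", ".join(rejected)
            )

    dropped = set(DROPPED_EVERYWHERE)
    # Assumed prefix covering the Gemini 3 family of model ids.
    if model.startswith("google/gemini-3"):
        dropped |= DROPPED_ON_GEMINI_3

    # Everything not explicitly dropped is passed through (assumption).
    return {k: v for k, v in params.items() if k not in dropped}
```

A practical consequence of the silent-drop behaviour is that a request carrying seed succeeds on both Gemini families but is only reproducible on the one that forwards it, so client code should not infer support from a 200 response.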

Parameters

Notes

  • reasoning and reasoning_effort map to the upstream thinking controls (thinkingLevel on Gemini 3; thinkingBudget, clamped to the supported range, on Gemini 2.5). The deprecated include_reasoning flag is still accepted. See /docs/guides/reasoning.
  • service_tier: "flex" is Vertex-only and GLOBAL-endpoint-only. It is silently dropped on regional Vertex and on the plain API-key path. See /docs/guides/service-tiers.
  • Prompt caching is driven by a cache_control breakpoint on the system message rather than a top-level parameter. See /docs/guides/prompt-caching.
  • Embeddings use a separate endpoint, POST /v1/embeddings, not this body. The chat parameters here apply only to POST /v1/chat/completions. See also GET /v1/models.
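
As an illustration of the caching and service-tier notes above, the sketch below builds a request body with a cache_control breakpoint on the system message and an optional service_tier. The helper name build_cached_request is hypothetical, and the {"type": "ephemeral"} content-part shape follows the convention used by Anthropic and OpenRouter; whether OpenToken expects exactly this shape is an assumption, so check /docs/guides/prompt-caching before relying on it.

```python
import json


def build_cached_request(model, system_text, user_text, flex=False):
    """Build a chat completions body with a cache breakpoint on the system message.

    Hypothetical helper: the cache_control shape below is assumed, not
    confirmed by this page.
    """
    body = {
        "model": model,
        "messages": [
            {
                "role": "system",
                # The breakpoint lives on the system message content part,
                # not on a top-level request parameter.
                "content": [
                    {
                        "type": "text",
                        "text": system_text,
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
            },
            {"role": "user", "content": user_text},
        ],
    }
    if flex:
        # Per the note above, this is silently dropped on regional Vertex
        # and on the plain API-key path.
        body["service_tier"] = "flex"
    return body


if __name__ == "__main__":
    example = build_cached_request(
        "google/gemini-2.5-pro",
        "You are a careful summarizer.",
        "Summarize the request body.",
        flex=True,
    )
    print(json.dumps(example, indent=2))
```

The resulting dictionary can be sent as the JSON body of POST /v1/chat/completions, for example by passing its fields to client.chat.completions.create.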

Example request

Python

import os

from openai import OpenAI

# Read the key from the environment instead of an undefined name.
client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

resp = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Summarize the request body."}],
    temperature=0.7,
    max_tokens=256,
    reasoning_effort="low",
)
print(resp.choices[0].message.content)

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "google/gemini-2.5-pro",
  messages: [{ role: "user", content: "Summarize the request body." }],
  temperature: 0.7,
  max_tokens: 256,
  reasoning_effort: "low",
});
console.log(resp.choices[0].message.content);

curl

curl https://api.opentoken.kr/v1/chat/completions \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-pro",
    "messages": [{"role": "user", "content": "Summarize the request body."}],
    "temperature": 0.7,
    "max_tokens": 256,
    "reasoning_effort": "low"
  }'