Request parameters
The full parameter superset accepted by POST /v1/chat/completions.
POST /v1/chat/completions accepts the OpenAI Chat Completions body plus the OpenRouter sampling superset. The fields below are the most commonly used. Additional OpenRouter/OpenAI fields (top_a, min_p, repetition_penalty, logit_bias, logprobs, top_logprobs, verbosity, user) are parsed and accepted, but dropped on the current Google and Anthropic models. See /docs/guides/sampling-parameters for the full set.
Each adapter forwards only the parameters its upstream supports. For example, Gemini 2.5 accepts frequency_penalty, presence_penalty, and seed, while Gemini 3 rejects them, so the adapter drops them before forwarding. Most unsupported sampling parameters are dropped silently rather than raising an error. tools and tool_choice are the exception: they are translated natively on Google models and rejected with a 400 (unsupported_parameter) on Anthropic models. See /docs/guides/structured-outputs.
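The drop-versus-reject rules above can be sketched as a small filter. This is only an illustration of the behaviour as described on this page, not the gateway's actual code: the function name, the exception class, and the model-id prefixes (in particular "anthropic/") are hypothetical, and any parameter not named in the text is simply passed through.

```python
# Sketch only: the sets below encode just the cases named in the text above.
# Model-id prefixes, function name, and exception class are hypothetical.

ACCEPTED_AND_DROPPED = {
    "top_a", "min_p", "repetition_penalty", "logit_bias",
    "logprobs", "top_logprobs", "verbosity", "user",
}
PENALTY_AND_SEED = {"frequency_penalty", "presence_penalty", "seed"}
TOOL_PARAMS = {"tools", "tool_choice"}


class UnsupportedParameter(ValueError):
    """Stands in for the HTTP 400 (unsupported_parameter) response."""


def forwarded_params(model: str, body: dict) -> dict:
    """Return the subset of `body` that this sketch would forward upstream."""
    is_anthropic = model.startswith("anthropic/")  # assumed prefix
    is_gemini_3 = model.startswith("google/gemini-3")

    # Tool parameters are rejected, not dropped, on Anthropic models.
    if is_anthropic:
        rejected = [key for key in body if key in TOOL_PARAMS]
        if rejected:
            raise UnsupportedParameter(f"unsupported_parameter: {rejected[0]}")

    # Everything else that is unsupported is dropped silently.
    dropped = set(ACCEPTED_AND_DROPPED)
    if is_gemini_3:
        dropped |= PENALTY_AND_SEED
    return {key: value for key, value in body.items() if key not in dropped}
```

With this sketch, the same body keeps seed on a Gemini 2.5 model and loses it on a Gemini 3 model, while a body containing tools raises on an Anthropic model instead of being trimmed.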
Parameters
Prop
Type
Notes
- reasoning and reasoning_effort map to the upstream thinking controls (Gemini 3 thinkingLevel, Gemini 2.5 clamped thinkingBudget). The deprecated include_reasoning flag is still accepted. See /docs/guides/reasoning.
- service_tier: "flex" is Vertex-only and GLOBAL-endpoint-only. It is silently dropped on regional Vertex and on the plain API-key path. See /docs/guides/service-tiers.
- Prompt caching is driven by a cache_control breakpoint on the system message rather than a top-level parameter. See /docs/guides/prompt-caching.
- Embeddings use a separate endpoint, POST /v1/embeddings, not this body. The chat parameters here apply only to POST /v1/chat/completions. See also GET /v1/models.
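As a rough illustration of the notes above, the following builds a request body that combines reasoning_effort, service_tier, and a cache breakpoint on the system message. The exact cache_control shape is an assumption here (it follows the content-part convention used by OpenRouter-style APIs); the prompt-caching guide is the authority on the real format. The helper name build_chat_body is hypothetical.

```python
import json


def build_chat_body(system_prompt: str, user_prompt: str) -> dict:
    """Assemble a chat completions body as a plain dict (no network call)."""
    return {
        "model": "google/gemini-3-flash",
        "messages": [
            {
                "role": "system",
                "content": [
                    {
                        "type": "text",
                        "text": system_prompt,
                        # Assumed breakpoint shape; caching is set on the
                        # message, not as a top-level parameter.
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
            },
            {"role": "user", "content": user_prompt},
        ],
        # Mapped to the upstream thinking control by the adapter.
        "reasoning_effort": "low",
        # Per the note above, dropped silently unless the request is served
        # from the Vertex GLOBAL endpoint.
        "service_tier": "flex",
    }


if __name__ == "__main__":
    body = build_chat_body("You are a concise assistant.", "Summarize the request body.")
    print(json.dumps(body, indent=2))
```

The body is kept as a plain dict so it can be sent with any HTTP client, or passed through an SDK that allows extra fields.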
Example request
# Python (openai SDK)
import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

resp = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Summarize the request body."}],
    temperature=0.7,
    max_tokens=256,
    reasoning_effort="low",
)
print(resp.choices[0].message.content)

// Node.js (openai SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "google/gemini-2.5-pro",
  messages: [{ role: "user", content: "Summarize the request body." }],
  temperature: 0.7,
  max_tokens: 256,
  reasoning_effort: "low",
});
console.log(resp.choices[0].message.content);

# curl
curl https://api.opentoken.kr/v1/chat/completions \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-pro",
    "messages": [{"role": "user", "content": "Summarize the request body."}],
    "temperature": 0.7,
    "max_tokens": 256,
    "reasoning_effort": "low"
  }'