Request parameters

POST /v1/chat/completions accepts the OpenAI Chat Completions body plus the OpenRouter sampling superset. The fields below are the most commonly used. Additional OpenRouter/OpenAI fields (top_a, min_p, repetition_penalty, logit_bias, logprobs, top_logprobs, verbosity, user) are parsed without error but dropped on the current Google and Anthropic models, so they have no effect there. See /docs/guides/sampling-parameters for the full set.

Each adapter forwards only the parameters its upstream supports and, with the exception of tools and tool_choice, silently drops the rest. For example, Gemini 2.5 accepts frequency_penalty, presence_penalty, and seed, whereas Gemini 3 rejects them, so the adapter drops them on Gemini 3 models. Most unsupported sampling parameters are dropped in this way rather than causing an error. tools and tool_choice behave differently: they are translated natively on Google models and rejected with a 400 (unsupported_parameter) on Anthropic models. See /docs/guides/structured-outputs.
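The forwarding rule above can be sketched as a small filter. This is only an illustration of the described behavior, not the gateway's implementation: the SUPPORTED and REJECTED tables, the family names, the forward_params function, and the UnsupportedParameter exception are all hypothetical, and the parameter sets are placeholders rather than the real per-model lists.

```python
# Illustrative support table only; the real per-model sets live in each adapter.
SUPPORTED = {
    "gemini-2.5": {"temperature", "max_tokens", "frequency_penalty",
                   "presence_penalty", "seed", "tools", "tool_choice"},
    "gemini-3": {"temperature", "max_tokens", "tools", "tool_choice"},
    "anthropic": {"temperature", "max_tokens"},
}

# Parameters that produce an error instead of being dropped, per family.
REJECTED = {"anthropic": {"tools", "tool_choice"}}


class UnsupportedParameter(Exception):
    """Stands in for the 400 (unsupported_parameter) response."""


def forward_params(family, params):
    """Split request params into (forwarded, dropped) for one model family."""
    blocked = sorted(set(params) & REJECTED.get(family, set()))
    if blocked:
        # tools/tool_choice on Anthropic models: reject, do not drop.
        raise UnsupportedParameter(", ".join(blocked))
    allowed = SUPPORTED[family]
    kept = {k: v for k, v in params.items() if k in allowed}
    dropped = sorted(k for k in params if k not in allowed)
    return kept, dropped


kept, dropped = forward_params("gemini-3", {"temperature": 0.7, "seed": 42})
print(kept, dropped)
```

Because dropped parameters produce no error, a client that depends on seed or the penalty parameters may want to check the target model's support before relying on them.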

Parameters

Prop

Type

Notes

reasoning and reasoning_effort map to the upstream thinking controls: thinkingLevel on Gemini 3 and a clamped thinkingBudget on Gemini 2.5. The deprecated include_reasoning flag is still accepted. See /docs/guides/reasoning.
service_tier: "flex" takes effect only on Vertex, and only through the GLOBAL endpoint. It is silently dropped on regional Vertex and on the plain API-key path. See /docs/guides/service-tiers.
Prompt caching is driven by a cache_control breakpoint on the system message rather than a top-level parameter. See /docs/guides/prompt-caching.
Embeddings use a separate endpoint, POST /v1/embeddings, not this body. The chat parameters here apply only to POST /v1/chat/completions. See also GET /v1/models.
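As a rough illustration of the prompt-caching note above, the sketch below builds a request body whose system message carries a cache_control breakpoint. The content-part shape and the {"type": "ephemeral"} value follow the Anthropic/OpenRouter convention and are an assumption here; the with_cache_breakpoint helper is hypothetical. /docs/guides/prompt-caching is the authority on the exact format.

```python
import json


def with_cache_breakpoint(system_text, user_text):
    """Build messages with a cache breakpoint on the system message.

    Assumed shape: the system content is a list of parts, and the part
    to cache up to carries a cache_control marker.
    """
    return [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": system_text,
                    "cache_control": {"type": "ephemeral"},  # assumed value
                }
            ],
        },
        {"role": "user", "content": user_text},
    ]


body = {
    "model": "google/gemini-3-flash",
    "messages": with_cache_breakpoint(
        "You are a careful API documentation assistant.",
        "Summarize the request body.",
    ),
    "max_tokens": 256,
}
print(json.dumps(body, indent=2))
```

Note that the body has no top-level caching field; the breakpoint lives entirely inside the messages array.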

Example request

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    # Read the key from the environment, as the other examples do.
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

resp = client.chat.completions.create(
    model="google/gemini-3-flash",
    messages=[{"role": "user", "content": "Summarize the request body."}],
    temperature=0.7,
    max_tokens=256,
    reasoning_effort="low",
)
print(resp.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "google/gemini-2.5-pro",
  messages: [{ role: "user", content: "Summarize the request body." }],
  temperature: 0.7,
  max_tokens: 256,
  reasoning_effort: "low",
});
console.log(resp.choices[0].message.content);

curl https://api.opentoken.kr/v1/chat/completions \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-pro",
    "messages": [{"role": "user", "content": "Summarize the request body."}],
    "temperature": 0.7,
    "max_tokens": 256,
    "reasoning_effort": "low"
  }'
