Structured outputs

Constrain model output to JSON with response_format, translated into each provider's native structured-output mechanism.

Use the response_format field on POST /v1/chat/completions to make a model return machine-parseable JSON instead of free text. OpenToken translates it into each provider's native structured-output mechanism (Gemini responseMimeType/responseSchema; Anthropic output_format), so the assistant message content is valid JSON you can parse directly.

Modes

response_format accepts two shapes:

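Type           Shape
json_object    { "type": "json_object" }
json_schema    { "type": "json_schema", "json_schema": { "name": "<name>", "schema": <JSON Schema> } }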

With json_object, describe the desired shape in your prompt and the model returns syntactically valid JSON. With json_schema, the model is constrained to the schema you supply in response_format.json_schema.schema, which is the more reliable option.
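The two shapes can be built with small helpers. This is a minimal sketch: the helper names json_object_format and json_schema_format are hypothetical and not part of any SDK, and the dictionary layout follows the request shown in the Example section.

```python
import json


def json_object_format() -> dict:
    """JSON mode: valid JSON, with the shape described only in the prompt."""
    return {"type": "json_object"}


def json_schema_format(name: str, schema: dict) -> dict:
    """Schema mode: output is constrained to the supplied JSON Schema."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema},
    }


# Hypothetical schema, mirroring the person example on this page.
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "birth_year": {"type": "integer"},
    },
    "required": ["name", "birth_year"],
}

# Request body for POST /v1/chat/completions using schema mode.
body = {
    "model": "google/gemini-2.5-pro",
    "messages": [{"role": "user", "content": "Extract: Ada Lovelace, born 1815."}],
    "response_format": json_schema_format("person", person_schema),
}

print(json.dumps(body["response_format"], indent=2))
```

Swapping json_schema_format(...) for json_object_format() switches the same request to JSON mode, in which case the prompt itself has to describe the fields you want.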

json_object is enforced on Gemini models (JSON mode) but is best-effort and prompt-only on Claude models, which only constrain output for json_schema. Prefer json_schema for guaranteed JSON across all providers.
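To make the provider difference concrete, the sketch below shows one way a gateway could map response_format onto native parameters. It is an illustration only, not OpenToken's actual implementation: the function to_native_params is hypothetical, and the exact nesting of the native fields (in particular the layout of Anthropic's output_format) is an assumption.

```python
def to_native_params(provider: str, response_format: dict) -> dict:
    """Hypothetical mapping from response_format to provider-native fields."""
    mode = response_format["type"]
    if mode not in ("json_object", "json_schema"):
        raise ValueError(f"unsupported response_format type: {mode}")

    if provider == "gemini":
        # Both modes turn on JSON output; schema mode also passes the schema.
        native = {"responseMimeType": "application/json"}
        if mode == "json_schema":
            native["responseSchema"] = response_format["json_schema"]["schema"]
        return native

    if provider == "anthropic":
        if mode == "json_schema":
            # Assumed layout of the native output_format field.
            return {
                "output_format": {
                    "type": "json_schema",
                    "schema": response_format["json_schema"]["schema"],
                }
            }
        # json_object: nothing is enforced natively; the prompt carries the shape.
        return {}

    raise ValueError(f"unknown provider: {provider}")


schema = {"type": "object", "properties": {"name": {"type": "string"}}}
schema_mode = {"type": "json_schema", "json_schema": {"name": "person", "schema": schema}}

print(to_native_params("gemini", {"type": "json_object"}))
print(to_native_params("anthropic", {"type": "json_object"}))
print(to_native_params("anthropic", schema_mode))
```

The empty result for Claude with json_object is the reason json_schema is the safer choice when the same request may be routed to either provider.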

Example

The following request asks google/gemini-2.5-pro to extract fields against a schema.

curl https://api.opentoken.kr/v1/chat/completions \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "messages": [
      { "role": "user", "content": "Extract: Ada Lovelace, born 1815 in London." }
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "birth_year": { "type": "integer" },
            "birth_city": { "type": "string" }
          },
          "required": ["name", "birth_year", "birth_city"]
        }
      }
    }
  }'

# The same request using the OpenAI Python SDK pointed at the OpenToken base URL.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

resp = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[
        {"role": "user", "content": "Extract: Ada Lovelace, born 1815 in London."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "birth_year": {"type": "integer"},
                    "birth_city": {"type": "string"},
                },
                "required": ["name", "birth_year", "birth_city"],
            },
        },
    },
)

print(json.loads(resp.choices[0].message.content))

The assistant message content is a JSON string that matches the schema:

{
  "name": "Ada Lovelace",
  "birth_year": 1815,
  "birth_city": "London"
}

The content is always a JSON string, so parse it before use (json.loads in Python, JSON.parse in TypeScript).
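A defensive way to consume the result is to parse the string and check the fields before handing it to downstream code. The sketch below assumes the person schema from the example; parse_person is a hypothetical helper, and the checks are a lightweight stand-in for full JSON Schema validation.

```python
import json

# Expected fields and Python types for the person schema used on this page.
EXPECTED_FIELDS = {"name": str, "birth_year": int, "birth_city": str}


def parse_person(content: str) -> dict:
    """Parse the assistant message content and verify the expected fields."""
    data = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected_type):
            raise TypeError(f"field {key} should be {expected_type.__name__}")
    return data


content = '{"name": "Ada Lovelace", "birth_year": 1815, "birth_city": "London"}'
person = parse_person(content)
print(person["birth_year"])  # prints 1815
```

With json_schema the checks should rarely fail, but they remain useful with json_object on Claude models, where the shape is only requested through the prompt.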

Notes

All registered Gemini models (gemini-2.5-pro, gemini-3-flash, gemini-3.1-pro, gemini-3.1-flash-lite) and all registered Claude models (claude-opus-4-7/4-6, claude-sonnet-4-6/4-5, claude-haiku-4-5) support json_schema structured outputs. See /docs/models.
Embeddings are a separate endpoint: POST /v1/embeddings (model google/text-embedding-004). Structured outputs apply only to chat completions. See /docs/models.
Prefer json_schema over json_object when the downstream code depends on exact field names and types.

Next steps

Create chat completion

Reasoning

Models
