Create embeddings
OpenAI-compatible embeddings endpoint — POST /v1/embeddings.
POST /v1/embeddings
Creates an embedding vector for each input text. The response is a single buffered, OpenAI-compatible embeddings object; this endpoint does not stream.
The catalog currently exposes one embedding model, google/text-embedding-004 (768 dimensions by default). Send a single string or a batch of strings and receive one vector per input, in order.
Request
curl https://api.opentoken.kr/v1/embeddings \
  -H "Authorization: Bearer $OPENTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/text-embedding-004",
    "input": "The quick brown fox jumps over the lazy dog."
  }'

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.opentoken.kr/v1",
    api_key=os.environ["OPENTOKEN_API_KEY"],
)

response = client.embeddings.create(
    model="google/text-embedding-004",
    input="The quick brown fox jumps over the lazy dog.",
)
print(response.data[0].embedding[:8])

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.opentoken.kr/v1",
  apiKey: process.env.OPENTOKEN_API_KEY,
});

const response = await client.embeddings.create({
  model: "google/text-embedding-004",
  input: "The quick brown fox jumps over the lazy dog.",
});
console.log(response.data[0].embedding.slice(0, 8));

Request body
Prop
Type
task_type and dimensions are OpenToken extensions to the OpenAI embeddings
surface. OpenAI has no task_type; Google embedding models use it to tune the
vector for retrieval, classification, or similarity. For a corpus-plus-query
retrieval setup, embed stored documents with RETRIEVAL_DOCUMENT and search
queries with RETRIEVAL_QUERY.
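The corpus-plus-query split can be sketched as two request bodies that differ only in task_type. This is a minimal sketch that builds the JSON and sends nothing; the helper name build_embedding_request is hypothetical, and it assumes task_type is sent as a top-level field next to model and input.

```python
from typing import Any

MODEL = "google/text-embedding-004"


def build_embedding_request(texts: list[str], task_type: str) -> dict[str, Any]:
    """Build a /v1/embeddings request body (hypothetical helper).

    Assumption: task_type travels as a top-level JSON field alongside
    model and input.
    """
    if not texts:
        raise ValueError("texts must contain at least one string")
    return {"model": MODEL, "input": texts, "task_type": task_type}


# Stored documents and search queries get different task types.
doc_request = build_embedding_request(
    ["Foxes are small omnivorous mammals.", "Dogs were domesticated from wolves."],
    task_type="RETRIEVAL_DOCUMENT",
)
query_request = build_embedding_request(
    ["what do foxes eat?"],
    task_type="RETRIEVAL_QUERY",
)
```

With the OpenAI Python SDK, fields outside the standard signature are normally passed through the extra_body argument of client.embeddings.create, for example extra_body={"task_type": "RETRIEVAL_QUERY"}. Whether OpenToken expects the field in exactly this position is worth confirming against the request body reference.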
Response
The response is a list object whose data array holds one embedding entry per input, each with its zero-based index. The usage object reports input-only token accounting — embeddings produce no completion tokens, so total_tokens equals prompt_tokens.
{
  "object": "list",
  "model": "google/text-embedding-004",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023064255, -0.009327292, 0.015797347, "..."]
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "total_tokens": 11
  }
}

When you pass an array of strings, data contains one entry per string with matching index values (0, 1, …) in request order.
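As an illustration of batch handling, the sketch below pairs each input string with its vector using the index field. The helper name vectors_in_request_order and the sample response (with shortened, made-up vectors) are hypothetical; the response is a plain dict shaped like the JSON above, not an SDK object.

```python
from typing import Any


def vectors_in_request_order(
    response: dict[str, Any], inputs: list[str]
) -> list[tuple[str, list[float]]]:
    """Pair each input string with its embedding using the index field."""
    # Sorting by index is defensive; entries should already be in request order.
    entries = sorted(response["data"], key=lambda entry: entry["index"])
    if len(entries) != len(inputs):
        raise ValueError(f"expected {len(inputs)} embeddings, got {len(entries)}")
    return [(inputs[entry["index"]], entry["embedding"]) for entry in entries]


# Made-up response; entries are deliberately shuffled to exercise the sort.
sample_response = {
    "object": "list",
    "model": "google/text-embedding-004",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "usage": {"prompt_tokens": 4, "total_tokens": 4},
}

pairs = vectors_in_request_order(sample_response, ["first text", "second text"])
for text, vector in pairs:
    print(text, vector)
```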
Embeddings are billed input-only, metered at the model's input rate
(google/text-embedding-004 is $0.025 / 1M tokens). The OpenAI SDK surfaces a
completion_tokens of 0 for embedding usage; total_tokens always equals
prompt_tokens. See List models for live pricing.
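A rough cost estimate follows directly from the usage object. The sketch below hard-codes the $0.025 / 1M rate quoted above, which may drift from live pricing; the helper name embedding_cost_usd is hypothetical.

```python
# Rate quoted in this page for google/text-embedding-004; live pricing may differ.
INPUT_RATE_USD_PER_MILLION = 0.025


def embedding_cost_usd(
    usage: dict[str, int],
    rate_per_million: float = INPUT_RATE_USD_PER_MILLION,
) -> float:
    """Estimate the cost of one request from its usage object (hypothetical helper)."""
    # Embeddings are input-only, so prompt_tokens is the whole bill.
    return usage["prompt_tokens"] * rate_per_million / 1_000_000


usage = {"prompt_tokens": 11, "total_tokens": 11}
cost = embedding_cost_usd(usage)
print(f"estimated cost: ${cost:.9f}")
```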
Errors
The endpoint uses the standard OpenAI-compatible error envelope. Beyond the shared auth and balance errors, two cases are specific to model selection:
- A registered model whose provider has no embeddings support returns 400 with code embeddings_unsupported and message model "<id>" does not support embeddings.
- An unregistered model id returns 400 with code model_not_found.
{
  "error": {
    "message": "model \"anthropic/claude-sonnet-4-6\" does not support embeddings",
    "type": "invalid_request_error",
    "code": "embeddings_unsupported"
  }
}

A missing or invalid key returns 401 with type authentication_error, and an exhausted balance returns 402 with code insufficient_credit. See Errors for the full list.
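One way a client might branch on these cases is sketched below. It works on the HTTP status and the parsed JSON body only, so it is independent of any SDK; the helper name describe_embedding_error and the hint strings are hypothetical, and only the four cases named in this section are covered.

```python
from typing import Any


def describe_embedding_error(status: int, body: dict[str, Any]) -> str:
    """Map an error response to a short action hint (hypothetical helper)."""
    error = body.get("error", {})
    code = error.get("code")
    error_type = error.get("type")
    if status == 401 and error_type == "authentication_error":
        return "check the API key"
    if status == 402 and code == "insufficient_credit":
        return "top up the account balance"
    if status == 400 and code == "embeddings_unsupported":
        return "choose a model that supports embeddings"
    if status == 400 and code == "model_not_found":
        return "check the model id"
    # Anything else: fall back to the server's own message.
    return error.get("message", "unexpected error")


sample_body = {
    "error": {
        "message": 'model "anthropic/claude-sonnet-4-6" does not support embeddings',
        "type": "invalid_request_error",
        "code": "embeddings_unsupported",
    }
}
hint = describe_embedding_error(400, sample_body)
print(hint)
```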