OpenToken — LLM cost infrastructure

Model

System

User message

Temperature: 0.0 – 2.0

Streaming: on = token-by-token SSE, off = the full response at once

System prompt caching (off): caches the system prompt (Gemini CachedContent) so that repeated calls can reuse it at lower cost. Applies only to large system prompts (roughly 4k tokens or more).
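
The parameters above bound the temperature to the range 0.0 to 2.0. A client could check that range before sending a request. The following is only a minimal sketch: the function name `valid_temperature` is hypothetical, and whether the server rejects or clamps out-of-range values is not stated here.

```bash
# Hypothetical helper: succeeds (exit 0) only for a plain decimal in [0.0, 2.0].
valid_temperature() {
  awk -v t="$1" 'BEGIN {
    # Accept digits with an optional fractional part; reject signs and words.
    ok = (t ~ /^[0-9]+(\.[0-9]+)?$/) && (t + 0 >= 0.0) && (t + 0 <= 2.0)
    exit !ok
  }'
}
```

It could be used as `valid_temperature "$TEMP" || echo "temperature must be between 0.0 and 2.0" >&2`, where `TEMP` is an illustrative variable name.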

Run a prompt to see the response in real time.

bash

curl https://api.opentoken.kr/v1/chat/completions \
  -H "Authorization: Bearer $OPENTOKEN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Say the single word: pong"}
    ],
    "temperature": 0.7
  }'
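
With `"stream": true`, the response arrives as SSE rather than as a single JSON body. The sketch below shows one way to consume such a stream in the shell. It assumes OpenAI-style framing (one `data: <json>` line per chunk, ended by a `data: [DONE]` sentinel), which the endpoint path suggests but which is not confirmed here. The function name `sse_payloads` is hypothetical.

```bash
# Read an SSE stream on stdin and print each event's JSON payload, one per line.
# ASSUMPTION: chunks arrive as "data: <json>" and the stream ends with "data: [DONE]".
sse_payloads() {
  cr=$(printf '\r')
  while IFS= read -r line || [ -n "$line" ]; do
    line=${line%"$cr"}                 # tolerate CRLF line endings
    case "$line" in
      "data: [DONE]") break ;;         # assumed end-of-stream sentinel
      "data: "*) printf '%s\n' "${line#data: }" ;;
      *) ;;                            # skip blank separators, comments, other fields
    esac
  done
}
```

To try it, pipe a streaming request such as the one above into the function with curl's `-N` (`--no-buffer`) flag, for example `curl -sN ... | sse_payloads`. Each printed line is then one raw JSON chunk that a JSON tool can process further.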