API Documentation

Drop-in OpenAI-, Anthropic- and Gemini-compatible endpoints for every model on your account. Bring your favorite SDK.

List models

GET /v1/models

Returns the models you can call from this account, formatted as OpenAI's model list with a pricing extension showing what you'll be billed per request.

curl https://YOUR-HOST/v1/models \
  -H "Authorization: Bearer afk-YOUR_KEY"
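
For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The helper names `build_models_request`, `model_ids` and `send` are illustrative and not part of the service. The response is assumed to follow OpenAI's `{"object": "list", "data": [...]}` layout; the pricing extension is left unparsed because its field names are not given here.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET /v1/models request."""
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Collect the model IDs from an OpenAI-style model list (assumed layout)."""
    return [model["id"] for model in payload.get("data", [])]


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (makes a real network call)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `model_ids(send(build_models_request("https://YOUR-HOST", "afk-YOUR_KEY")))` would then return the IDs you can pass as `model` to the endpoints below.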

Chat completions

POST /v1/chat/completions

Drop-in OpenAI chat-completions surface. Standard request/response shape — your existing OpenAI SDK works unmodified once you point it at our base URL.

curl https://YOUR-HOST/v1/chat/completions \
  -H "Authorization: Bearer afk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "hi"}]
  }'
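
A minimal Python sketch of the same call, using only the standard library so nothing beyond the curl example is assumed about SDK behavior. `build_chat_request` and `first_reply` are hypothetical helper names. The reply is read from `choices[0].message.content`, which is the standard OpenAI chat-completions response layout.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_reply(response: dict) -> str:
    """Return the assistant text of the first choice."""
    return response["choices"][0]["message"]["content"]
```

To send the request, pass it to `urllib.request.urlopen` and decode the body with `json.load`.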

Responses

POST /v1/responses

OpenAI's newer Responses API (input/output_text shape). Supports the same models as chat completions.

curl https://YOUR-HOST/v1/responses \
  -H "Authorization: Bearer afk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-4o-mini", "input": "hi" }'
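
The input/output_text shape can be illustrated with a short Python sketch. It assumes the raw JSON mirrors OpenAI's Responses format, where `output` is a list of items and message items carry `output_text` parts. `build_responses_request` and `output_text` are illustrative helper names.

```python
import json
import urllib.request


def build_responses_request(base_url: str, api_key: str, model: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/responses."""
    body = {"model": model, "input": text}
    return urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def output_text(response: dict) -> str:
    """Concatenate every output_text part found in the message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)
```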

Anthropic messages

POST /v1/messages

Anthropic-native messages endpoint. Forward your existing Claude/Anthropic SDK requests unchanged.

curl https://YOUR-HOST/v1/messages \
  -H "x-api-key: afk-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "hi"}]
  }'
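
A rough Python equivalent, again with the standard library only. The point to note is the header set: `x-api-key` plus `anthropic-version` instead of a Bearer token. `build_messages_request` and `text_of` are hypothetical names, and the response is assumed to use Anthropic's `content` list of typed blocks.

```python
import json
import urllib.request


def build_messages_request(base_url: str, api_key: str, model: str,
                           prompt: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/messages."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def text_of(response: dict) -> str:
    """Join the text blocks of a Messages response, ignoring other block types."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
```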

Gemini generate-content

POST /v1beta/models/<model>:generateContent

Google Gemini's generative-language endpoint. Authenticate with the `x-goog-api-key` header.

curl "https://YOUR-HOST/v1beta/models/gemini-2.0-flash-exp:generateContent" \
  -H "x-goog-api-key: afk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "contents": [{ "role": "user", "parts": [{ "text": "hi" }] }] }'
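
In Python the model name goes into the URL path rather than the body, which a small sketch makes explicit. `build_generate_request` and `candidate_text` are illustrative names. The response is assumed to follow Gemini's `candidates[0].content.parts` layout.

```python
import json
import urllib.request


def build_generate_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1beta/models/<model>:generateContent."""
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{base_url}/v1beta/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-goog-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def candidate_text(response: dict) -> str:
    """Join the text parts of the first candidate, or return "" if there is none."""
    candidates = response.get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)
```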

Billing

Each request is billed against your credit balance. The price for every model is shown in the Models tab.

Streaming is supported on every endpoint: set `"stream": true` in the request body (or call Gemini's `:streamGenerateContent` instead of `:generateContent`) and the response comes back as server-sent events (SSE).
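
Reading an SSE stream mostly means collecting `data:` lines until a blank line closes the event. The sketch below is a simplified parser over an iterable of text lines and does not cover every corner of the SSE format. `iter_sse_data` and `iter_json_events` are hypothetical names. The `[DONE]` sentinel is the OpenAI chat-completions convention; Anthropic and Gemini streams are assumed to end without it, in which case the loop simply stops when the lines run out.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event, joining multi-line data fields."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):  # comment or keep-alive line
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:  # stream closed without a trailing blank line
        yield "\n".join(buffer)


def iter_json_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode each event as JSON, stopping at an OpenAI-style [DONE] sentinel."""
    for data in iter_sse_data(lines):
        if data == "[DONE]":
            break
        yield json.loads(data)
```

With `urllib.request.urlopen`, the response object can be iterated line by line as bytes, so decoding each line with `.decode("utf-8")` before passing it in is one way to feed this parser.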

  • A per-model, per-call cap is available; set it from the Models tab.
  • Your current balance is on the dashboard overview.
  • Calls are billed based on token usage.
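
Since billing follows token usage, a client can estimate what a call cost from the `usage` block that OpenAI-style chat completions return. The sketch below is only an approximation under stated assumptions: separate input and output prices quoted per million tokens (the real price structure is whatever the Models tab shows), and the `prompt_tokens` / `completion_tokens` field names from the OpenAI format. `estimate_cost` is a hypothetical helper, not something the service provides.

```python
from decimal import Decimal


def estimate_cost(usage: dict,
                  input_price_per_mtok: Decimal,
                  output_price_per_mtok: Decimal) -> Decimal:
    """Estimate the cost of one call from its token counts.

    Prices are assumed to be per one million tokens; missing counts are
    treated as zero.
    """
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    total = (prompt_tokens * input_price_per_mtok
             + completion_tokens * output_price_per_mtok)
    return total / Decimal(1_000_000)
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error when summed over many calls. The authoritative figure remains the balance on the dashboard overview.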