Chat Completions

Routor is a drop-in replacement for the OpenAI Chat Completions API. Any SDK or tool that works with OpenAI works with Routor: just change the base URL and API key.

An Anthropic-compatible endpoint is also available at POST /v1/messages for Claude Code and other Anthropic-format clients. It converts requests to the internal OpenAI shape, reuses the full routing/fallback/billing pipeline, and translates the response back to the Anthropic format.

Endpoint:

POST https://api.routor.ai/v1/chat/completions

Request Format

Identical to the OpenAI Chat Completions API, with optional Routor-specific fields:

{
  "model": "auto",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user",   "content": "Explain binary search trees." }
  ],
  "max_tokens": 1024,

  // ── Routor-specific (all optional) ──────────────
  "routor_profile":       "auto",
  "routor_tier":          "STANDARD",
  "routor_tier_floor":    "LIGHT",
  "routor_tier_ceiling":  "COMPLEX",
  "routor_max_cost":      0.05,
  "routor_bfcl_min":      0.8,
  "routor_code_quality":  2,
  "routor_chat_quality":  1
}
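As a rough sketch of how such a body might be sent without an SDK, the helper below only assembles the URL, headers, and JSON payload. The name `buildChatRequest` is hypothetical, and the `Bearer` authorization scheme is an assumption based on OpenAI compatibility rather than something stated on this page.

```typescript
// Hypothetical helper: assembles a Chat Completions request for Routor.
// Assumption: the API key is sent as a Bearer token, as with OpenAI.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

function buildChatRequest(
  apiKey: string,
  messages: ChatMessage[],
  routorOptions: { [field: string]: string | number } = {}
): PreparedRequest {
  // Start from the standard OpenAI fields, then copy any routor_* options over.
  const payload: { [field: string]: unknown } = { model: "auto", messages: messages };
  for (const key in routorOptions) {
    payload[key] = routorOptions[key];
  }
  return {
    url: "https://api.routor.ai/v1/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": "Bearer " + apiKey,
    },
    body: JSON.stringify(payload),
  };
}
```

The returned object can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.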

The `model` Field

Value	Behavior
`"auto"`	Routor picks the model (full automatic routing)
Any valid model ID	Routes directly to that model, but only when paired with `"routor_profile": "direct"`. Without it, the model field is ignored and routing runs as usual

Always use "auto" unless you have a specific reason to lock to a model. To force a specific model, set "routor_profile": "direct" and pass the model ID in model.

Routor-Specific Parameters

These fields are stripped before the request is forwarded to the provider, so the provider never sees them.

`routor_profile`

"routor_profile": "auto" | "tier" | "direct"

"auto" (default): Routor classifies the prompt and picks the tier.
"tier": Use with routor_tier to skip classification and force a specific tier.
"direct": Bypass routing entirely and proxy straight to the model named in model. The model field must be a valid model ID from GET /v1/models.

`routor_tier`

"routor_tier": "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX"

Only used when routor_profile is "tier". Forces routing to this exact tier, bypassing classification. For example, to always use STANDARD regardless of the prompt:

{
  "model": "auto",
  "routor_profile": "tier",
  "routor_tier": "STANDARD",
  "messages": [...]
}
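The rules for combining model, routor_profile, and routor_tier can be checked on the client before sending. The sketch below is a hypothetical pre-flight check, not Routor's own validation. It assumes that "auto" is not a concrete model ID, and it only reports combinations that the text above describes as ignored or incomplete.

```typescript
// Hypothetical client-side check; Routor's own validation may differ.
type RoutorProfile = "auto" | "tier" | "direct";

interface ProfileFields {
  model: string;
  routor_profile?: RoutorProfile;
  routor_tier?: string;
}

function checkProfile(body: ProfileFields): string[] {
  const problems: string[] = [];
  // "auto" is the documented default profile.
  const profile: RoutorProfile =
    body.routor_profile === undefined ? "auto" : body.routor_profile;

  if (profile === "tier" && body.routor_tier === undefined) {
    problems.push('routor_profile "tier" expects routor_tier to be set');
  }
  if (profile !== "tier" && body.routor_tier !== undefined) {
    problems.push('routor_tier is only used when routor_profile is "tier"');
  }
  if (profile === "direct" && body.model === "auto") {
    // Assumption: "auto" is not a concrete model ID from GET /v1/models.
    problems.push('routor_profile "direct" expects a concrete model ID in model');
  }
  if (profile !== "direct" && body.model !== "auto") {
    problems.push('model is ignored unless routor_profile is "direct"');
  }
  return problems;
}
```

An empty array means the combination is consistent with the rules described above. Each string names one field that would be ignored or is missing.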

`routor_tier_floor`

"routor_tier_floor": "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX"

Sets the minimum tier. Any request classified below this floor is bumped up to the floor. For example, to never go below STANDARD:

{
  "model": "auto",
  "routor_tier_floor": "STANDARD",
  "messages": [...]
}

`routor_tier_ceiling`

"routor_tier_ceiling": "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX"

Sets the maximum tier. Any request classified above this ceiling is capped at the ceiling. For example, to cap at LIGHT to limit cost:

{
  "model": "auto",
  "routor_tier_ceiling": "LIGHT",
  "messages": [...]
}
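The floor and ceiling behave like a clamp on the classified tier. The sketch below illustrates that idea and is not Routor's implementation. It assumes the tiers are ordered as listed (NANO lowest, COMPLEX highest) and that the ceiling wins if a floor is set above it, which this page does not specify.

```typescript
// Illustrative only. Assumptions: tiers are ordered as listed in the docs,
// and the ceiling is applied last, so it wins if floor > ceiling.
type Tier = "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX";
const TIER_ORDER: Tier[] = ["NANO", "SIMPLE", "LIGHT", "STANDARD", "COMPLEX"];

function clampTier(classified: Tier, floor?: Tier, ceiling?: Tier): Tier {
  let rank = TIER_ORDER.indexOf(classified);
  if (floor !== undefined) {
    rank = Math.max(rank, TIER_ORDER.indexOf(floor)); // bump up to the floor
  }
  if (ceiling !== undefined) {
    rank = Math.min(rank, TIER_ORDER.indexOf(ceiling)); // cap at the ceiling
  }
  return TIER_ORDER[rank];
}
```

For instance, `clampTier("NANO", "STANDARD")` yields `"STANDARD"`, and `clampTier("COMPLEX", undefined, "LIGHT")` yields `"LIGHT"`.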

`routor_max_cost`

"routor_max_cost": <number>

Sets a maximum estimated cost per request in USD. Models whose estimated cost exceeds this limit are filtered out of the candidate chain. If no model qualifies, routing falls back to the full chain.
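The filter-then-fall-back rule can be pictured as follows. This is a minimal sketch of the described behavior only. The `Candidate` shape and its `estimatedCostUsd` field are hypothetical, and how Routor estimates cost is not covered here.

```typescript
// Illustrative sketch of the filter-with-fallback rule described above.
// The Candidate shape and estimatedCostUsd field are hypothetical.
interface Candidate {
  model: string;
  estimatedCostUsd: number;
}

function applyMaxCost(chain: Candidate[], maxCost?: number): Candidate[] {
  if (maxCost === undefined) {
    return chain; // no limit requested
  }
  const affordable = chain.filter((c) => c.estimatedCostUsd <= maxCost);
  // If nothing qualifies, fall back to the full chain rather than failing.
  return affordable.length > 0 ? affordable : chain;
}
```

A consequence of the fallback is that routor_max_cost acts as a preference rather than a hard cap: a request can still be served by a model above the limit when no cheaper candidate exists.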

`routor_bfcl_min`

"routor_bfcl_min": <number>

Minimum BFCL (function-calling) score a model must meet to stay in the candidate chain. Only meaningful when your request includes tools. If no model qualifies, routing falls back to the full chain.
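Since routor_bfcl_min only matters for tool-calling requests, a body that uses it would normally carry a tools array as well. The sketch below builds one such body. The `get_weather` tool and the function name `toolRequestBody` are made up for illustration, and the tools array follows the OpenAI function-calling format.

```typescript
// Hypothetical example: the get_weather tool is made up for illustration.
// The tools array follows the OpenAI function-calling format.
function toolRequestBody(minBfclScore: number) {
  return {
    model: "auto",
    messages: [{ role: "user", content: "What is the weather in Oslo?" }],
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Look up the current weather for a city.",
          parameters: {
            type: "object",
            properties: { city: { type: "string" } },
            required: ["city"],
          },
        },
      },
    ],
    // Ask Routor to drop candidates whose BFCL score is below this threshold.
    routor_bfcl_min: minBfclScore,
  };
}
```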

`routor_code_quality` / `routor_chat_quality`

"routor_code_quality": 0 | 1 | 2
"routor_chat_quality": 0 | 1 | 2

Quality sliders that map to a tier floor. 0 = Fast (no floor), 1 = Balanced, 2 = Best.

routor_code_quality applies when the prompt looks like a coding task (floors: STANDARD for 1, COMPLEX for 2)
routor_chat_quality applies otherwise (floors: LIGHT for 1, STANDARD for 2)

The slider floor is merged with any explicit routor_tier_floor, and the higher of the two wins.
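The slider-to-floor mapping and the merge rule can be written out as a small function. This is a sketch of the rules as stated, not Routor's code. Whether a prompt "looks like a coding task" is decided by Routor, so it is passed in here as a plain boolean, and the tier order is assumed to follow the listing order.

```typescript
// Sketch of the documented slider rules; not Routor's implementation.
type Tier = "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX";
type Slider = 0 | 1 | 2;
const TIER_ORDER: Tier[] = ["NANO", "SIMPLE", "LIGHT", "STANDARD", "COMPLEX"];

// Floors per slider value as listed above; undefined means "no floor".
const CODE_FLOORS: (Tier | undefined)[] = [undefined, "STANDARD", "COMPLEX"];
const CHAT_FLOORS: (Tier | undefined)[] = [undefined, "LIGHT", "STANDARD"];

function effectiveFloor(
  isCodingTask: boolean,
  codeQuality: Slider,
  chatQuality: Slider,
  explicitFloor?: Tier
): Tier | undefined {
  const sliderFloor = isCodingTask ? CODE_FLOORS[codeQuality] : CHAT_FLOORS[chatQuality];
  if (sliderFloor === undefined) return explicitFloor;
  if (explicitFloor === undefined) return sliderFloor;
  // The higher of the two floors wins.
  return TIER_ORDER.indexOf(sliderFloor) >= TIER_ORDER.indexOf(explicitFloor)
    ? sliderFloor
    : explicitFloor;
}
```

With the sample request values (code quality 2, chat quality 1, explicit floor LIGHT), a coding prompt would get a COMPLEX floor and any other prompt a LIGHT floor.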

Response Format

Identical to the OpenAI Chat Completions response, plus a routor object with routing metadata:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1748700000,
  "model": "moonshot/kimi-k2.6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A binary search tree is a data structure where..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 142,
    "total_tokens": 170
  },
  "routor": {
    "model":       "moonshot/kimi-k2.6",
    "tier":        "LIGHT",
    "profile":     "auto",
    "confidence":  0.82,
    "savingsPct":  87.4,
    "method":      "rules"
  }
}

Note: the model field in the response shows the actual model used, not "auto". The routor object’s model may differ from the top-level model when a fallback was used.
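A client can compare the two model fields to notice a possible fallback. The sketch below types only the fields shown in the sample response. The interface and function names are hypothetical, and the field list may not be exhaustive.

```typescript
// Types cover only the fields shown in the sample response above.
interface RoutorMeta {
  model: string;
  tier: string;
  profile: string;
  confidence: number;
  savingsPct: number;
  method: string;
}

interface RoutedResponse {
  model: string; // the model that actually served the request
  routor: RoutorMeta;
}

interface RoutingSummary {
  servedBy: string;
  routorModel: string;
  tier: string;
  possibleFallback: boolean;
}

function summarizeRouting(response: RoutedResponse): RoutingSummary {
  return {
    servedBy: response.model,
    routorModel: response.routor.model,
    tier: response.routor.tier,
    // Per the note above, differing model fields indicate a fallback may have occurred.
    possibleFallback: response.model !== response.routor.model,
  };
}
```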

Response Headers

Every response includes X-Request-Id. When the server has DEBUG_ROUTING=1 set, routing metadata is also exposed in headers:

X-Request-Id:           req_01abc123
X-Routor-Model:         moonshot/kimi-k2.6
X-Routor-Tier:          LIGHT
X-Routor-Confidence:    0.82
X-Routor-Profile:       auto
X-Routor-Savings:       87.4%
X-Routor-Fallback:      false

Without that flag, use the routor object in the response body (shown above) instead. See Response Metadata for the full reference.
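When the debug headers are present, they can be read into a typed object. The sketch below accepts anything with a `get(name)` method, such as the `Headers` object returned by `fetch`. The function and interface names are hypothetical, and every field is nullable because the X-Routor-* headers are only sent when DEBUG_ROUTING=1 is set.

```typescript
// Minimal structural type so this works with fetch's Headers or any object with get().
interface HeaderSource {
  get(name: string): string | null;
}

interface RoutingHeaders {
  requestId: string | null;
  model: string | null;
  tier: string | null;
  confidence: number | null;
  savingsPct: number | null;
  fallback: boolean | null;
}

function readRoutingHeaders(h: HeaderSource): RoutingHeaders {
  const confidence = h.get("X-Routor-Confidence");
  const savings = h.get("X-Routor-Savings");
  const fallback = h.get("X-Routor-Fallback");
  return {
    requestId: h.get("X-Request-Id"),
    model: h.get("X-Routor-Model"),
    tier: h.get("X-Routor-Tier"),
    confidence: confidence === null ? null : parseFloat(confidence),
    // parseFloat stops at the trailing "%", so "87.4%" becomes 87.4.
    savingsPct: savings === null ? null : parseFloat(savings),
    fallback: fallback === null ? null : fallback === "true",
  };
}
```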

Streaming

Streaming works exactly like OpenAI streaming:

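import OpenAI from "openai";
// Point the official OpenAI SDK at Routor. The environment variable name is an assumption.
const client = new OpenAI({
  baseURL: "https://api.routor.ai/v1",
  apiKey:  process.env.ROUTOR_API_KEY,
});

const stream = await client.chat.completions.create({
  model:    "auto",
  messages: [{ role: "user", content: "Write a short story about a robot." }],
  stream:   true,
});

// Print each content delta as it arrives.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}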

Error Handling

Routor returns standard HTTP errors:

Status	Meaning
`400`	Bad request: missing or invalid fields
`401`	Invalid or missing API key
`402`	Insufficient credits
`429`	Rate limit exceeded
`503`	All providers in the fallback chain failed

{
  "error": {
    "message": "Insufficient credits. Please top up your account.",
    "type":    "billing_error"
  }
}
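One way a client might act on these statuses is sketched below. Which errors are worth retrying is a client-side policy assumption, not something this page specifies, and the function names are hypothetical.

```typescript
// Assumption: the retry policy below is a client-side choice,
// not behavior specified by the Routor documentation.
function isRetryable(httpStatus: number): boolean {
  switch (httpStatus) {
    case 429: // rate limit exceeded: retry after a delay
    case 503: // all providers in the fallback chain failed: may be transient
      return true;
    default: // 400, 401, 402 need a fixed request, key, or credits first
      return false;
  }
}

interface RoutorErrorBody {
  error: { message: string; type: string };
}

function describeError(httpStatus: number, body: RoutorErrorBody): string {
  const hint = isRetryable(httpStatus)
    ? "retry later"
    : "fix the request or account before retrying";
  return httpStatus + " " + body.error.type + ": " + body.error.message + " (" + hint + ")";
}
```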
