# Chat Completions

> Full reference for POST /v1/chat/completions.

Routor is a drop-in replacement for the OpenAI Chat Completions API. Any SDK or tool that works with OpenAI works with Routor. Just change the base URL and API key.

An Anthropic-compatible endpoint is also available at `POST /v1/messages` for Claude Code and other Anthropic-format clients. It converts requests to the internal OpenAI shape, reuses the full routing, fallback, and billing pipeline, and translates the response back to the Anthropic format.

```mermaid theme={null}
sequenceDiagram
    participant App
    participant Routor
    participant Provider

    App->>Routor: POST /v1/chat/completions<br/>{ model: "auto", messages: [...] }
    Note over Routor: Classify · filter · select
    Routor->>Provider: Cleaned request<br/>(routor_* fields stripped)
    Provider-->>Routor: Standard response
    Routor-->>App: Response + routor object<br/>model field = actual model used
```

**Endpoint:**

```
POST https://api.routor.ai/v1/chat/completions
```
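As a rough illustration of the "change the base URL and API key" idea, the sketch below prepares a request for the endpoint above without sending it. The `Authorization: Bearer` header is an assumption based on OpenAI compatibility, and `buildChatRequest` is a hypothetical helper, not part of any Routor SDK.

```typescript
// Sketch only: the URL comes from this page; the Bearer-token header is an
// assumption based on OpenAI compatibility.
const ROUTOR_CHAT_URL = "https://api.routor.ai/v1/chat/completions";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the pieces of a chat completion request.
function buildChatRequest(apiKey: string, messages: ChatMessage[]): PreparedRequest {
  return {
    url: ROUTOR_CHAT_URL,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // "auto" lets Routor pick the model.
    body: JSON.stringify({ model: "auto", messages }),
  };
}
```

The returned object can be handed to `fetch(req.url, req)` in any runtime that provides `fetch`.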

***

## Request Format

Identical to the OpenAI Chat Completions API, with optional Routor-specific fields:

```json theme={null}
{
  "model": "auto",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user",   "content": "Explain binary search trees." }
  ],
  "max_tokens": 1024,

  // ── Routor-specific (all optional) ──────────────
  "routor_profile":       "auto",
  "routor_tier":          "STANDARD",
  "routor_tier_floor":    "LIGHT",
  "routor_tier_ceiling":  "COMPLEX",
  "routor_max_cost":      0.05,
  "routor_bfcl_min":      0.8,
  "routor_code_quality":  2,
  "routor_chat_quality":  1
}
```

***

## The `model` Field

| Value              | Behavior                                                                                                                                                 |
| ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `"auto"`           | Routor picks the model (fully automatic routing)                                                                                                         |
| Any valid model ID | Routes directly to that model, but **only when paired with `"routor_profile": "direct"`**. Without it, the `model` field is ignored and routing runs as usual |

**Always use `"auto"` unless you have a specific reason to lock to a model.** To force a specific model, set `"routor_profile": "direct"` and pass the model ID in `model`.

***

## Routor-Specific Parameters

These fields are stripped before the request is forwarded, so the provider never sees them.

### `routor_profile`

```
"routor_profile": "auto" | "tier" | "direct"
```

* `"auto"` *(default)*: Routor classifies the prompt and picks the tier
* `"tier"`: use with `routor_tier` to skip classification and force a specific tier
* `"direct"`: bypass routing entirely and proxy straight to the model named in `model`. The `model` field must be a valid model ID from `GET /v1/models`

***

### `routor_tier`

```
"routor_tier": "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX"
```

Only used when `routor_profile: "tier"`. Forces routing to this exact tier, bypassing classification.

**Example (always use STANDARD regardless of prompt):**

```json theme={null}
{
  "model": "auto",
  "routor_profile": "tier",
  "routor_tier": "STANDARD",
  "messages": [...]
}
```
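The interplay between `model`, `routor_profile`, and `routor_tier` is easy to get wrong, so a client-side sanity check can help. The sketch below only encodes the rules stated on this page; `checkProfile` is a hypothetical helper, and whether the server rejects or silently ignores such combinations is not specified here.

```typescript
type RoutorProfile = "auto" | "tier" | "direct";
type RoutorTier = "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX";

interface RoutingFields {
  model: string;
  routor_profile?: RoutorProfile;
  routor_tier?: RoutorTier;
}

// Hypothetical helper: returns warnings; an empty array means nothing looks off.
function checkProfile(req: RoutingFields): string[] {
  const warnings: string[] = [];
  const profile = req.routor_profile ?? "auto"; // "auto" is the documented default

  if (profile === "tier" && req.routor_tier === undefined) {
    warnings.push('routor_profile "tier" is meant to be used with routor_tier');
  }
  if (profile !== "tier" && req.routor_tier !== undefined) {
    warnings.push('routor_tier is only used when routor_profile is "tier"');
  }
  if (profile === "direct" && req.model === "auto") {
    warnings.push('routor_profile "direct" needs a concrete model ID in model');
  }
  if (profile !== "direct" && req.model !== "auto") {
    warnings.push('model is ignored unless routor_profile is "direct"');
  }
  return warnings;
}
```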

***

### `routor_tier_floor`

```
"routor_tier_floor": "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX"
```

Sets the minimum tier. Any request classified below this floor is bumped up to the floor.

**Example (never go below STANDARD):**

```json theme={null}
{
  "model": "auto",
  "routor_tier_floor": "STANDARD",
  "messages": [...]
}
```

***

### `routor_tier_ceiling`

```
"routor_tier_ceiling": "NANO" | "SIMPLE" | "LIGHT" | "STANDARD" | "COMPLEX"
```

Sets the maximum tier. Any request classified above this ceiling is capped at the ceiling.

**Example (cap at LIGHT to limit cost):**

```json theme={null}
{
  "model": "auto",
  "routor_tier_ceiling": "LIGHT",
  "messages": [...]
}
```
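The floor and ceiling rules amount to clamping the classified tier into a range. A minimal sketch follows, assuming tiers are ordered from lowest to highest exactly as they are listed in the signatures above. This page does not say what happens when the floor is higher than the ceiling; the sketch applies the floor first and the ceiling second, which is only one possible reading.

```typescript
// Assumption: this listing order is also the tier ranking, lowest first.
const TIER_ORDER = ["NANO", "SIMPLE", "LIGHT", "STANDARD", "COMPLEX"] as const;
type Tier = (typeof TIER_ORDER)[number];

// Hypothetical helper mirroring the documented floor/ceiling behaviour.
function clampTier(classified: Tier, floor?: Tier, ceiling?: Tier): Tier {
  let rank = TIER_ORDER.indexOf(classified);
  if (floor !== undefined) {
    // Below the floor: bump up to the floor.
    rank = Math.max(rank, TIER_ORDER.indexOf(floor));
  }
  if (ceiling !== undefined) {
    // Above the ceiling: cap at the ceiling.
    rank = Math.min(rank, TIER_ORDER.indexOf(ceiling));
  }
  return TIER_ORDER[rank];
}
```

For instance, `clampTier("NANO", "STANDARD")` yields `"STANDARD"`, and `clampTier("COMPLEX", undefined, "LIGHT")` yields `"LIGHT"`.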

***

### `routor_max_cost`

```
"routor_max_cost": <number>
```

Sets a maximum estimated cost per request in USD. Models whose estimated cost exceeds this value are filtered out of the candidate chain. If no model qualifies, routing falls back to the full chain.

***

### `routor_bfcl_min`

```
"routor_bfcl_min": <number>
```

Minimum BFCL (function-calling) score a model must meet to stay in the candidate chain. Only meaningful when your request includes tools. If no model qualifies, routing falls back to the full chain.
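Both `routor_max_cost` and `routor_bfcl_min` are described as filters that revert to the full chain when nothing survives. The sketch below shows that "filter with fallback" pattern in isolation. The `Candidate` shape and its field names are hypothetical, and how Routor combines the two filters internally is not documented here.

```typescript
// Hypothetical candidate shape; the real internal representation is not documented.
interface Candidate {
  model: string;
  estimatedCostUsd: number;
  bfclScore: number;
}

// Keep only candidates that pass; if none pass, return the full chain unchanged.
function filterWithFallback<T>(chain: T[], keep: (candidate: T) => boolean): T[] {
  const kept = chain.filter(keep);
  return kept.length > 0 ? kept : chain;
}

// routor_max_cost: drop models whose estimated cost exceeds the limit.
function withinCost(chain: Candidate[], maxCost: number): Candidate[] {
  return filterWithFallback(chain, (c) => c.estimatedCostUsd <= maxCost);
}

// routor_bfcl_min: drop models whose BFCL score is below the minimum.
function meetsBfcl(chain: Candidate[], bfclMin: number): Candidate[] {
  return filterWithFallback(chain, (c) => c.bfclScore >= bfclMin);
}
```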

***

### `routor_code_quality` / `routor_chat_quality`

```
"routor_code_quality": 0 | 1 | 2
"routor_chat_quality": 0 | 1 | 2
```

Quality sliders that map to a tier floor. `0` = Fast (no floor), `1` = Balanced, `2` = Best.

* `routor_code_quality` applies when the prompt looks like a coding task (floors: STANDARD for `1`, COMPLEX for `2`)
* `routor_chat_quality` applies otherwise (floors: LIGHT for `1`, STANDARD for `2`)

The slider floor is merged with any explicit `routor_tier_floor`; the higher of the two wins.
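The slider-to-floor mapping and the merge rule can be sketched as a small lookup. This assumes the tier ranking follows the listing order used in the signatures on this page, and that an omitted slider contributes no floor; neither is stated explicitly. `effectiveFloor` and the `isCodingTask` flag are hypothetical stand-ins for Routor's own classification.

```typescript
// Assumption: this listing order is also the tier ranking, lowest first.
const TIER_ORDER = ["NANO", "SIMPLE", "LIGHT", "STANDARD", "COMPLEX"] as const;
type Tier = (typeof TIER_ORDER)[number];
type Slider = 0 | 1 | 2;

// Floors as documented: 0 = Fast (no floor), 1 = Balanced, 2 = Best.
const CODE_FLOORS: Record<Slider, Tier | null> = { 0: null, 1: "STANDARD", 2: "COMPLEX" };
const CHAT_FLOORS: Record<Slider, Tier | null> = { 0: null, 1: "LIGHT", 2: "STANDARD" };

interface FloorOptions {
  codeQuality?: Slider;
  chatQuality?: Slider;
  tierFloor?: Tier;
}

// Returns the floor that would apply, or null when there is none.
function effectiveFloor(isCodingTask: boolean, opts: FloorOptions): Tier | null {
  const slider = isCodingTask ? opts.codeQuality : opts.chatQuality;
  const table = isCodingTask ? CODE_FLOORS : CHAT_FLOORS;
  const sliderFloor = slider === undefined ? null : table[slider];
  const explicit = opts.tierFloor ?? null;

  if (sliderFloor === null) return explicit;
  if (explicit === null) return sliderFloor;
  // Both present: the higher of the two wins.
  return TIER_ORDER.indexOf(sliderFloor) >= TIER_ORDER.indexOf(explicit)
    ? sliderFloor
    : explicit;
}
```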

***

## Response Format

Identical to the OpenAI Chat Completions response, plus a `routor` object with routing metadata:

```json theme={null}
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1748700000,
  "model": "moonshot/kimi-k2.6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A binary search tree is a data structure where..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 142,
    "total_tokens": 170
  },
  "routor": {
    "model":       "moonshot/kimi-k2.6",
    "tier":        "LIGHT",
    "profile":     "auto",
    "confidence":  0.82,
    "savingsPct":  87.4,
    "method":      "rules"
  }
}
```

Note: the `model` field in the response shows the **actual model used**, not `"auto"`. The `routor` object's `model` may differ from the top-level `model` when a fallback was used.
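For typed clients, the `routor` object can be described with an interface derived from the sample response above. This is a sketch: the field types are inferred from that one example, the object is treated as optional for safety, and `summarizeRouting` is a hypothetical helper. Treating a model mismatch as a sign of fallback is an interpretation of the note above, not a documented guarantee.

```typescript
// Field names come from the sample response; types are inferred from it.
interface RoutorMeta {
  model: string;
  tier: string;
  profile: string;
  confidence: number;
  savingsPct: number;
  method: string;
}

interface RoutedResponse {
  model: string;
  routor?: RoutorMeta;
}

interface RoutingSummary {
  servedBy: string;          // top-level model: the model actually used
  selected: string | null;   // routor.model, if metadata is present
  tier: string | null;
  modelsDiffer: boolean;     // may indicate that a fallback was used
}

function summarizeRouting(res: RoutedResponse): RoutingSummary {
  const meta = res.routor;
  return {
    servedBy: res.model,
    selected: meta ? meta.model : null,
    tier: meta ? meta.tier : null,
    modelsDiffer: meta !== undefined && meta.model !== res.model,
  };
}
```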

***

## Response Headers

Every response includes `X-Request-Id`. When the server has `DEBUG_ROUTING=1` set, routing metadata is also exposed in headers:

```
X-Request-Id:           req_01abc123
X-Routor-Model:         moonshot/kimi-k2.6
X-Routor-Tier:          LIGHT
X-Routor-Confidence:    0.82
X-Routor-Profile:       auto
X-Routor-Savings:       87.4%
X-Routor-Fallback:      false
```

Without that flag, use the `routor` object in the response body (shown above) instead. See [Response Metadata](../response-headers.md) for the full reference.
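When the debug headers are available, they can be read into a typed object. The sketch below relies only on the header names shown above; `readRoutingHeaders` is a hypothetical helper, and every routing field is nullable because the headers are absent unless the server flag is set. The `HeaderReader` interface matches the `get` method of the standard fetch `Headers` class, so `response.headers` can be passed in directly.

```typescript
// Minimal view of a header container; fetch's Headers satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

interface RoutingHeaders {
  requestId: string | null;
  model: string | null;
  tier: string | null;
  profile: string | null;
  confidence: number | null;
  savingsPct: number | null;
  fallback: boolean | null;
}

// parseFloat stops at the first non-numeric character, so "87.4%" parses as 87.4.
function toNumber(value: string | null): number | null {
  if (value === null) return null;
  const parsed = parseFloat(value);
  return Number.isNaN(parsed) ? null : parsed;
}

function readRoutingHeaders(headers: HeaderReader): RoutingHeaders {
  const fallback = headers.get("X-Routor-Fallback");
  return {
    requestId: headers.get("X-Request-Id"),
    model: headers.get("X-Routor-Model"),
    tier: headers.get("X-Routor-Tier"),
    profile: headers.get("X-Routor-Profile"),
    confidence: toNumber(headers.get("X-Routor-Confidence")),
    savingsPct: toNumber(headers.get("X-Routor-Savings")),
    fallback: fallback === null ? null : fallback.trim().toLowerCase() === "true",
  };
}
```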

***

## Streaming

Streaming works exactly like OpenAI streaming:

```typescript theme={null}
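import OpenAI from "openai";

// Point the OpenAI SDK at Routor. ROUTOR_API_KEY is an assumed variable name.
const client = new OpenAI({
  baseURL: "https://api.routor.ai/v1",
  apiKey:  process.env.ROUTOR_API_KEY,
});

const stream = await client.chat.completions.create({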
  model:    "auto",
  messages: [{ role: "user", content: "Write a short story about a robot." }],
  stream:   true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```

***

## Error Handling

Routor returns standard HTTP errors:

| Status | Meaning                                    |
| ------ | ------------------------------------------ |
| `400`  | Bad request: missing or invalid fields     |
| `401`  | Invalid or missing API key                 |
| `402`  | Insufficient credits                       |
| `429`  | Rate limit exceeded                        |
| `503`  | All providers in the fallback chain failed |

```json theme={null}
{
  "error": {
    "message": "Insufficient credits. Please top up your account.",
    "type":    "billing_error"
  }
}
```
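A client might map these statuses and the error body onto a single result type. The sketch below is one possible approach: `classifyError` is a hypothetical helper, and treating `429` and `503` as retryable is an assumption about sensible client behaviour, not something this page specifies.

```typescript
interface RoutorErrorBody {
  error: { message: string; type: string };
}

// Type guard for the error shape shown above.
function isErrorBody(value: unknown): value is RoutorErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const err = (value as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const { message, type } = err as { message?: unknown; type?: unknown };
  return typeof message === "string" && typeof type === "string";
}

// Assumption: rate limits and provider outages are worth retrying; the rest are not.
const RETRYABLE_STATUSES = new Set<number>([429, 503]);

interface ClassifiedError {
  status: number;
  message: string;
  type: string | null;
  retryable: boolean;
}

function classifyError(status: number, body: unknown): ClassifiedError {
  const known = isErrorBody(body);
  return {
    status,
    message: known ? body.error.message : `HTTP ${status}`,
    type: known ? body.error.type : null,
    retryable: RETRYABLE_STATUSES.has(status),
  };
}
```

Under these assumptions, a `402` with the body above is reported as non-retryable with type `billing_error`, while a `503` with an unparseable body is reported as retryable with a generic message.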
