> ## Documentation Index
> Fetch the complete documentation index at: https://docs.routor.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Introduction

> What Routor is, the problem it solves, and what you get.

# Introduction

Routor sits between your app and every AI provider. You send a request, it picks the right model. You never think about which model to use again.

<img src="https://mintcdn.com/routor/cIATklos4kQp2x0I/images/intro-hero.svg?fit=max&auto=format&n=cIATklos4kQp2x0I&q=85&s=b98a7ff09a56db9cb3aa9a5a9a7f940b" alt="Routor routing overview - one API, every model, automatic routing" width="1000" height="380" data-path="images/intro-hero.svg" />

***

## The Problem

Most teams pick one model and send everything there. Casual questions, quick lookups, complex reasoning tasks. Everything hits the same expensive endpoint.

That is where the money goes.

The gap between the cheapest and most expensive capable models in 2026 is enormous:

| Model                     | Input    | Output   | Best for                     |
| ------------------------- | -------- | -------- | ---------------------------- |
| **DeepSeek V4 Flash**     | \$0.14/M | \$0.28/M | Fast, cheap tasks            |
| **GLM-4.7 Flash** (Z.AI)  | \$0.20/M | \$0.20/M | Simple to medium tasks       |
| **Gemini 3.1 Flash-Lite** | \$0.25/M | \$1.50/M | Speed-critical requests      |
| **Claude Haiku 4.5**      | \$1/M    | \$5/M    | Balanced Anthropic tasks     |
| **Claude Sonnet 4.6**     | \$3/M    | \$15/M   | Production-grade quality     |
| **Claude Opus 4.8**       | \$5/M    | \$25/M   | Most capable model available |
| **GPT-5.5**               | \$5/M    | \$30/M   | Frontier OpenAI tasks        |

Sending everything to Claude Opus 4.8 or GPT-5.5 when 60% of your requests could go to a \$0.20 model is the most common and most expensive mistake in AI development today.

And when that one provider goes down, your entire product goes down with it.

***

## The Landscape in 2026

A few things have changed that make routing more important than ever:

**Anthropic now leads in enterprise adoption.** As of April 2026, Anthropic surpassed OpenAI in business AI adoption - 34.4% vs 32.3% - driven by Claude's strength in coding, long context, and reliability. Claude Opus 4.8, released May 2026, is currently ranked the most capable AI model available. Claude Code ranked first among developer tools in a survey of 15,000 developers.

**Open-weight caught up.** GLM-5.2, Kimi K2.6, and DeepSeek V4 match or exceed earlier flagship models on most production tasks - at a fraction of the price. There is no reason to pay premium rates for the majority of your requests.

**The model count keeps growing.** More providers, more models, more pricing tiers. Picking the right one manually for every request type is not a realistic engineering task. Routing is infrastructure now, not an optimization.

**Single-provider bets are getting riskier.** Anthropic has had multi-model error-rate incidents this year, and every other major lab has had a comparable one. On top of that, providers have started throttling or pacing rollouts as demand outpaces available compute. Betting a whole product on one vendor now risks both reliability and access, not just price. It is part of why a meaningful share of developers say they deliberately spread work across more than one AI vendor instead of standardizing on one.

**Aggregators are not the same as a routing decision.** Marketplace-style products that expose hundreds of models through a single API keep raising funding, which confirms teams want one integration point. It does not mean an undifferentiated catalog solves overspending - it just moves the "which model do I use" decision onto the developer instead of answering it.

***

## What Routor Does

Routor analyzes every request in a few milliseconds and sends it to the best-value model that can handle it well - balancing quality and cost, not just picking the cheapest. If that model fails, it silently tries the next one in a ranked fallback chain.

```mermaid theme={null}
graph LR
    App["Your App\n(no changes)"] -->|"model: auto"| R["Routor\nIntelligence Layer"]
    R -->|"greeting / ack"| Nano["GLM-4.7 Flash / Gemini 3.1 Flash-Lite\n~$0.00001"]
    R -->|"simple Q&A"| Simple["Qwen3.6-27B / GLM-5.2\n~$0.0002"]
    R -->|"explanation / light code"| Light["Kimi K2.6 / DeepSeek V4 Flash\n~$0.001"]
    R -->|"detailed code / analysis"| Standard["Claude Sonnet 4.6 / GPT-5.4\n~$0.015"]
    R -->|"expert / architecture / proof"| Complex["Claude Opus 4.8 / DeepSeek V4 Pro\n~$0.025"]

    style R fill:#0F2027,stroke:#F59E0B,color:#F59E0B
    style Nano fill:#0F1F14,stroke:#34D399,color:#34D399
    style Simple fill:#0F1F14,stroke:#34D399,color:#34D399
    style Light fill:#1F1A0F,stroke:#F59E0B,color:#F59E0B
    style Standard fill:#1A1020,stroke:#F97316,color:#F97316
    style Complex fill:#20100F,stroke:#F87171,color:#F87171
```

| Request                      | Without Routor      | With Routor                       |
| ---------------------------- | ------------------- | --------------------------------- |
| "summarize this email"       | Opus 4.8 at \$0.025 | Gemini 3.1 Flash-Lite at \$0.0001 |
| "explain closures in JS"     | Opus 4.8 at \$0.030 | GLM-4.7 Flash at \$0.002          |
| "refactor this auth system"  | Opus 4.8 at \$0.080 | Claude Sonnet 4.6 at \$0.018      |
| "prove sqrt 2 is irrational" | Opus 4.8 at \$0.090 | DeepSeek V4 Pro at \$0.004        |

Savings are measured against a baseline of routing everything to Claude Opus 4.8 - the common "default to the strongest model" pattern. You change one line of code. Everything else is automatic.

***

## What You Get

* **25 to 70% lower costs** on real workloads
* **No downtime** with automatic failover across 13 providers
* **OpenAI-compatible API** - nothing in your codebase breaks. An Anthropic-compatible endpoint is also available for Claude Code and other Anthropic-format clients
* **Dashboard** with live savings, usage, and provider health
* **Playground** to test routing decisions before they hit production
* **13 providers and 32 models** all behind one endpoint

***

## Where to Start

<CardGroup cols={2}>
  <Card title="Migrate in 5 minutes" icon="bolt" href="quickstart">
    Change three lines of code - API key, base URL, model name. Your app routes intelligently immediately.
  </Card>

  <Card title="Understand routing" icon="route" href="how-it-works">
    See how the 15-dimensional scorer classifies requests into 17 task categories and 5 difficulty tiers, then picks the best-value model.
  </Card>

  <Card title="Tune behavior" icon="sliders" href="playground/create-profile">
    Create a routing profile with a custom tier range, cost cap, and capability requirements.
  </Card>

  <Card title="Browse the API" icon="code" href="api/chat-completions">
    Full reference for `/v1/chat/completions`, streaming, routing parameters, and the response format.
  </Card>

  <Card title="See real savings" icon="chart-bar" href="benchmarks">
    Cost, latency, and quality benchmarks across real workload types.
  </Card>

  <Card title="Use case guides" icon="book-open" href="use-cases/chatbot">
    Step-by-step configs for chatbots, code assistants, and customer support automation.
  </Card>
</CardGroup>