> ## Documentation Index
> Fetch the complete documentation index at: https://docs.routor.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Architecture and Vision

> What Routor is today and where it is going.

# Architecture & Vision

Routor is an intelligent routing layer that sits between your app and every AI provider. This page covers what it does today and where it is going.

<img src="https://mintcdn.com/routor/cIATklos4kQp2x0I/images/architecture-hero.svg?fit=max&auto=format&n=cIATklos4kQp2x0I&q=85&s=93b18a9a73b8c822cef9261c52a5795d" alt="Routor system architecture" width="1000" height="340" data-path="images/architecture-hero.svg" />

***

## What Routor Is Today

Routor is a proxy that makes one decision per request: which model should handle this. It does this in a few milliseconds, without any extra API call, and without your app knowing or caring.

```mermaid theme={null}
graph LR
    App["Your App"] -->|"model: auto"| R["Routor"]
    R --> OAI["OpenAI"]
    R --> ANT["Anthropic"]
    R --> GOO["Google"]
    R --> DS["DeepSeek"]
    R --> MORE["+ 9 more"]

    style R fill:#0A1408,stroke:#16A34A,color:#4ADE80
```

Every request goes in. The right model gets it. Your app gets back a standard OpenAI-compatible response.

***

## What It Handles Automatically

**Routing** - Routor classifies each prompt by task category and difficulty, then picks the best-value model that can handle it - balancing quality and cost. Simple requests go to fast, cheap models. Complex tasks go to stronger ones.

**Failover** - If the selected provider fails, Routor silently retries the next model in the chain. Your app never sees the error.

**Capability matching** - If your request includes images, files, or tool calls, only models that support those capabilities are considered.

**Provider health** - Routor validates all connected providers on startup and monitors them continuously. Broken or rate-limited providers are removed from routing automatically.

***

## The 13 Providers

All routing happens behind a single endpoint: `api.routor.ai/v1`. You never call providers directly.

| Provider  | What it brings                                                             |
| --------- | -------------------------------------------------------------------------- |
| OpenAI    | GPT-5.4, GPT-5.5 - strong general and coding                               |
| Anthropic | Claude Opus 4.8, Sonnet 4.6, Haiku 4.5 - best for long context and writing |
| Google    | Gemini 3.5 Flash, 3.1 Pro - fast and cheap at scale                        |
| DeepSeek  | V4 Flash, V4 Pro - best reasoning-per-dollar                               |
| xAI       | Grok 4 - strong general model                                              |
| NVIDIA    | Nemotron 3 Ultra - fast open-weight inference                              |
| Moonshot  | Kimi K2.6 - long context specialist                                        |
| MiniMax   | MiniMax M3 - bilingual, strong value                                       |
| Z.AI      | GLM-5.2 - open-weight, frontier-competitive                                |
| Alibaba   | Qwen3.7-Max - strong multilingual                                          |
| Mistral   | Mistral Large 3, Small 4 - strong European-hosted option                   |
| Microsoft | Phi-4, Mai-Code-1-Flash - compliance-friendly routing                      |
| Xiaomi    | MiMo-V2.5-Pro - efficient bilingual reasoning                              |

***

## Vision - Where Routor Is Going

Routor starts as a smart router. The goal is to become the default AI infrastructure layer for any team shipping AI-powered products.

The pace of model releases is accelerating. In 2026 alone, Anthropic released Opus 4.5, 4.6, and 4.8 within months of each other. OpenAI moved from GPT-4o to GPT-5.5, then paced the GPT-5.6 rollout amid outside pressure. DeepSeek's V4 and MiniMax's M3 both landed at a fraction of frontier pricing while matching frontier benchmarks. Every release changes the optimal routing decision for every tier, and on top of that, providers are now rationing capacity and have had reliability incidents of their own.

No developer can track that manually, and no single provider is a safe bet to build on alone. Routor should be the layer that absorbs both problems.

```mermaid theme={null}
graph TB
    subgraph Now["Now · Smart Router"]
        A1["Fast rule-based routing - no AI overhead"]
        A2["13 providers · Auto-failover · Health monitoring"]
        A3["Dashboard · Billing · Playground · Profiles"]
    end

    subgraph Next["Next · Data Layer"]
        B1["Full conversation viewer with routing history"]
        B2["Verified savings and per-model cost analytics"]
        B3["Live model benchmark comparison page"]
    end

    subgraph Later["Later · Scale and Teams"]
        C1["Team accounts · SSO · Per-member API keys"]
        C2["Audit logs · Usage alerts · Budget caps"]
        C3["Private deployment for enterprise"]
    end

    subgraph Future["Future · Self-Improving Layer"]
        D1["Routing accuracy improves from real traffic patterns"]
        D2["New models auto-benchmarked and added on release"]
        D3["Per-workload routing profiles that optimize themselves"]
        D4["Cost forecasting based on your usage history"]
    end

    Now --> Next --> Later --> Future

    style Now fill:#1a2a1a,stroke:#34D399,color:#34D399
    style Next fill:#1a1a2a,stroke:#60A5FA,color:#60A5FA
    style Later fill:#2a2a1a,stroke:#F59E0B,color:#F59E0B
    style Future fill:#2a1a1a,stroke:#F87171,color:#F87171
```

### The self-improving routing layer

The current routing engine is fast and deterministic. The future version learns.

When a new model drops - the next DeepSeek, the next Llama, the next Gemini Flash - Routor should automatically benchmark it against your actual traffic patterns, slot it into the right tier, and start routing to it if it outperforms what was there before. Zero configuration. Zero work on your end.

When routing decisions turn out to be wrong - a LIGHT tier request that would have been better handled by STANDARD - those signals should feed back into making the next decision more accurate.

The gap between the fastest cheap model and the best frontier model is currently 200x in price and closing in quality. The routing layer that navigates that gap intelligently, without developer effort, is the most valuable piece of AI infrastructure that does not fully exist yet.
