Iter

Multi-Model Routing

Route different tasks to different models based on capability. Use your fastest model for code generation, your most capable for reviews, and your cheapest for validation.

Model routing configuration

How routing works

1. Define capabilities

Each model is tagged with capabilities: coder, reviewer, architect, planner, explainer, liaison.

2. Configure routing

Map capabilities to specific host + model pairs in models.yaml.

3. Automatic resolution

When the pipeline needs a model, it resolves the best match based on capability, quality score, and availability.

4. Fallback chains

If the primary model is unavailable, routing falls through to configured alternatives.
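The resolution step above can be pictured with a small Python sketch. It is not Iter's actual implementation: the `ModelEntry` type, the `quality` and `available` fields, and the `resolve` function are hypothetical names chosen for illustration, and the scores are made up. It assumes "best match" means the highest quality score among available models tagged with the requested capability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical registry record; field names are assumptions."""
    host: str
    model: str
    capabilities: frozenset
    quality: float    # illustrative score, higher is better
    available: bool   # whether the host currently serves this model


def resolve(capability: str, registry: list) -> Optional[ModelEntry]:
    """Return the highest-quality available model tagged with `capability`."""
    candidates = [
        m for m in registry
        if capability in m.capabilities and m.available
    ]
    # default=None avoids a ValueError when no model qualifies.
    return max(candidates, key=lambda m: m.quality, default=None)


# Example registry: the highest-scoring coder is offline, so routing
# falls through to the next available model with that capability.
registry = [
    ModelEntry("judy", "qwen3-coder", frozenset({"coder", "reviewer"}), 0.9, False),
    ModelEntry("george", "deepseek-r1", frozenset({"planner", "reviewer"}), 0.8, True),
    ModelEntry("elroy", "llama3.1", frozenset({"coder", "explainer"}), 0.6, True),
]

best = resolve("coder", registry)
```

With `judy` marked unavailable, `resolve("coder", registry)` returns the `llama3.1` entry on `elroy`, and a capability that no model carries yields `None`.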

Configuration

yaml
routing:
  overrides:
    coder:
      host: judy
      model: qwen3-coder
    reviewer:
      host: judy
      model: qwen3-coder
    architect:
      host: judy
      model: qwen3-coder
    planner:
      host: george
      model: deepseek-r1

  fallbacks:
    - qwen3-coder
    - llama3.1

  host_fallbacks:
    - judy
    - george
    - elroy
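One plausible reading of this configuration, sketched in Python: try the capability's override first, then each fallback model on each fallback host. The ordering (models in the outer loop, hosts in the inner loop) is an assumption, as are the `candidates` and `pick` helpers and the `is_available` callback; the source does not specify how Iter combines `fallbacks` with `host_fallbacks`. The config is mirrored as a plain dict so the sketch needs no YAML parser.

```python
# Mirrors the routing section of models.yaml shown above.
ROUTING = {
    "overrides": {
        "coder": {"host": "judy", "model": "qwen3-coder"},
        "reviewer": {"host": "judy", "model": "qwen3-coder"},
        "architect": {"host": "judy", "model": "qwen3-coder"},
        "planner": {"host": "george", "model": "deepseek-r1"},
    },
    "fallbacks": ["qwen3-coder", "llama3.1"],
    "host_fallbacks": ["judy", "george", "elroy"],
}


def candidates(capability, routing):
    """List (host, model) pairs to try, in order, without duplicates."""
    ordered = []
    override = routing["overrides"].get(capability)
    if override:
        ordered.append((override["host"], override["model"]))
    # Assumed order: each fallback model is tried on every fallback host
    # before moving on to the next model.
    for model in routing["fallbacks"]:
        for host in routing["host_fallbacks"]:
            pair = (host, model)
            if pair not in ordered:
                ordered.append(pair)
    return ordered


def pick(capability, routing, is_available):
    """Return the first candidate accepted by is_available(host, model)."""
    for host, model in candidates(capability, routing):
        if is_available(host, model):
            return host, model
    return None
```

For example, if `judy` is down, `pick("coder", ROUTING, lambda host, model: host != "judy")` skips the override and returns `("george", "qwen3-coder")`.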

Multi-host support

🖥️

Hosts Server

Centralized discovery of all Ollama hosts and their available models.

🔀

Smart Routing

Capability-based routing with quality scores and automatic failover.

📊

Dashboard View

Visual routing configuration and model availability on the Roles page.
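The discovery described under Hosts Server can be approximated with Ollama's HTTP API, which lists a host's local models at `GET /api/tags`. The sketch below is not the Hosts Server itself: the `parse_tags`, `list_models`, and `discover` helpers are hypothetical, and it assumes each host name resolves on your network and that Ollama listens on its default port, 11434.

```python
import json
from urllib.request import urlopen


def parse_tags(payload: dict) -> list:
    """Extract sorted model names from an Ollama /api/tags response body."""
    return sorted(entry["name"] for entry in payload.get("models", []))


def list_models(host: str, port: int = 11434, timeout: float = 3.0) -> list:
    """Ask one Ollama host which models it has available locally."""
    with urlopen(f"http://{host}:{port}/api/tags", timeout=timeout) as resp:
        return parse_tags(json.load(resp))


def discover(hosts):
    """Map each host to its model list, or None if it cannot be reached."""
    inventory = {}
    for host in hosts:
        try:
            inventory[host] = list_models(host)
        except (OSError, ValueError):
            # Unreachable host, timeout, or a malformed response.
            inventory[host] = None
    return inventory
```

Calling `discover(["judy", "george", "elroy"])` would return a dict keyed by host name. A `None` value marks a host to skip when resolving a route.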

Run your own model fleet