# Multi-Model Routing
Route different tasks to different models based on capability. Use your fastest model for code generation, your most capable for reviews, and your cheapest for validation.
## How routing works

1. **Define capabilities.** Each model is tagged with capabilities: coder, reviewer, architect, planner, explainer, liaison.
2. **Configure routing.** Map capabilities to specific host + model pairs in `models.yaml`.
3. **Automatic resolution.** When the pipeline needs a model, it resolves the best match based on capability, quality score, and availability.
4. **Fallback chains.** If the primary model is unavailable, routing falls through to configured alternatives.
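The resolution steps above can be sketched roughly as follows. This is a hypothetical illustration, not the tool's actual API: the registry shape, field names, and scoring are all assumptions.

```python
# Hypothetical model registry: each entry tags a (host, model) pair with
# capabilities, a quality score, and an availability flag. Names and fields
# are illustrative assumptions, not the real data model.
REGISTRY = [
    {"host": "judy", "model": "qwen3-coder",
     "capabilities": {"coder", "reviewer"}, "quality": 0.9, "available": True},
    {"host": "george", "model": "deepseek-r1",
     "capabilities": {"planner"}, "quality": 0.8, "available": True},
    {"host": "elroy", "model": "llama3.1",
     "capabilities": {"coder", "planner"}, "quality": 0.6, "available": True},
]

def resolve(capability: str) -> dict:
    """Pick the highest-quality available model advertising the capability."""
    candidates = [m for m in REGISTRY
                  if capability in m["capabilities"] and m["available"]]
    if not candidates:
        raise LookupError(f"no model available for capability {capability!r}")
    return max(candidates, key=lambda m: m["quality"])

best = resolve("coder")
print(best["host"], best["model"])  # picks the highest-scoring coder entry
```

If the top-scoring model is marked unavailable, it simply drops out of the candidate list and the next-best capable model is chosen, which is the fallback behavior described in step 4.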
## Configuration

```yaml
routing:
  overrides:
    coder:
      host: judy
      model: qwen3-coder
    reviewer:
      host: judy
      model: qwen3-coder
    architect:
      host: judy
      model: qwen3-coder
    planner:
      host: george
      model: deepseek-r1
  fallbacks:
    - qwen3-coder
    - llama3.1
  host_fallbacks:
    - judy
    - george
    - elroy
```

## Multi-host support
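Host-level failover with the `host_fallbacks` list above can be sketched as follows. The health check here is a stand-in assumption; in practice the hosts server presumably probes each Ollama host directly.

```python
# Minimal sketch of host failover: try hosts in configured order and use the
# first one that responds. The is_up predicate is a simulated health check.
HOST_FALLBACKS = ["judy", "george", "elroy"]

def pick_host(is_up) -> str:
    """Return the first reachable host in fallback order."""
    for host in HOST_FALLBACKS:
        if is_up(host):
            return host
    raise ConnectionError("no configured host is reachable")

# Example: judy is down, so routing falls through to george.
print(pick_host(lambda h: h != "judy"))  # george
```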
- 🖥️ **Hosts Server**: centralized discovery of all Ollama hosts and their available models.
- 🔀 **Smart Routing**: capability-based routing with quality scores and automatic failover.
- 📊 **Dashboard View**: visual routing configuration and model availability on the Roles page.