Automatic Model Routing
Different parts of an agent want different models. Long reasoning wants a frontier model. Quick "fix this typo" calls want a fast cheap one. Vision wants a vision model. OpenHuman handles this with a built-in router provider so you never have to think about it.
How a request gets routed
The model parameter on any chat call can take one of two shapes:
- Concrete model name. e.g.
anthropic/claude-sonnet-4. Routes to the default provider with that exact model. - Hint prefix. e.g.
hint:reasoning. Looks the hint up in the route table and resolves to a(provider, model)pair.
Common hints
| Hint | Typical target | When it's used |
|---|---|---|
hint:reasoning | A strong reasoning model | Multi-step planning, math, code-heavy turns |
hint:fast | A fast/cheap model | UI helpers, autocompletes, small classification calls |
hint:vision | A vision-capable model | Screenshots, image attachments, OCR |
hint:summarize | A model good at compression | Memory tree summary builders |
hint:code | A code-tuned model | Native coder turns |
One subscription
Routing happens behind a single OpenHuman subscription. You don't hold separate API keys for Anthropic, OpenAI, Google etc., the backend brokers access, and the router picks the right one per task.
Per-agent model pins
Sub-agents can also pin an exact model without disabling automatic routing for the rest of the app:
{
"agent_id": "researcher",
"model": "anthropic/claude-sonnet-4",
"prompt": "Collect source notes for the launch memo."
}
Why this isn't just "model switcher"
Routing isn't a UI dropdown. The agent loop itself emits hints based on what it's about to do. You don't pick the model; the task does. That's the difference between "multi-model" and "smart routing".
See also
- Smart Token Compression - What makes large reasoning calls affordable.
- Native Tools - Different tool calls hint at different routes.
- Local AI - Lightweight chat hints can run on-device.