LLM Routing
Pattern: Directing requests to different language models based on task requirements, cost, or availability.
LLM routing is the practice of sending different requests to different models based on criteria like task complexity, cost, latency, or model capabilities. Instead of using one model for everything, a router selects the best model for each request.
Common routing strategies include capability-based routing (e.g., GPT-4 for reasoning, Claude for analysis), cost-based routing (smaller models for simple tasks), fallback chains (try the preferred model, fall back to alternatives on failure), and load balancing across providers.
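A capability- or cost-based router can be as simple as a rule that inspects the request before dispatch. The sketch below uses hypothetical model names and a crude complexity heuristic (keyword markers and prompt length); real routers often use a classifier model or per-task configuration instead.

```python
def route(prompt: str) -> str:
    """Pick a model name based on a crude task-complexity heuristic.

    "large-model" and "small-model" are placeholder names, not real model IDs.
    """
    reasoning_markers = ("prove", "step by step", "analyze", "plan")
    text = prompt.lower()
    # Capability/cost trade-off: send harder or longer prompts to the big model,
    # everything else to the cheaper one.
    if any(marker in text for marker in reasoning_markers) or len(prompt) > 500:
        return "large-model"
    return "small-model"
```

For example, `route("What is 2+2?")` returns `"small-model"`, while `route("Analyze this contract step by step")` returns `"large-model"`.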
OpenRouter provides API-level routing across dozens of models. SUBCORP uses a models array for native fallback routing — if the primary model fails or returns an empty response, the system automatically tries the next model in the list. This keeps requests flowing during provider outages without manual intervention.
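The fallback-chain behavior described above can be sketched generically: walk an ordered list of models, treat exceptions and empty responses as failures, and return the first usable result. The function names and the `call` callback here are illustrative, not SUBCORP's or OpenRouter's actual API.

```python
from typing import Callable, Tuple

def call_with_fallback(models: list[str], call: Callable[[str], str]) -> Tuple[str, str]:
    """Try each model in order; return (model, response) for the first success.

    A response is considered a failure if the call raises OR returns an
    empty string, mirroring the "fails or returns empty" rule above.
    """
    last_error: Exception | None = None
    for model in models:
        try:
            response = call(model)
            if response:  # empty responses count as failures
                return model, response
        except Exception as exc:
            last_error = exc  # remember the error, keep trying the next model
    raise RuntimeError("all models in the fallback chain failed") from last_error
```

With a chain like `["primary", "backup-1", "backup-2"]`, a timeout on the primary and an empty reply from the first backup would transparently route the request to the second backup.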