Model Fallback

pattern

Automatically switching to an alternative LLM when the primary model fails, is unavailable, or returns empty.

Model fallback is a reliability pattern where your system tries multiple models in order until one succeeds. If model A times out, returns an error, or produces empty output, the system automatically retries with model B, then model C.

This is essential for production agent systems. No single model has 100% uptime, and different models have different failure modes. A well-designed fallback chain might try: local Ollama model → Claude via OpenRouter → GPT-4 via OpenRouter → smaller fallback model.

Implementation details matter. The OpenRouter SDK supports a models array for native API-level fallback (limited to 3 models). For longer chains, client-side fallback logic tries each model individually. Empty response detection is critical — some models return 200 OK with empty content.

In the Directory