Quick Start
How failover activates during retries
Failover now drives LLM retries through direct integration with the retry mechanism:
- On every LLM call, the system first gets the current profile via `get_next_profile()` and applies its `api_key`, `base_url`, and `model` settings
- On success, `mark_success(profile)` is called to track the working provider
- On failure, `mark_failure(profile, error, is_rate_limit=...)` marks the provider as failed, then `get_next_profile()` fetches the next available provider
- Profile switching overrides non-retryable classification: one extra attempt is always granted after switching providers
- The LLM automatically updates request parameters (`api_key`, `base_url`, `model`) when switching between profiles
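
The steps above can be sketched as a small retry loop. This is an illustrative sketch only, not the library's implementation: `Profile`, `SketchFailoverManager`, and `call_with_failover` are hypothetical names, and only the method names `get_next_profile`, `mark_success`, and `mark_failure` come from the description above. The rate-limit check by substring is an assumption, and the extra attempt granted for non-retryable errors is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical stand-in for an auth profile."""
    name: str
    api_key: str
    base_url: str
    model: str


class SketchFailoverManager:
    """Illustrative manager: hands out profiles in order, skipping failed ones."""

    def __init__(self, profiles):
        self._profiles = list(profiles)
        self._failed = set()

    def get_next_profile(self):
        for profile in self._profiles:
            if profile.name not in self._failed:
                return profile
        return None  # every provider is currently marked as failed

    def mark_success(self, profile):
        self._failed.discard(profile.name)

    def mark_failure(self, profile, error, is_rate_limit=False):
        # A real manager would also start a cooldown here (assumption).
        self._failed.add(profile.name)


def call_with_failover(manager, send_request, max_retries=3):
    """Call send_request(profile) until it succeeds or no options remain."""
    last_error = None
    for _ in range(max_retries + 1):
        profile = manager.get_next_profile()
        if profile is None:
            break
        try:
            # send_request is expected to use the profile's api_key,
            # base_url, and model for this attempt.
            result = send_request(profile)
        except Exception as error:
            # Assumed heuristic for detecting rate limits.
            is_rate_limit = "rate limit" in str(error).lower()
            manager.mark_failure(profile, error, is_rate_limit=is_rate_limit)
            last_error = error
            continue
        manager.mark_success(profile)
        return result
    raise RuntimeError("all providers failed") from last_error
```

Passing the profile into `send_request` on every attempt is what lets the request parameters change when the manager switches providers.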
How It Works
| Component | Role |
|---|---|
| AuthProfile | Credentials for a single provider |
| FailoverManager | Orchestrates failover logic |
| FailoverConfig | Retry and backoff settings |
| ProviderStatus | Tracks provider health |
Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| max_retries | int | 3 | Maximum retry attempts |
| retry_delay | float | 1.0 | Initial retry delay |
| exponential_backoff | bool | True | Use exponential backoff |
| max_retry_delay | float | 60.0 | Maximum retry delay |
| cooldown_on_rate_limit | float | 60.0 | Rate limit cooldown (seconds) |
| cooldown_on_error | float | 30.0 | Error cooldown (seconds) |
| rotate_on_success | bool | False | Rotate profiles on success |
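
To make the interaction between `retry_delay`, `exponential_backoff`, and `max_retry_delay` concrete, here is a minimal sketch. It assumes the delay doubles on each attempt and is capped at `max_retry_delay`; the actual growth factor is not stated in this page. `FailoverConfigSketch` and `delay_for_attempt` are hypothetical names; the field names and defaults mirror the table above.

```python
from dataclasses import dataclass


@dataclass
class FailoverConfigSketch:
    """Hypothetical mirror of the options table, with the documented defaults."""
    max_retries: int = 3
    retry_delay: float = 1.0
    exponential_backoff: bool = True
    max_retry_delay: float = 60.0
    cooldown_on_rate_limit: float = 60.0
    cooldown_on_error: float = 30.0
    rotate_on_success: bool = False


def delay_for_attempt(config, attempt):
    """Return the wait before retry number `attempt` (0-based).

    Assumes doubling per attempt when exponential_backoff is enabled.
    """
    if config.exponential_backoff:
        delay = config.retry_delay * (2 ** attempt)
    else:
        delay = config.retry_delay
    return min(delay, config.max_retry_delay)


config = FailoverConfigSketch()
delays = [delay_for_attempt(config, n) for n in range(config.max_retries)]
print(delays)  # [1.0, 2.0, 4.0]
```

Under these assumptions the cap only starts to matter after several attempts: with the defaults, the uncapped delay first exceeds 60 seconds at attempt 6.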
Auth Profiles
Configure credentials for each provider:

| Field | Type | Description |
|---|---|---|
| name | str | Unique profile identifier |
| provider | str | Provider: openai, anthropic, etc. |
| api_key | str | API key (masked in logs) |
| base_url | str | Custom API endpoint |
| model | str | Default model for this profile |
| priority | int | Failover priority (lower = higher priority) |
| rate_limit_rpm | int | Requests per minute limit |
| rate_limit_tpm | int | Tokens per minute limit |
| metadata | dict | Additional provider-specific config |
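
As a rough illustration of how these fields fit together, the sketch below models a profile as a dataclass, orders profiles by `priority`, and masks the key for logging. `AuthProfileSketch`, `masked_key`, and `by_priority` are hypothetical names, and the default values and the masking scheme (last four characters visible) are assumptions; only the field names and the "lower = higher priority" rule come from the table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuthProfileSketch:
    """Hypothetical profile record using the field names from the table."""
    name: str
    provider: str
    api_key: str
    base_url: Optional[str] = None
    model: Optional[str] = None
    priority: int = 0  # lower value = tried earlier
    rate_limit_rpm: Optional[int] = None
    rate_limit_tpm: Optional[int] = None
    metadata: dict = field(default_factory=dict)

    def masked_key(self):
        """Hide all but the last four characters (assumed masking scheme)."""
        if len(self.api_key) <= 4:
            return "*" * len(self.api_key)
        return "*" * (len(self.api_key) - 4) + self.api_key[-4:]


def by_priority(profiles):
    """Order profiles so the lowest priority number comes first."""
    return sorted(profiles, key=lambda p: p.priority)


profiles = [
    AuthProfileSketch("fallback", "anthropic", "sk-ant-0000wxyz", priority=2),
    AuthProfileSketch("main", "openai", "sk-abcd1234", priority=1),
]
print([p.name for p in by_priority(profiles)])  # ['main', 'fallback']
```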
Common Patterns
- Multi-Provider
- Cost Optimization
- Regional Failover
Failover Callbacks
React to failover events:
Provider Status
Monitor provider health:
Best Practices
Configure multiple providers
Always have at least 2-3 providers configured. This ensures availability even during major outages.
Use exponential backoff
Enable `exponential_backoff=True` to avoid hammering providers during issues. This helps you stay within rate limits.
Set appropriate priorities
Order providers by cost and reliability. Put cheaper/faster providers first, with premium providers as fallback.
Monitor failover events
Use the `on_failover` callback to track when failovers occur. This helps identify provider issues early.
Related
Tool Circuit Breaker
Automatic tool failure protection
Models
Supported LLM providers

