Celery Production Best Practices
Upgrade guardrails, monitoring tips, and regression tests distilled from real incident playbooks.
Recommended guardrails
- Add regression tests that replay the minimal reproduction from each incident.
- Pin dependency versions in production and upgrade with canary validation.
- Monitor 4xx/5xx spikes after upgrades; alert on schema/validation failures.
- Introduce timeouts and retries with exponential backoff for external calls.
- Capture and log raw error messages to enable exact search reproduction.