The Fix
pip install celery==5.3.0a1
Based on closed celery/celery issue #7358; fix in PR #7544
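Production note: The fix changes the worker's exit status on failure, so anything that keys off that status (restart policies, health checks, CI steps) may behave differently. Treat it as a side-effect risk until you can verify behavior with a canary and real traffic.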
@@ -351,7 +351,7 @@ def worker(ctx, hostname=None, pool_cls=None, app=None, uid=None, gid=None,
             **kwargs)
         worker.start()
-        return worker.exitcode
+        ctx.exit(worker.exitcode)
     except SecurityError as e:
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Option A: Upgrade to fixed release
pip install celery==5.3.0a1
When NOT to use: Do not use if it changes public behavior or if the failure cannot be reproduced.
Why This Fix Works in Production
- Trigger: [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
- Mechanism: The worker command returned worker.exitcode from its Click callback instead of passing it to ctx.exit(), so the process exited with status 0 even when the worker had died.
- Why the fix works: ctx.exit(worker.exitcode) hands the worker's exit code to Click, so the process exits non-zero when the worker fails (first fixed release: 5.3.0a1).
- If left unfixed, process supervisors and CI jobs that rely on the exit status can treat a crashed worker as a clean shutdown, so the failure goes unnoticed.
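The mechanism can be illustrated without Celery. In Click's default standalone mode, the value a command callback returns is not used as the process exit status. A plain Python script behaves the same way when its entry function returns a code that nobody passes on. The sketch below is an analogy using only the standard library; the two script strings and the `run_snippet` helper are hypothetical and are not Celery or Click code.

```python
import subprocess
import sys

# Hypothetical stand-in for the old behaviour: the entry point returns the
# code, but nothing hands it to the interpreter, so the process exits with 0.
RETURN_ONLY = """
def main():
    return 1  # failure code is computed...

main()        # ...and then discarded
"""

# Hypothetical stand-in for the fixed behaviour: the code is passed to an
# explicit exit call, which is the role ctx.exit() plays in the patch.
EXPLICIT_EXIT = """
import sys

def main():
    return 1

sys.exit(main())
"""


def run_snippet(source: str) -> int:
    """Run a Python snippet in a child process and return its exit status."""
    completed = subprocess.run([sys.executable, "-c", source])
    return completed.returncode


if __name__ == "__main__":
    print("return only:  ", run_snippet(RETURN_ONLY))
    print("explicit exit:", run_snippet(EXPLICIT_EXIT))
```

The first snippet mirrors `return worker.exitcode`, the second mirrors `ctx.exit(worker.exitcode)`.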
Why This Breaks in Prod
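- Reported on Python 3.9.10 with Celery 5.2.3 on macOS (Darwin), per the environment report in the issue.
- The worker command returned its exit code instead of passing it to ctx.exit(), so the process exited with status 0 even after an unrecoverable error.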
- Surfaces as: [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
Proof / Evidence
- GitHub issue: #7358
- Fix PR: https://github.com/celery/celery/pull/7544
- First fixed release: 5.3.0a1
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.85
- Did this fix it?: Yes (upstream fix exists)
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“Hey @palfrey :wave:, Thank you for opening an issue”
“are you interested to come up with a possible draft fix?”
“> are you interested to come up with a possible draft fix? Yes, sorry just took quite a while to deal with some internal work…”
Failure Signature (Search String)
- [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
Error Message
Stack trace
[2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
Traceback (most recent call last):
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/bootsteps.py", line 365, in start
return self.obj.start()
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 326, in start
blueprint.start(self)
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/consumer/connection.py", line 21, in start
c.connection = c.connect()
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 422, in connect
conn = self.connection_for_read(heartbeat=self.amqheartbeat)
File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/c
... (truncated) ...
Minimal Reproduction
software -> celery:5.2.3 (dawn-chorus) kombu:5.2.3 py:3.9.10
billiard:3.6.4.0 py-amqp:5.0.9
platform -> system:Darwin arch:64bit
kernel version:20.6.0 imp:CPython
loader -> celery.loaders.app.AppLoader
settings -> transport:amqp results:django-db
Environment
- Python: 3.9
What Broke
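The worker hits an unrecoverable error (here, a refused broker connection) and terminates, but the process exits with status 0. Nothing signals the failure to whatever launched the worker, which can lead to silent job losses.
Why It Broke
The worker command returned worker.exitcode from its Click callback rather than calling ctx.exit(worker.exitcode), so a non-zero worker exit code never became the process exit status.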
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install celery==5.3.0a1
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Fix reference: https://github.com/celery/celery/pull/7544
First fixed release: 5.3.0a1
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
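- Do not use if anything in your deployment depends on the worker exiting with status 0 after a failure, or if you cannot reproduce the failure to confirm the fix. Note that 5.3.0a1 is an alpha pre-release; a later stable 5.3.x release may suit production better.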
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
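A small helper can make the before/after comparison explicit. This is a sketch using only the standard library; `exit_status` and `describe` are hypothetical names, and the worker command you pass in depends on your project. On the broken version the expected status is 0 despite the CRITICAL log line; on a fixed version it should be non-zero.

```python
import subprocess
from typing import Sequence


def exit_status(command: Sequence[str], timeout: float = 60.0) -> int:
    """Run a command to completion and return its exit status.

    Raises subprocess.TimeoutExpired if the command is still running after
    `timeout` seconds, which here would mean the worker never exited.
    """
    completed = subprocess.run(list(command), timeout=timeout)
    return completed.returncode


def describe(status: int) -> str:
    """Summarise what an exit status means for this particular bug."""
    if status == 0:
        return "exit 0: if the log shows an unrecoverable error, it was masked"
    return f"exit {status}: failure is visible to the caller"
```

For example, call `describe(exit_status(["celery", "-A", "proj", "worker"]))` with the broker deliberately unreachable, where `proj` is a placeholder app name.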
Prevention
- Do not rely on the process exit status alone: alert when a worker disappears (for example, when a queue has no consumers) so that a silent exit is still noticed.
- Treat the "Unrecoverable error" CRITICAL log line as a failure signal in its own right, regardless of the exit code.
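On versions without the fix, one possible guard is a wrapper that scans the worker's output for the failure signature and overrides a zero exit status. The sketch below assumes the "Unrecoverable error" text from the log line quoted in this article is a reliable marker; `effective_status` and `run_guarded` are hypothetical helpers, not part of Celery. It buffers all output until the command exits, so it fits short diagnostic runs better than long-lived workers.

```python
import subprocess
import sys
from typing import Sequence

# Failure signature taken from the log line quoted in this article.
FAILURE_MARKER = "Unrecoverable error"


def effective_status(returncode: int, output: str, fallback: int = 1) -> int:
    """Return a non-zero status if the output contains the failure marker.

    A non-zero return code is passed through unchanged. A zero return code is
    replaced by `fallback` when the marker appears in the captured output.
    """
    if returncode != 0:
        return returncode
    return fallback if FAILURE_MARKER in output else 0


def run_guarded(command: Sequence[str]) -> int:
    """Run a command, echo its output, and return the corrected status."""
    completed = subprocess.run(
        list(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so log lines are captured too
        text=True,
    )
    sys.stdout.write(completed.stdout)
    return effective_status(completed.returncode, completed.stdout)
```

A launcher script could then end with `sys.exit(run_guarded([...]))` so that a supervisor sees the corrected status.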
Version Compatibility Table
| Version | Status |
|---|---|
| 5.3.0a1 | Fixed |
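To check whether an environment already carries the fix, the installed version can be compared against the first fixed release. The sketch below rests on an assumption: since 5.3.0a1 is the first fixed release, every release in the 5.3 series or later is taken to include the fix, so only (major, minor) is compared. The helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# First fixed release according to the table above is 5.3.0a1. Assumption:
# every release in the 5.3 series or later carries the fix, so comparing
# (major, minor) is enough and pre-release suffixes can be ignored.
FIRST_FIXED = (5, 3)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '5.3.0a1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_exit_code_fix(version: str) -> bool:
    """Return True if the version is in the 5.3 series or later."""
    return major_minor(version) >= FIRST_FIXED


def installed_celery_version() -> Optional[str]:
    """Return the installed Celery version, or None if it is not installed."""
    try:
        return metadata.version("celery")
    except metadata.PackageNotFoundError:
        return None
```

`has_exit_code_fix("5.2.3")` is False for the version in the original report, and `has_exit_code_fix("5.3.0a1")` is True.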
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.