The Fix

pip install celery==5.3.0a1

Based on closed celery/celery issue #7358 · PR/commit linked

Production note: This shows up when the worker hits an unrecoverable error, such as a refused broker connection. Treat the fix as unverified in your environment until you have confirmed the exit-code behavior with a canary and real traffic.

@@ -351,7 +351,7 @@ def worker(ctx, hostname=None, pool_cls=None, app=None, uid=None, gid=None, **kwargs)
         worker.start()
-        return worker.exitcode
+        ctx.exit(worker.exitcode)
     except SecurityError as e:

Why This Fix Works in Production

  • Trigger: [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
  • Mechanism: The worker command returned worker.exitcode instead of exiting with it, so the process reported exit code 0 even when the worker had died with a non-zero exit code.
  • Why the fix works: The command now calls ctx.exit(worker.exitcode), so a worker that dies on an unrecoverable error exits with its non-zero code instead of 0 (first fixed release: 5.3.0a1).
Production impact:
  • If left unfixed, anything that relies on the exit status (process supervisors, container orchestrators, deploy scripts) sees a clean exit and may neither restart the dead worker nor raise an alert.
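The mechanism can be illustrated with a standard-library analogy. This is not Celery's or Click's actual code: it only shows that a status returned from an entry-point function is discarded unless it is handed to an explicit exit call, which is the role ctx.exit plays in the patch. The snippet names and the exit_status helper are illustrative assumptions.

```python
import subprocess
import sys

# Entry point that only *returns* its status: the value is discarded.
RETURNS_STATUS = """
def main():
    return 1  # pretend the worker died

main()
"""

# Entry point that *exits* with its status.
EXITS_WITH_STATUS = """
import sys

def main():
    return 1  # pretend the worker died

sys.exit(main())
"""


def exit_status(source: str) -> int:
    """Run a Python snippet in a child interpreter and return its exit status."""
    return subprocess.run([sys.executable, "-c", source]).returncode


print(exit_status(RETURNS_STATUS))     # 0: the failure is invisible to the caller
print(exit_status(EXITS_WITH_STATUS))  # 1: the failure is reported
```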

Why This Breaks in Prod

  • Shows up under Python 3.9 in real deployments (not just unit tests).
  • The worker process exits with code 0 even when the worker itself failed with a non-zero exit code.
  • Surfaces as: [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')

Proof / Evidence

  • GitHub issue: #7358
  • Fix PR: https://github.com/celery/celery/pull/7544
  • First fixed release: 5.3.0a1
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-09
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“Hey @palfrey :wave:, Thank you for opening an issue”
@open-collective-bot · 2022-03-18 · source
“are you interested to come up with a possible draft fix?”
@auvipy · 2022-03-22 · source
“> are you interested to come up with a possible draft fix? Yes, sorry just took quite a while to deal with some internal work…”
@palfrey · 2022-06-01 · source

Failure Signature (Search String)

  • [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
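On versions without the fix, the exit status cannot be trusted, so scanning worker logs for this signature is one possible stopgap. The sketch below is a minimal example that assumes the log line format shown above; the pattern and the find_unrecoverable helper are illustrative and are not part of Celery.

```python
import re

# Matches lines such as:
# [2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ...
UNRECOVERABLE = re.compile(
    r"\[(?P<timestamp>[^\]]+?): CRITICAL/MainProcess\] "
    r"Unrecoverable error: (?P<error>.+)"
)


def find_unrecoverable(log_text: str) -> list[tuple[str, str]]:
    """Return (timestamp, error) pairs for every unrecoverable-error line."""
    return [
        (match.group("timestamp"), match.group("error"))
        for match in UNRECOVERABLE.finditer(log_text)
    ]
```

Calling find_unrecoverable on the contents of a worker log returns an empty list when the signature is absent, which makes it easy to wire into an alerting check.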

Error Message

Stack trace
error.txt
[2022-03-18 17:33:30,670: CRITICAL/MainProcess] Unrecoverable error: ConnectionRefusedError(61, 'Connection refused')
Traceback (most recent call last):
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/worker.py", line 203, in start
    self.blueprint.start(self)
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/bootsteps.py", line 365, in start
    return self.obj.start()
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 326, in start
    blueprint.start(self)
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/consumer/connection.py", line 21, in start
    c.connection = c.connect()
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 422, in connect
    conn = self.connection_for_read(heartbeat=self.amqheartbeat)
  File "/Users/tomparker-shemilt/.virtualenvs/case-cards/lib/python3.9/site-packages/c ... (truncated) ...

Minimal Reproduction

No standalone reproduction script is included here. The environment reported in the issue, in the format printed by celery report, was:

software -> celery:5.2.3 (dawn-chorus) kombu:5.2.3 py:3.9.10
            billiard:3.6.4.0 py-amqp:5.0.9
platform -> system:Darwin arch:64bit
            kernel version:20.6.0 imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:amqp results:django-db

Environment

  • Python: 3.9

What Broke

Workers terminate unexpectedly without indicating failure, leading to silent job losses.
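To see why a zero exit code turns a crash into a silent loss, consider a minimal supervisor sketch. It is a hypothetical stand-in for whatever process manager runs your workers, not a real tool's behavior: like many restart-on-failure policies, it restarts only on a non-zero status, so a worker that dies with exit code 0 is never restarted.

```python
import subprocess


def supervise(command: list[str], max_restarts: int = 3) -> int:
    """Run command, restarting it after each failure.

    Returns the number of restarts performed. A zero exit status is
    treated as an intentional shutdown and ends supervision immediately.
    """
    restarts = 0
    while True:
        status = subprocess.run(command).returncode
        if status == 0:
            # Indistinguishable from a clean stop: no restart, no alert.
            return restarts
        if restarts == max_restarts:
            # Give up after the configured number of restarts.
            return restarts
        restarts += 1
```

With the unfixed worker, a crash takes the status == 0 branch, so the supervisor stops without restarting anything.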

Why It Broke

The worker command returned worker.exitcode to its caller rather than exiting with it, so the non-zero exit code never reached the operating system and the process exited with 0.

Fix Options (Details)

Option A: Upgrade to fixed release (safe default, recommended)

pip install celery==5.3.0a1

When NOT to use: Do not use if the upgrade changes behavior you depend on (5.3.0a1 is an alpha pre-release), or if you cannot reproduce the failure on your current version.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.

Fix reference: https://github.com/celery/celery/pull/7544

First fixed release: 5.3.0a1

Last verified: 2026-02-09. Validate in your environment.

Verify Fix

verify
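Reproduce the failure on your current version and record the worker's exit status (0 indicates the bug). Then install the fixed release, reproduce the failure again, and confirm that the exit status is now non-zero.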
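The check can be scripted. The sketch below is a minimal example under stated assumptions: the helper names are illustrative, and the command you pass must be one that actually triggers the unrecoverable error in your environment (for example, a worker started while its dependencies are unreachable).

```python
import subprocess


def worker_exit_status(command: list[str], timeout: float = 120) -> int:
    """Run a worker command until it exits and return its exit status.

    Raises subprocess.TimeoutExpired if the worker is still running after
    `timeout` seconds, which means the failure was not reproduced.
    """
    return subprocess.run(command, timeout=timeout).returncode


def failure_is_reported(command: list[str], timeout: float = 120) -> bool:
    """True if the worker died with a non-zero status (the fixed behavior)."""
    return worker_exit_status(command, timeout) != 0
```

For instance, failure_is_reported(["celery", "-A", "proj", "worker"]), where proj is a hypothetical application name, should return False on the broken version and True once the fix is installed.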

Prevention

  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.

Version Compatibility Table

Version    Status
5.3.0a1    Fixed

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.