The Fix
pip install celery==5.6.1
Based on closed celery/celery issue #5998. Fix PR: https://github.com/celery/celery/pull/9986
Production note: This shows up when a worker shuts down gracefully while a task is still running, and the affected task can be redelivered. Treat it as a side-effect risk until you can verify behavior with a canary and real traffic.
@@ -3,11 +3,14 @@
"""
import os
+import threading
+import time
Option A: Upgrade to the fixed release
pip install celery==5.6.1
Option C: Workaround
Disable broker heartbeats with broker_heartbeat = 0, as suggested in the issue thread.
When NOT to use: Disabling heartbeats can lead to stale connections and message loss.
Why This Fix Works in Production
- Trigger: [2020-03-27 22:08:21,036.36] INFO [8928 139780993414976] [celery.app.trace:124] Task…
- Mechanism: The event loop exits before the TaskPool can drain tasks during graceful shutdown, so heartbeat timers stop firing while tasks are still running.
- Why the fix works: The event loop keeps firing timers while the task pool drains, so broker heartbeats are still sent during graceful shutdown (first fixed release: 5.6.1).
- If left unfixed, RabbitMQ can drop the silent connection before a finishing task is acknowledged, and the task is then redelivered.
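The mechanism above can be illustrated with a toy simulation. This is only a sketch and not Celery's shutdown code: the function name, its parameters, and the rule that the broker drops a connection once it has been silent for longer than the heartbeat timeout are all simplifying assumptions made for illustration. The numbers mirror the reproduction (a 30 s task, broker_heartbeat=10).

```python
# Toy model only: NOT Celery's implementation. All names are hypothetical.

def drain_pool(task_seconds, heartbeat_timeout, fire_timers_while_draining, tick=1):
    """Simulate waiting for a running task to finish during graceful shutdown.

    Returns True if the simulated broker connection survives until the task
    has finished and could be acknowledged.
    """
    last_heartbeat = 0  # simulated clock time of the last heartbeat sent
    now = 0
    while now < task_seconds:
        now += tick
        # With the fix, timers keep firing, so a heartbeat goes out regularly.
        if fire_timers_while_draining and now - last_heartbeat >= heartbeat_timeout / 2:
            last_heartbeat = now
        # Assumption: the broker gives up on a connection that stays silent
        # for longer than the heartbeat timeout.
        if now - last_heartbeat > heartbeat_timeout:
            return False
    return True


before_fix = drain_pool(30, 10, fire_timers_while_draining=False)
after_fix = drain_pool(30, 10, fire_timers_while_draining=True)
print(before_fix, after_fix)
```

In this model the connection is lost when timers stop during the drain and survives when they keep firing, which matches the ack failure seen in the captured log.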
Why This Breaks in Prod
- Shows up under Python 3.6 in real deployments (not just unit tests).
- The event loop exits before the TaskPool can drain tasks during graceful shutdown, so no heartbeats are sent while the last tasks finish.
- Surfaces as: [2020-03-27 22:08:21,036.36] INFO [8928 139780993414976] [celery.app.trace:124] Task yahoo.contrib.yxs2.tasks.job_queue_decider[3212ba65-eaec-4e4b-96f1-b8317bb098f7] succeeded in…
Proof / Evidence
- GitHub issue: #5998
- Fix PR: https://github.com/celery/celery/pull/9986
- First fixed release: 5.6.1
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.85
- Did this fix it?: Yes (upstream fix exists)
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“I can reproduce this with: [...] and [...] We are also forced to disable heartbeats due to this which is not ideal.”
“This is really bad. Are there any plans to work on this issue?”
“can you try celery master? would be happy to discuss in a PR”
“@hmatland we team hit same issue, just disable the heatbeart can workaround this issue. pls try. broker_heartbeat = 0 https://www.rabbitmq.com/heartbeats.html”
Failure Signature (Search String)
- [2020-03-27 22:08:21,036.36] INFO [8928 139780993414976] [celery.app.trace:124] Task yahoo.contrib.yxs2.tasks.job_queue_decider[3212ba65-eaec-4e4b-96f1-b8317bb098f7] succeeded in
Error Message
Stack trace
[2020-03-27 22:08:21,036.36] INFO [8928 139780993414976] [celery.app.trace:124] Task yahoo.contrib.yxs2.tasks.job_queue_decider[3212ba65-eaec-4e4b-96f1-b8317bb098f7] succeeded in 460.4728824859776s: None
[2020-03-27 22:08:22,142.142] DEBUG [8901 139780993414976] [celery.bootsteps:262] | Worker: Stopping Hub...
[2020-03-27 22:08:22,142.142] CRITICAL [8901 139780993414976] [celery.worker.request:134] Couldn't ack 3, reason:ConnectionResetError(104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/worker/worker.py", line 205, in start
    self.blueprint.start(self)
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/bootsteps.py", line 369, in start
    return self.obj.start()
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 596, in start
    c.loop(*c.loop_args())
  File "/home/ec2-user/yxs2/lib/python3.6/site-packages/celery/worker/loops.py", line 91, in asynloop
    next(l
... (truncated) ...
Minimal Reproduction
from time import sleep
from celery import Celery
app = Celery('tasks', broker='pyamqp://rabbitmq:rabbitmq@localhost//')
app.conf.update(
    result_backend='redis://localhost',
    task_soft_time_limit=3600,
    task_time_limit=4000,
    task_acks_late=True,
    task_acks_on_failure_or_timeout=True,
    worker_max_tasks_per_child=1,
    worker_concurrency=1,
    worker_prefetch_multiplier=1,
    broker_heartbeat=10,
    task_default_queue="heartbeat-testing"
)
@app.task(name='heartbeat-test-task')
def add(x, y):
    # Runs for 30 s, three times the broker_heartbeat value of 10 set above.
    sleep(30)
    return x + y
Environment
- Python: 3.6
What Broke
Tasks were redelivered due to RabbitMQ connection drops during shutdown.
Why It Broke
The event loop exits before the TaskPool can drain tasks during graceful shutdown, so broker heartbeats are not sent while the remaining tasks finish.
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install celery==5.6.1
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
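Before and after upgrading, it can help to confirm which Celery version a given environment actually has. The sketch below uses only the standard library; the helper names are hypothetical, and the simple parser treats pre-release suffixes conservatively, so it is a rough check rather than a full version comparison.

```python
from importlib import metadata

FIXED = (5, 6, 1)  # first fixed release named in this article


def parse_version(text):
    """Turn '5.6.1' into (5, 6, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(version_text):
    """Return True when the version is at or above the first fixed release."""
    return parse_version(version_text) >= FIXED


def installed_celery_has_fix():
    """Return True/False, or None when Celery is not installed here."""
    try:
        return has_fix(metadata.version("celery"))
    except metadata.PackageNotFoundError:
        return None


print(installed_celery_has_fix())
```

Running this inside the worker's own virtualenv or container matters, since that is the interpreter whose installed version decides whether the fix is present.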
Option C: Temporary workaround
Disable broker heartbeats by setting broker_heartbeat = 0, as suggested in the issue thread.
Use only if you cannot change versions today. Treat this as a stopgap and remove once upgraded.
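The workaround quoted in the issue thread is the single setting broker_heartbeat = 0. A minimal sketch of how that might look in a configuration module follows; the file name celeryconfig.py and the choice to repeat two settings from the reproduction are assumptions for illustration.

```python
# celeryconfig.py style module (hypothetical file name).

# Temporary workaround from the issue thread: turn broker heartbeats off.
# Remove this line again once the worker runs a fixed Celery release.
broker_heartbeat = 0

# Assumption: the other settings stay as in the reproduction.
task_acks_late = True
worker_prefetch_multiplier = 1
```

A module like this is typically loaded with app.config_from_object("celeryconfig"). With heartbeats off, dead connections are detected more slowly, which is why this should stay a stopgap.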
Fix reference: https://github.com/celery/celery/pull/9986
First fixed release: 5.6.1
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- Do not keep the workaround in place long-term: disabling heartbeats can lead to stale connections and message loss.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
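One way to compare the two runs is to scan the worker log for the ack failure shown in the captured log. The sketch below is an assumption about how to check, not an official procedure: the pattern is built from the CRITICAL line in that log, and the function name is hypothetical.

```python
import re

# Pattern built from the CRITICAL line in the captured worker log.
ACK_FAILURE = re.compile(r"Couldn't ack \d+, reason:ConnectionResetError")


def find_ack_failures(lines):
    """Return the log lines that show a failed ack after a connection reset."""
    return [line for line in lines if ACK_FAILURE.search(line)]


sample = [
    "[2020-03-27 22:08:22,142.142] DEBUG [8901 139780993414976] "
    "[celery.bootsteps:262] | Worker: Stopping Hub...",
    "[2020-03-27 22:08:22,142.142] CRITICAL [8901 139780993414976] "
    "[celery.worker.request:134] Couldn't ack 3, "
    "reason:ConnectionResetError(104, 'Connection reset by peer')",
]
hits = find_ack_failures(sample)
print(len(hits))
```

If the broken version produces matches during shutdown and the upgraded worker produces none under the same steps, that is reasonable evidence the fix applies to your setup.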
Prevention
- Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
- Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
Version Compatibility Table
| Version | Status |
|---|---|
| 5.6.1 | Fixed |
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.