The Fix
pip install celery==4.4.0rc5
Based on closed celery/celery issue #4457 · PR/commit linked
Production note: This usually shows up under retries/timeouts. Treat it as a side-effect risk until you can verify behavior with a canary + real traffic.
@@ -254,6 +254,7 @@ Andrew Wong, 2017/09/07
Tobias 'rixx' Kunze, 2017/08/20
Mikhail Wolfson, 2017/12/11
+Matt Davis, 2017/12/13
Alex Garel, 2018/01/04
Régis Behmo 2018/01/20
[user] celery.worker.consumer.consumer WARNING 2017-12-18 00:38:27,078 consumer:
Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 320, in start
blueprint.start(self)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 596, in start
c.loop(*c.loop_args())
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/loops.py", line 47, in asynloop
obj.controller.register_with_event_loop(hub)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/worker.py", line 217, in register_with_event_loop
description='hub.register',
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/bootsteps.py", line 151, in send_all
fun(parent, *args)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/components.py", line 178, in register_with_event_loop
w.pool.register_with_event_loop(hub)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/concurrency/prefork.py", line 134, in register_with_event_loop
return reg(loop)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/concurrency/asynpool.py", line 476, in register_with_event_loop
for fd in self._fileno_to_outq]
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/concurrency/asynpool.py", line 476, in <listcomp>
for fd in self._fileno_to_outq]
File "/home/user/.envs/user/lib/python3.6/site-packages/kombu/async/hub.py", line 207, in add_reader
return self.add(fds, callback, READ | ERR, args)
File "/home/user/.envs/user/lib/python3.6/site-packages/kombu/async/hub.py", line 158, in add
self.poller.register(fd, flags)
File "/home/user/.envs/user/lib/python3.6/site-packages/kombu/utils/eventio.py", line 67, in register
self._epoll.register(fd, events)
OSError: [Errno 9] Bad file descriptor
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Option A: Upgrade to fixed release
pip install celery==4.4.0rc5
When NOT to use: Do not apply this fix if using a broker other than RabbitMQ or Redis.
Why This Fix Works in Production
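- Trigger: the worker logs "consumer: Connection to broker lost. Trying to re-establish the connection..." and the reconnect attempt then fails with OSError: [Errno 9] Bad file descriptor.
- Mechanism: while re-establishing the broker connection, the worker re-registers the prefork pool's file descriptors with the event loop. Per the traceback, epoll.register() rejects one of them as a bad file descriptor.
- Why the fix works: the upstream change adds a test case that reproduces the lost-connection scenario and provides a potential fix for the OSError: [Errno 9] Bad file descriptor (first fixed release: 4.4.0rc5).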
- If left unfixed, retry loops can amplify load and turn a small outage into a cascade (thundering herd).
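One common way to soften the retry amplification described in the last bullet is exponential backoff with jitter. The sketch below is illustrative only: `backoff_delays` and its parameters are hypothetical names, not part of Celery or of the upstream fix, and the base and cap values are assumptions to tune per deployment.

```python
import random


def backoff_delays(attempts, base=1.0, cap=60.0, rng=None):
    """Return one randomized delay (seconds) per reconnect attempt.

    Uses "full jitter": each delay is drawn uniformly from
    [0, min(cap, base * 2**attempt)], so workers that lost the broker at the
    same moment do not all reconnect at the same moment.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(6), start=1):
        print(f"attempt {attempt}: wait {delay:.2f}s before reconnecting")
```

Randomizing the whole delay, rather than adding a small random offset, spreads reconnects most widely at the cost of occasionally retrying very quickly.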
Why This Breaks in Prod
- Shows up under Python 3.6.1 in real deployments (not just unit tests).
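- The worker fails while re-establishing the broker connection: registering the pool's file descriptors with the event loop raises a bad file descriptor error.
- Surfaces as: "consumer: Connection to broker lost. Trying to re-establish the connection..." followed by a traceback ending in OSError: [Errno 9] Bad file descriptor.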
Proof / Evidence
- GitHub issue: #4457
- Fix PR: https://github.com/celery/celery/pull/5499
- First fixed release: 4.4.0rc5
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.85
- Did this fix it?: Yes (upstream fix exists)
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“FYI. In my case, I used chain to run task2 after task1 finished. When the return value of task1 is too large (~100M) and large…”
“I work on an open source project that uses celery and is encountering this bug also, though we're not using redis as a result backend”
“could you both try celery 4.2rc4 and see if it still persist?”
“I upgrade to celery 4.2.0,still have the same issue”
Failure Signature (Search String)
- consumer: Connection to broker lost. Trying to re-establish the connection...
- OSError: [Errno 9] Bad file descriptor
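To check whether existing worker logs contain this signature, a small scanner can pair each reconnect warning with the error that follows it. This is a minimal sketch: the two marker strings are copied from the traceback quoted in this article, while `find_ebadf_reconnects` is a hypothetical helper name and the line-oriented log layout is an assumption.

```python
import re

# Both strings are copied from the traceback quoted in this article.
RECONNECT_MARKER = "Connection to broker lost. Trying to re-establish the connection..."
EBADF_PATTERN = re.compile(r"OSError: \[Errno 9\] Bad file descriptor")


def find_ebadf_reconnects(log_text):
    """Return 1-based line numbers of reconnect warnings followed by EBADF.

    A reconnect warning "owns" every line up to the next reconnect warning, so
    a warning only counts if the OSError appears before the next warning.
    """
    hits = []
    current = None  # line number of the reconnect warning being scanned
    for number, line in enumerate(log_text.splitlines(), start=1):
        if RECONNECT_MARKER in line:
            current = number
        elif current is not None and EBADF_PATTERN.search(line):
            hits.append(current)
            current = None
    return hits
```

Matching on the message text rather than the full log line keeps the search independent of timestamps and logger prefixes, which differ between the two traces in this article.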
Error Message
Stack trace
[user] celery.worker.consumer.consumer WARNING 2017-12-18 00:38:27,078 consumer:
Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 320, in start
blueprint.start(self)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 596, in start
c.loop(*c.loop_args())
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/loops.py", line 47, in asynloop
obj.controller.register_with_event_loop(hub)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/worker.py", line 217, in register_with_event_loop
description='hub.register',
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/bootsteps.py", line 151, in send_all
fun(parent, *args)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/worker/components.py", line 178, in register_with_event_loop
w.pool.register_with_event_loop(hub)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery/concurrency/prefork.py", line 134, in register_with_event_loop
return reg(loop)
File "/home/user/.envs/user/lib/python3.6/site-packages/celery
... (truncated) ...
Stack trace
[2018-10-10 12:25:27,607: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 317, in start
blueprint.start(self)
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 593, in start
c.loop(*c.loop_args())
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/celery/worker/loops.py", line 91, in asynloop
next(loop)
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/kombu/asynchronous/hub.py", line 354, in create_loop
cb(*cbargs)
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/kombu/transport/base.py", line 236, in on_readable
reader(loop)
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/kombu/transport/base.py", line 218, in _read
drain_events(timeout=0)
File "/opt/pyenv/versions/3.6.6/envs/remains/lib/python3.6/site-packages/amqp/connection.py", line 491, in drain_events
while not self.blocking_read(timeout):
File "/opt/pyenv/versions/3.6.6/envs/remains/lib
... (truncated) ...
Minimal Reproduction
This page provides no runnable script; the reproduction is the failure itself. On an affected version, a worker that loses its broker connection can log the traceback shown at the top of this page, which ends in:
OSError: [Errno 9] Bad file descriptor
Environment
- Python: 3.6.1
What Broke
Workers lose connection to the broker, causing task processing to halt until manually restarted.
Why It Broke
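After the broker connection is lost, the worker tries to re-establish it and re-registers the prefork pool's file descriptors with the event loop. Per the traceback, epoll.register() is handed a descriptor that is no longer valid and raises OSError: [Errno 9] Bad file descriptor.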
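The error class itself is easy to see outside Celery: any system call on a descriptor that has already been closed fails with errno 9 (EBADF). The sketch below only illustrates that failure mode. It is not what the upstream fix does, and `is_fd_open` and `drop_stale_fds` are hypothetical helper names.

```python
import errno
import os


def is_fd_open(fd):
    """Return True if fd refers to an open descriptor in this process."""
    try:
        os.fstat(fd)
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return False
        raise
    return True


def drop_stale_fds(fds):
    """Keep only descriptors that are still open, preserving order."""
    return [fd for fd in fds if is_fd_open(fd)]


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate a pipe end that was closed elsewhere
    try:
        print(drop_stale_fds([read_fd, write_fd]) == [write_fd])
    finally:
        os.close(write_fd)
```

A check like this is inherently racy, because a descriptor can be closed or its number reused between the check and the later registration, so it is a diagnostic aid rather than a fix.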
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install celery==4.4.0rc5
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Fix reference: https://github.com/celery/celery/pull/5499
First fixed release: 4.4.0rc5
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- Do not apply this fix if using a broker other than RabbitMQ or Redis.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Prevention
- Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
- Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
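The second bullet (attempt count plus reason, with an alert on spikes) could look roughly like the sketch below. `RetryTracker` and its threshold are hypothetical; a real deployment would more likely emit these counts to its existing metrics system than keep them in process memory.

```python
import collections
import logging

logger = logging.getLogger("retry_metrics")


class RetryTracker:
    """Count retry attempts per reason and flag spikes."""

    def __init__(self, alert_threshold=5):
        self.alert_threshold = alert_threshold
        self.counts = collections.Counter()

    def record(self, reason, attempt):
        """Record one retry; return True once the reason reaches the threshold."""
        self.counts[reason] += 1
        logger.warning("retry attempt=%d reason=%s", attempt, reason)
        return self.counts[reason] >= self.alert_threshold


if __name__ == "__main__":
    tracker = RetryTracker(alert_threshold=3)
    for attempt in range(1, 4):
        if tracker.record("EBADF on reconnect", attempt):
            print("alert: repeated reconnect failures")
```

Keying the counter by reason keeps a spike of one failure type, such as the bad file descriptor error here, from being hidden among routine retries.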
Version Compatibility Table
| Version | Status |
|---|---|
| 4.4.0rc5 | Fixed |
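To compare an installed version against the first fixed release in the table, release candidates have to sort before the final release they precede. The sketch below handles only `X.Y`, `X.Y.Z`, and an optional `rcN` suffix, which is an assumption that covers the versions mentioned in this article. `version_key` and `includes_fix` are hypothetical names, and a full PEP 440 parser is the safer choice in real tooling.

```python
import re

FIRST_FIXED = "4.4.0rc5"  # first fixed release named in this article

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:rc(\d+))?$")


def version_key(version):
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a tuple that sorts in release order."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, rc = match.groups()
    # A final release sorts after all of its release candidates.
    rc_rank = int(rc) if rc is not None else float("inf")
    return (int(major), int(minor), int(patch or 0), rc_rank)


def includes_fix(version, first_fixed=FIRST_FIXED):
    """Return True if `version` is at or after the first fixed release."""
    return version_key(version) >= version_key(first_fixed)
```

For example, `includes_fix("4.2.0")` is false, which is consistent with the thread report that 4.2.0 still showed the issue.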
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.