The Fix
pip install celery==5.3.1
Based on closed celery/celery issue #8310 · PR/commit linked
Production note: The crash affects workers that reach Redis through Sentinel when redis-py 4.5.5 is installed. Roll the fix out to a canary worker with real traffic before upgrading the whole fleet.
@@ -1 +1 @@
-redis>=4.5.2
+redis>=4.5.2,!=4.5.5
Why This Fix Works in Production
- Trigger: celery.worker - start - CRITICAL - Unrecoverable error: TypeError("SentinelManagedConnection.read_response() got an unexpected keyword argument…
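- Mechanism: redis-py 4.5.5 changed how connections are handled internally, but its Sentinel wrapper (SentinelManagedConnection) was not updated to match, so read_response() receives a disconnect_on_error keyword it does not accept and the worker dies with a TypeError.
- Why the fix works: Celery 5.3.1 changes its redis requirement from redis>=4.5.2 to redis>=4.5.2,!=4.5.5, so dependency resolution skips the broken redis-py release (first fixed release: 5.3.1).
- If left unfixed, any environment that resolves redis-py to 4.5.5 (for example, a fresh image build in production) can crash its workers, while environments still holding an older redis-py keep working.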
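The version constraint from the fix can be checked by hand. The following is a minimal sketch, assuming plain numeric X.Y.Z version strings; the helper names (parse_version, redis_py_allowed) are hypothetical and not part of Celery or redis-py. It only mirrors the rule redis>=4.5.2,!=4.5.5 and is not a general specifier parser.

```python
# Sketch of the rule "redis>=4.5.2,!=4.5.5" for plain X.Y.Z version strings.
# Helper names are hypothetical; pre-release suffixes are not handled.

MINIMUM = (4, 5, 2)
EXCLUDED = {(4, 5, 5)}


def parse_version(text):
    """Turn '4.5.5' into (4, 5, 5), padding missing components with zeros."""
    parts = [int(piece) for piece in text.split(".")[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def redis_py_allowed(version):
    """Return True if the version satisfies >=4.5.2 and is not 4.5.5."""
    parsed = parse_version(version)
    return parsed >= MINIMUM and parsed not in EXCLUDED


if __name__ == "__main__":
    for candidate in ("4.5.1", "4.5.4", "4.5.5", "4.6.0"):
        print(candidate, redis_py_allowed(candidate))
```

In a real project, a lock file or pip's own resolver is the better place to enforce this; the sketch is only meant to make the constraint concrete.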
Why This Breaks in Prod
- Shows up under Python 3.10 in real deployments (not just unit tests).
- Surfaces as: celery.worker - start - CRITICAL - Unrecoverable error: TypeError("SentinelManagedConnection.read_response() got an unexpected keyword argument 'disconnect_on_error'")
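The shape of this failure can be reproduced without Redis. The sketch below is a toy model, not redis-py's code: the class names, method bodies, and signatures are assumptions made for illustration. It shows how a caller that starts passing a new keyword argument breaks a subclass override that still has the old signature.

```python
# Toy model of the failure mode. These classes are hypothetical stand-ins,
# not the real redis-py implementation.


class Connection:
    # The base class gained a new keyword argument.
    def read_response(self, disable_decoding=False, *, disconnect_on_error=True):
        return "OK"


class SentinelLikeConnection(Connection):
    # The override was written against the old signature and never updated.
    def read_response(self, disable_decoding=False):
        return super().read_response(disable_decoding=disable_decoding)


def parse_response(connection):
    # The caller now always passes the new keyword.
    return connection.read_response(disconnect_on_error=False)


def try_parse(connection):
    """Return the response, or the TypeError text if the call fails."""
    try:
        return parse_response(connection)
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_parse(Connection()))
    print(try_parse(SentinelLikeConnection()))
```

The second call fails with an "unexpected keyword argument 'disconnect_on_error'" TypeError, which is the same kind of error as the one in the failure signature.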
Proof / Evidence
- GitHub issue: #8310
- Fix PR: https://github.com/celery/celery/pull/8317
- First fixed release: 5.3.1
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.95
- Did this fix it?: Yes (upstream fix exists)
- Own content ratio: 0.40
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“Found the problem: This was caused by https://github.com/redis/redis-py/commit/c0833f60a1d9ec85c589004aba6b6739e6298248 which changed how internally redis-py handles connection, unfortunately the Sentinel wrapper was not updated to reflect that…”
“> Our setup is Celery with Redis over Sentinel in a Kubernetes cluster”
“I should have mentioned redis-py. A hint on this would be useful, it took me almost a day to find the culprit. As I said,…”
Failure Signature (Search String)
- celery.worker - start - CRITICAL - Unrecoverable error: TypeError("SentinelManagedConnection.read_response() got an unexpected keyword argument 'disconnect_on_error'")
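To search worker logs for this signature, a small scanner is enough. This is a minimal sketch: the function name find_signature is hypothetical, and it assumes logs are available as an iterable of text lines. It matches only the stable part of the message so that differing log prefixes do not matter.

```python
# Scan log lines for the stable part of the failure signature.
# find_signature is a hypothetical helper name.

SIGNATURE = (
    "SentinelManagedConnection.read_response() got an unexpected "
    "keyword argument 'disconnect_on_error'"
)


def find_signature(lines):
    """Return the 1-based line numbers that contain the signature."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if SIGNATURE in line
    ]


if __name__ == "__main__":
    import sys

    # Usage: python scan.py < worker.log
    for number in find_signature(sys.stdin):
        print(f"signature found on line {number}")
```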
Error Message
Stack trace
celery.worker - start - CRITICAL - Unrecoverable error: TypeError("SentinelManagedConnection.read_response() got an unexpected keyword argument 'disconnect_on_error'")
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/opt/venv/lib/python3.10/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/opt/venv/lib/python3.10/site-packages/celery/bootsteps.py", line 365, in start
return self.obj.start()
File "/opt/venv/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 332, in start
blueprint.start(self)
File "/opt/venv/lib/python3.10/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/opt/venv/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 628, in start
c.loop(*c.loop_args())
File "/opt/venv/lib/python3.10/site-packages/celery/worker/loops.py", line 97, in asynloop
next(loop)
File "/opt/venv/lib/python3.10/site-packages/kombu/asynchronous/hub.py", line 373, in create_loop
cb(*cbargs)
File "/opt/venv/lib/python3.10/site-packages/kombu/transport/redis.py", line 1336, in on_readable
self.cycle.on_readable(fileno)
File "/opt/venv/lib/python3.10/site-packages/kombu/transport/redis.py", line 566, in on_readable
chan.handlers[t
... (truncated) ...
Minimal Reproduction
celery -A uniweb report
software -> celery:5.2.7 (dawn-chorus) kombu:5.3.0 py:3.9.17
billiard:3.6.4.0 redis:4.5.4
platform -> system:Linux arch:64bit
kernel version:3.10.0-1160.88.1.el7.x86_64 imp:CPython
loader -> celery.loaders.app.AppLoader
settings -> transport:sentinel results:redis://:**@django-cms-test-redis-headless:6379/7
Environment
- Python: 3.10
What Broke
Celery workers abort without any message, leading to task processing failures.
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install celery==5.3.1
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Option D: Guard side-effects with OnceOnly (guardrail for side-effects)
Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.
- Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
- Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
- Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
import os

from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)


def process_event(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Safe to execute the side-effect exactly once.
    # handle_event is your own handler; it is not defined here.
    handle_event(event_id)
    return {"status": "processed"}


process_event("evt_...")  # replace with a real event id
Fix reference: https://github.com/celery/celery/pull/8317
First fixed release: 5.3.1
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- Do not use if it changes public behavior or if the failure cannot be reproduced.
- Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
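Before and after applying the fix, it helps to record which versions are actually installed in the worker's environment. The sketch below uses only the standard library (importlib.metadata); the function name installed_versions and the default package list are assumptions for illustration.

```python
# Report installed versions of the packages involved in this issue.
# installed_versions is a hypothetical helper name.
from importlib import metadata


def installed_versions(packages=("celery", "kombu", "redis")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the same virtualenv or container image as the worker, since a different environment may resolve redis-py to a different version.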
Prevention
- Pin redis-py in a lock or constraints file and exclude known-bad releases (for example, redis>=4.5.2,!=4.5.5) so a transitive upgrade cannot reach production untested.
- Alert on the failure signature string so Sentinel-related worker crashes are caught quickly.
Version Compatibility Table
| Version | Status |
|---|---|
| 5.3.1 | Fixed |
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.