The Fix
pip install celery==5.2.1
Based on closed celery/celery issue #6736; the fix PR and commit are linked below.
Production note: This usually shows up under retries/timeouts. Treat it as a side-effect risk until you can verify behavior with a canary + real traffic.
Why This Fix Works in Production
- Trigger: [2021-04-21 03:39:44,629: CRITICAL/MainProcess] Unrecoverable error: RecursionError('maximum recursion depth exceeded while calling a Python object')
- Mechanism: Task expiration handling was not timezone aware, causing recursion errors when tasks were revoked
- Why the fix works: it makes task expiration handling timezone aware, which removes the recursion triggered when expired tasks are revoked (first fixed release: 5.2.1).
- If left unfixed, failures can be intermittent under concurrency (hard to reproduce; shows up as sporadic 5xx/timeouts).
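A minimal sketch (not Celery's actual code) of why mixing naive and timezone-aware datetimes breaks an expiration check:

```python
from datetime import datetime, timezone

# A naive expiry timestamp (no tzinfo) vs. an aware "now":
naive_expires = datetime(2021, 4, 21, 3, 39, 44)
aware_now = datetime.now(timezone.utc)

try:
    aware_now > naive_expires
except TypeError as exc:
    print(f"comparison failed: {exc}")

# Timezone-aware handling normalizes both sides before comparing:
aware_expires = naive_expires.replace(tzinfo=timezone.utc)
print(aware_now > aware_expires)  # True: the 2021 deadline has passed
```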
Why This Breaks in Prod
- Reported under Python 3.8 in a real deployment, not just in unit tests.
- Task expiration handling was not timezone aware, causing recursion errors when tasks were revoked
- Surfaces as: [2021-04-21 03:39:44,629: CRITICAL/MainProcess] Unrecoverable error: RecursionError('maximum recursion depth exceeded while calling a Python object')
Proof / Evidence
- GitHub issue: #6736
- Fix PR: https://github.com/celery/celery/pull/7065
- First fixed release: 5.2.1
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.75
- Did this fix it?: Yes (upstream fix exists)
- Own content ratio: 0.22
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“@auvipy @Elias-SLH we are using the fix in production for almost a year and never seen the issue again. If it still exists (and it…”
“the issue is still there, but I don't have bandwidth to provide a reproducer since I have this workaround”
“Hey @nicolaerosia :wave:, Thank you for opening an issue”
Failure Signature (Search String)
- [2021-04-21 03:39:44,629: CRITICAL/MainProcess] Unrecoverable error: RecursionError('maximum recursion depth exceeded while calling a Python object')
Stack trace (part 1)
--------------------
[2021-04-21 03:39:44,629: CRITICAL/MainProcess] Unrecoverable error: RecursionError('maximum recursion depth exceeded while calling a Python object')
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/celery/worker/worker.py", line 226, in _process_task
req.execute_using_pool(self.pool)
File "/opt/conda/lib/python3.8/site-packages/celery/worker/request.py", line 652, in execute_using_pool
raise TaskRevokedError(task_id)
celery.exceptions.TaskRevokedError: 5ccb493c-7a23-468f-9016-2a6d21d84d0c
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/celery/worker/worker.py", line 226, in _process_task
req.execute_using_pool(self.pool)
File "/opt/conda/lib/python3.8/site-packages/celery/worker/request.py", line 652, in execute_using_pool
raise TaskRevokedError(task_id)
celery.exceptions.TaskRevokedError: 0a0d96d6-11fb-4b35-b8c4-ce9c6c72ecbc
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/celery/worker/worker.py", line 226, in _process_task
req.execute_using_pool(self.pool)
File "/opt/conda/lib/python3.8/site-packages/celery/worker/request.py", line 652, in execute_using_pool
raise TaskRevokedError(task_id)
celery.exceptions.T
... (truncated) ...
Stack trace (part 2)
--------------------
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/opt/conda/lib/python3.8/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/opt/conda/lib/python3.8/site-packages/celery/bootsteps.py", line 365, in start
return self.obj.start()
File "/opt/conda/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 311, in start
blueprint.start(self)
File "/opt/conda/lib/python3.8/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/opt/conda/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 592, in start
c.loop(*c.loop_args())
File "/opt/conda/lib/python3.8/site-packages/celery/worker/loops.py", line 81, in asynloop
next(loop)
File "/opt/conda/lib/python3.8/site-packages/kombu/asynchronous/hub.py", line 361, in create_loop
cb(*cbargs)
File "/opt/conda/lib/python3.8/site-packages/celery/concurrency/asynpool.py", line 325, in on_result_readable
next(it)
File "/opt/conda/lib/python3.8/site-packages/celery/concurrency/asynpool.py", line 308, in _recv_message
callback(message)
File "/opt/conda/lib/python3.8/site-packages/billiard/pool.py", line 853, in on_state_change
...
... (truncated) ...
Stack trace (part 3)
--------------------
File "/opt/conda/lib/python3.8/site-packages/kombu/asynchronous/semaphore.py", line 72, in release
waiter(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/celery/worker/worker.py", line 226, in _process_task
req.execute_using_pool(self.pool)
File "/opt/conda/lib/python3.8/site-packages/celery/worker/request.py", line 651, in execute_using_pool
if (self.expires or task_id in revoked_tasks) and self.revoked():
File "/opt/conda/lib/python3.8/site-packages/celery/worker/request.py", line 424, in revoked
self._announce_revoked(
File "/opt/conda/lib/python3.8/site-packages/celery/worker/request.py", line 406, in _announce_revoked
self.task.backend.mark_as_revoked(
File "/opt/conda/lib/python3.8/site-packages/celery/backends/base.py", line 228, in mark_as_revoked
self.store_result(task_id, exc, state,
File "/opt/conda/lib/python3.8/site-packages/celery/backends/base.py", line 440, in store_result
self._store_result(task_id, result, state, traceback,
File "/opt/conda/lib/python3.8/site-packages/celery/backends/base.py", line 867, in _store_result
self._set_with_state(self.get_key_for_task(task_id), self.encode(meta), state)
File "/opt/conda/lib/python3.8/site-packages/celery/backends/base.py", line 744, in _set_with_state
return self.set(key, value)
File "/opt/conda/lib/python3.8/site-packages/celery/backends/redis.py",
... (truncated) ...
Minimal Reproduction
The reporter's `celery report` output (no standalone reproduction script was provided):
software -> celery:5.1.0b1 (singularity) kombu:5.1.0b1 py:3.8.8
billiard:3.6.4.0 redis:3.5.3
platform -> system:Linux arch:64bit
kernel version:5.11.1-1.el7.elrepo.x86_64 imp:CPython
loader -> celery.loaders.app.AppLoader
settings -> transport:redis results:redis://redis:6379/
broker_url: 'redis://redis:6379//'
result_backend: 'redis://redis:6379/'
include: ['myapp.service.celery.tasks']
deprecated_settings: None
accept_content: ['json', 'pickle']
result_accept_content: ['json', 'pickle']
task_serializer: 'pickle'
result_serializer: 'pickle'
task_track_started: False
task_store_errors_even_if_ignored: False
result_expires: datetime.timedelta(seconds=600)
result_extended: False
result_persistent: False
broker_connection_timeout: 30
beat_max_loop_interval: 60
Environment
- Python: 3.8
What Broke
Workers hit maximum recursion depth while processing revoked tasks, crashing the main worker process with an unrecoverable error.
Why It Broke
Task expiration handling was not timezone aware, causing recursion errors when tasks were revoked
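Celery's actual recursion runs through semaphore release callbacks, as the stack traces above show. As a toy illustration of the failure mode (not Celery's code), a callback chain that synchronously re-enters itself once per pending task exhausts Python's recursion limit:

```python
import sys

def drain(pending, handle):
    """Toy model: recurse once per pending task, like a release
    callback that synchronously dispatches the next revoked task."""
    if not pending:
        return 0
    handle(pending[0])
    return 1 + drain(pending[1:], handle)

sys.setrecursionlimit(200)  # lowered so the failure shows quickly
try:
    drain(list(range(10_000)), lambda task: None)
except RecursionError as exc:
    print(f"RecursionError: {exc}")
```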
Fix Options (Details)
Option A — Upgrade to fixed release (safe default, recommended)
pip install celery==5.2.1
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Option C — Workaround (temporary)
The issue thread mentions a workaround (see the comments by @thedrow and @nicolaerosia in the linked issue); its exact steps are not captured here.
Use only if you cannot change versions today. Treat this as a stopgap and remove it once upgraded.
Option D — Guard side-effects with OnceOnly (guardrail for side-effects)
Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.
- Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
- Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
- Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
Example snippet (illustrative; assumes the OnceOnly client API shown below):

    import os
    from onceonly import OnceOnly

    once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)

    def process_webhook(event_id: str) -> dict:
        # Stable idempotency key per real side-effect.
        # Use a request id / job id / webhook delivery id / Stripe event id, etc.
        key = f"stripe:webhook:{event_id}"
        res = once.check_lock(key=key, ttl=3600)
        if res.duplicate:
            return {"status": "already_processed"}
        # Safe to execute the side-effect exactly once.
        handle_event(event_id)  # your event handler
        return {"status": "processed"}
Fix reference: https://github.com/celery/celery/pull/7065
First fixed release: 5.2.1
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- Do not apply this fix if your application relies on non-timezone aware expiration handling.
- Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
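To confirm which side of the fix you are running, compare the installed version against 5.2.1. A small hedged helper (pre-release tags like "b1" are stripped naively, which is good enough for this check):

```python
def has_fix(version: str) -> bool:
    """True if a Celery version string is >= 5.2.1, the first
    release containing the fix. Pre-release suffixes like 'b1'
    are stripped before comparing."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits or "0"))
    parts += [0] * (3 - len(parts))
    return tuple(parts) >= (5, 2, 1)

print(has_fix("5.1.0b1"))  # False: the version in the report above
print(has_fix("5.2.1"))    # True: first fixed release
```

In a deployed app, pass `celery.__version__`; for anything beyond a quick check, prefer a proper version parser such as `packaging.version.Version`.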
Prevention
- Add a regression test that submits tasks with an expires deadline, revokes them under load, and fails if a worker dies with RecursionError.
- Alert on "CRITICAL/MainProcess] Unrecoverable error" log lines so worker-killing exceptions surface quickly instead of being hidden by restarts.
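A cheap guard is to scan worker logs for the unrecoverable-error signature shown earlier; a minimal sketch (how you feed it log lines is up to your deployment):

```python
import re

# Stable prefix of the worker-killing log line from the failure signature.
UNRECOVERABLE = re.compile(r"CRITICAL/MainProcess\] Unrecoverable error")

def find_unrecoverable(lines):
    """Return the log lines matching the worker-killing signature."""
    return [line for line in lines if UNRECOVERABLE.search(line)]

sample = [
    "[2021-04-21 03:39:44,629: CRITICAL/MainProcess] Unrecoverable error: "
    "RecursionError('maximum recursion depth exceeded while calling a Python object')",
    "[2021-04-21 03:39:40,001: INFO/MainProcess] Task accepted",
]
print(len(find_unrecoverable(sample)))  # 1
```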
Version Compatibility Table
| Version | Status |
|---|---|
| 5.1.0b1 (reported) through 5.2.0 | Affected |
| 5.2.1 and later | Fixed |
Related Issues
No related fixes found.
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.