The Fix
pip install celery==5.5.0
Based on closed celery/celery issue #5663 · Fix PR: https://github.com/celery/celery/pull/9348
Production note: This shows up at warm shutdown of gevent/eventlet workers, where restored tasks run a second time. Treat it as a side-effect risk until you can verify behavior with a canary and real traffic.
@@ -412,6 +412,7 @@ def register_with_event_loop(self, hub):
     def shutdown(self):
+        self.perform_pending_operations()
         self.blueprint.shutdown(self)
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Why This Fix Works in Production
- Trigger: A warm shutdown restores tasks that have already completed, so they run again; with recursive tasks this produces duplicates.
- Mechanism: The consumer cannot send acks for completed tasks after a warm shutdown because the synloop is no longer running.
- Why the fix works: The patch calls perform_pending_operations() in shutdown() before the blueprint shuts down. Judging from the diff, this runs the queued operations (including outstanding acks) first, so completed tasks are no longer restored and duplicated in gevent/eventlet modes (first fixed release: 5.5.0).
- If left unfixed, restored tasks run again and can trigger duplicate external side-effects (double charges, duplicate emails, repeated writes).
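The mechanism above can be illustrated with a toy model. This is a minimal sketch, not Celery's implementation: ToyBroker and ToyConsumer are hypothetical stand-ins, and only the name perform_pending_operations is taken from the diff. It assumes acks are queued as deferred operations and that the broker restores anything still unacked at shutdown.

```python
from collections import deque


class ToyBroker:
    """Hypothetical broker that re-queues unacked messages on shutdown."""

    def __init__(self):
        self.unacked = {}      # delivery tag -> message body
        self.queue = deque()   # messages waiting to be delivered (again)

    def deliver(self, tag, body):
        self.unacked[tag] = body

    def ack(self, tag):
        self.unacked.pop(tag, None)

    def restore_unacked(self):
        # Anything not acked goes back on the queue and will run again.
        self.queue.extend(self.unacked.values())
        self.unacked.clear()


class ToyConsumer:
    """Hypothetical consumer that defers acks instead of sending them inline."""

    def __init__(self, broker, flush_on_shutdown):
        self.broker = broker
        self.flush_on_shutdown = flush_on_shutdown
        self.pending_operations = []

    def on_task_done(self, tag):
        # The ack is only queued here; something must run it later.
        self.pending_operations.append(lambda: self.broker.ack(tag))

    def perform_pending_operations(self):
        while self.pending_operations:
            self.pending_operations.pop()()

    def shutdown(self):
        if self.flush_on_shutdown:  # models the one-line fix
            self.perform_pending_operations()
        self.broker.restore_unacked()


def run(flush_on_shutdown):
    broker = ToyBroker()
    consumer = ToyConsumer(broker, flush_on_shutdown)
    broker.deliver(1, "recurse(3)")
    consumer.on_task_done(1)   # task finished, ack still pending
    consumer.shutdown()
    return list(broker.queue)  # messages that would be executed again
```

In this model, run(False) leaves the completed message on the queue to be executed again, while run(True) leaves the queue empty.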
Why This Breaks in Prod
- The consumer cannot send acks for completed tasks after a warm shutdown because the synloop is no longer running.
- Production symptom (often without a traceback): Completed tasks are restored on shutdown, causing duplicates for recursive tasks.
Proof / Evidence
- GitHub issue: #5663
- Fix PR: https://github.com/celery/celery/pull/9348
- First fixed release: 5.5.0
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.75
- Did this fix it?: Yes (upstream fix exists)
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“I've seen this PR and it was a clue why it's not working on the shutdown process. How do you think, it's possible to add…”
“To add to this, the problem isn't specific to recursive tasks it's just how i discover the behavior and breaks a pattern we use for…”
“Hi @auvipy, it's me again ) It looks like the problem is inside the celery.worker.loops.synloop and relevant only to the gevent/eventlet mode when celery use…”
Failure Signature (Search String)
- Completed tasks restore on shutdown causing duplicates for recursive tasks
Error Message
Signature-only (no traceback captured)
Minimal Reproduction
import time
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def recurse(num_times, sleep_seconds=1):
    print("Recursing... num_times={}".format(num_times))
    # Keep the task running long enough to trigger a warm shutdown meanwhile.
    if sleep_seconds > 0:
        time.sleep(sleep_seconds)
    # Enqueue the next step of the chain before this task completes.
    if num_times > 0:
        recurse.delay(num_times-1)
    print("Recursed... num_times={}".format(num_times))
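One way to spot the duplication in the reproduction's output is to count how often each num_times value is started. The helper below is a hypothetical sketch: it assumes a single chain was started once, so each "Recursing... num_times=N" line should appear exactly once, and it assumes the worker output has been captured as a list of lines.

```python
import re
from collections import Counter

# Matches the start-of-task line printed by the reproduction's recurse() task.
START_LINE = re.compile(r"Recursing\.\.\. num_times=(\d+)")


def duplicated_runs(log_lines):
    """Return {num_times: count} for every value that started more than once."""
    counts = Counter()
    for line in log_lines:
        match = START_LINE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return {n: c for n, c in counts.items() if c > 1}
```

An empty result after a warm shutdown and restart suggests no task in the chain was restored and run twice; a non-empty result names the steps that were duplicated.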
What Broke
Tasks are duplicated due to missing acks after worker shutdown, causing data inconsistency.
Why It Broke
The consumer cannot send acks for completed tasks after a warm shutdown because the synloop is no longer running.
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install celery==5.5.0
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Option C: Workaround (temporary)
The workaround reported in the issue thread is to deactivate restore_at_shutdown.
Use only if you cannot change versions today. Treat this as a stopgap and remove once upgraded.
Option D: Guard side-effects with OnceOnly (guardrail for side-effects)
Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.
- Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
- Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
- Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
- This is most useful when retries/timeouts can re-trigger the same external call.
Example snippet (optional):
import os
from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)

def process_event(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Safe to execute the side-effect exactly once.
    handle_event(event_id)  # handle_event is a placeholder for your own handler
    return {"status": "processed"}
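The fail-open versus fail-closed choice can also be sketched without any vendor SDK. The code below is a hypothetical, single-process illustration: InMemoryGuard and guarded are made-up names, the key format follows the customer_id + action + idempotency_key suggestion, and a real deployment would need shared storage rather than a local dict.

```python
import time


class InMemoryGuard:
    """Hypothetical in-process guard that remembers keys until their TTL expires."""

    def __init__(self):
        self._expires_at = {}

    def first_time(self, key, ttl, now=None):
        now = time.monotonic() if now is None else now
        expires = self._expires_at.get(key)
        if expires is not None and expires > now:
            return False  # key seen recently: this is a duplicate
        self._expires_at[key] = now + ttl
        return True


def guarded(guard, key, side_effect, ttl=3600, fail_open=False):
    """Run side_effect at most once per key while the key's TTL is active."""
    try:
        first = guard.first_time(key, ttl)
    except Exception:
        # The guard itself is unavailable: fail-closed skips the side-effect,
        # fail-open runs it and accepts the risk of a duplicate.
        if not fail_open:
            return "skipped_guard_unavailable"
        first = True
    if not first:
        return "already_processed"
    side_effect()
    return "processed"
```

Fail-closed is the more cautious default when a duplicate is expensive (for example a charge); fail-open suits side-effects where a missed execution is worse than a repeat.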
Fix reference: https://github.com/celery/celery/pull/9348
First fixed release: 5.5.0
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- The upstream fix targets gevent/eventlet workers; it may not apply if you use a different worker pool or task acknowledgment strategy.
- Do not use the OnceOnly guard to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
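Before re-running, it can help to confirm which Celery version the worker environment actually has installed. The check below is a minimal sketch: is_fixed and installed_celery_is_fixed are hypothetical helpers, the threshold comes from the first fixed release (5.5.0), and any suffix on 5.5.0 itself (such as a release candidate) is conservatively treated as not fixed.

```python
import re
from importlib import metadata

FIRST_FIXED = (5, 5, 0)  # first fixed release according to this page


def is_fixed(version):
    """Return True if a version string is at or above the first fixed release."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(.*)", version)
    if match is None:
        return False
    release = (int(match.group(1)), int(match.group(2)), int(match.group(3) or 0))
    has_suffix = bool(match.group(4))
    if release == FIRST_FIXED and has_suffix:
        return False  # e.g. a pre-release of 5.5.0: not confirmed to include the fix
    return release >= FIRST_FIXED


def installed_celery_is_fixed():
    """Check the Celery distribution installed in the current environment."""
    try:
        return is_fixed(metadata.version("celery"))
    except metadata.PackageNotFoundError:
        return False
```

Run installed_celery_is_fixed() inside the same virtualenv or container image the worker uses, since a local shell may resolve a different installation.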
Prevention
- Capture the exact failing error string in logs and tests so you can reproduce via a minimal script.
- Pin production dependencies and upgrade only with a reproducible test that hits the failing path.
Version Compatibility Table
| Version | Status |
|---|---|
| 5.5.0 | Fixed |
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.