
The Fix

pip install celery==5.5.0

Based on closed celery/celery issue #5663; the fix PR/commit is linked below.

Production note: This usually shows up under retries/timeouts. Treat it as a side-effect risk until you can verify behavior with a canary + real traffic.

Fix commit (excerpt from PR #9348):
@@ -412,6 +412,7 @@ def register_with_event_loop(self, hub):

     def shutdown(self):
+        self.perform_pending_operations()
         self.blueprint.shutdown(self)

Why This Fix Works in Production

  • Trigger: completed tasks are restored on shutdown, causing duplicates for recursive tasks.
  • Mechanism: the consumer cannot send acks for completed tasks after a warm shutdown because the synloop is no longer running.
  • Why the fix works: the PR calls perform_pending_operations() during shutdown, so pending acks are flushed before the worker stops and completed tasks are no longer restored and re-executed in gevent/eventlet mode (first fixed release: 5.5.0).
Production impact:
  • If left unfixed, retries/timeouts can trigger duplicate external side-effects (double charges, duplicate emails, repeated writes).
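The usual mitigation for this class of risk is to key each external operation with a stable idempotency key, so a retry repeats the same key rather than minting a new one. A minimal pure-stdlib sketch (function and field names are illustrative, not from Celery or this fix):

```python
import hashlib

def idempotency_key(customer_id: str, action: str, request_id: str) -> str:
    """Derive a stable key: the same logical operation always hashes the same."""
    raw = f"{customer_id}:{action}:{request_id}"
    return hashlib.sha256(raw.encode()).hexdigest()

# A retry of the same logical operation reproduces the same key,
# so a downstream dedup layer can drop the duplicate charge/email/write.
k1 = idempotency_key("cust_42", "charge", "req_001")
k2 = idempotency_key("cust_42", "charge", "req_001")
assert k1 == k2
```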

Why This Breaks in Prod

  • The consumer cannot send acks for completed tasks after a warm shutdown because the synloop is no longer running.
  • Production symptom (often without a traceback): completed tasks are restored on shutdown, causing duplicates for recursive tasks.

Proof / Evidence

  • GitHub issue: #5663
  • Fix PR: https://github.com/celery/celery/pull/9348
  • First fixed release: 5.5.0
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-09
  • Confidence: 0.75
  • Did this fix it?: Yes (upstream fix exists)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“I've seen this PR and it was a clue why it's not working on the shutdown process. How do you think, it's possible to add…”
@smdnv · 2021-02-19 · confirmation · source
“> I've seen this PR and it was a clue why it's not working on the shutdown process”
@auvipy · 2021-02-24 · confirmation · source
“To add to this, the problem isn't specific to recursive tasks it's just how i discover the behavior and breaks a pattern we use for…”
@gmjosack · 2019-08-01 · source
“Hi @auvipy, it's me again ) It looks like the problem is inside the celery.worker.loops.synloop and relevant only to the gevent/eventlet mode when celery use…”
@moaddib666 · 2024-10-10 · source

Failure Signature (Search String)

  • Completed tasks restore on shutdown causing duplicates for recursive tasks
Copy-friendly signature
signature.txt
Failure Signature
-----------------
Completed tasks restore on shutdown causing duplicates for recursive tasks

Error Message

Signature-only (no traceback captured)
error.txt
Error Message
-------------
Completed tasks restore on shutdown causing duplicates for recursive tasks

Minimal Reproduction

repro.py
import time

from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def recurse(num_times, sleep_seconds=1):
    print("Recursing... num_times={}".format(num_times))
    if sleep_seconds > 0:
        time.sleep(sleep_seconds)
    if num_times > 0:
        recurse.delay(num_times - 1)
    print("Recursed... num_times={}".format(num_times))

What Broke

Tasks are duplicated due to missing acks after worker shutdown, causing data inconsistency.

Why It Broke

The consumer cannot send acks for completed tasks after a warm shutdown because the synloop is no longer running.
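To make the mechanism concrete, here is a toy model (names are illustrative, not Celery or kombu internals) of how a broker that restores unacknowledged messages at shutdown turns missing acks into duplicate runs:

```python
from collections import deque

class ToyBroker:
    """Toy stand-in for a broker that re-queues unacked deliveries."""

    def __init__(self, messages):
        self.queue = deque(messages)
        self.unacked = []

    def deliver(self):
        msg = self.queue.popleft()
        self.unacked.append(msg)  # held until acked
        return msg

    def ack(self, msg):
        self.unacked.remove(msg)

    def restore_at_shutdown(self):
        # Everything delivered but never acked goes back on the queue.
        self.queue.extend(self.unacked)
        self.unacked.clear()

broker = ToyBroker(["task-1", "task-2"])

completed = []
while broker.queue:
    msg = broker.deliver()
    completed.append(msg)  # the task finishes...
    # ...but the ack is never flushed: the loop that would send it
    # is not running during warm shutdown.

broker.restore_at_shutdown()
print(list(broker.queue))  # -> ['task-1', 'task-2']
```

Both tasks completed, yet both are redelivered to the next worker: exactly the duplicate-execution symptom described above.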

Fix Options (Details)

Option A — Upgrade to fixed release (safe default, recommended)

pip install celery==5.5.0

When NOT to use: This fix is not applicable if using a different worker pool or task acknowledgment strategy.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
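If you pin dependencies, the upgrade can be expressed as a version floor in your requirements file (the constraint reflects the first fixed release named above):

```
celery>=5.5.0  # first release containing the fix for celery/celery#5663 (PR #9348)
```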

Option C — Workaround (temporary stopgap)

One workaround reported in the issue thread is to deactivate restore_at_shutdown, so unacked messages are not re-queued when the worker stops.

When NOT to use: This fix is not applicable if using a different worker pool or task acknowledgment strategy.

Use only if you cannot change versions today. Treat this as a stopgap and remove once upgraded.
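A hedged sketch of that workaround, assuming your broker uses kombu's virtual transport layer (e.g. Redis), whose QoS class exposes a restore_at_shutdown flag. Trade-off: with this off, tasks that were delivered but never acked are dropped at shutdown instead of duplicated, so only use it where a drop is safer than a duplicate.

```python
# ASSUMPTION: broker is a kombu virtual transport (Redis, SQS, etc.).
# Apply early at worker startup, before the consumer connects.
import kombu.transport.virtual.base as virtual_base

# Do not re-queue unacked messages when the worker shuts down.
# WARNING: in-flight tasks are LOST rather than re-delivered.
virtual_base.QoS.restore_at_shutdown = False
```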

Option D — Guard side-effects with OnceOnly (guardrail for external side-effects)

Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.

  • Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
  • Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
  • Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
  • This is most useful when retries/timeouts can re-trigger the same external call.
onceonly.py
import os

from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)

def process_event(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"

    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}

    # Safe to execute the side-effect exactly once.
    # handle_event is your application's handler for the event.
    handle_event(event_id)
    return {"status": "processed"}

See OnceOnly SDK

When NOT to use: Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Fix reference: https://github.com/celery/celery/pull/9348

First fixed release: 5.5.0

Last verified: 2026-02-09. Validate in your environment.


When NOT to Use This Fix

  • This fix is not applicable if using a different worker pool or task acknowledgment strategy.
  • Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Verify Fix

verify
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.


Prevention

  • Capture the exact failing error string in logs and tests so you can reproduce via a minimal script.
  • Pin production dependencies and upgrade only with a reproducible test that hits the failing path.
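A cheap guard for the second bullet is a unit test that fails if the dependency floor ever regresses below the first fixed release. A pure-stdlib sketch (version parsing is deliberately simplified and ignores pre-release suffixes):

```python
def parse_version(v: str) -> tuple:
    """Simplified parser: '5.5.0' -> (5, 5, 0). Ignores pre-release tags."""
    return tuple(int(part) for part in v.split(".")[:3])

def has_shutdown_ack_fix(installed: str) -> bool:
    """True if this Celery version includes the #5663 fix (first fixed: 5.5.0)."""
    return parse_version(installed) >= parse_version("5.5.0")

# Fail CI if the deployed version predates the fix.
assert has_shutdown_ack_fix("5.5.0")
assert not has_shutdown_ack_fix("5.4.0")
```

Pair this with a test that imports the installed celery, reads `celery.__version__`, and feeds it through the same check.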

Version Compatibility Table

Version   Status
5.5.0     Fixed

Related Issues

No related fixes found.

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.