The Fix

pip install celery==4.4.0rc5

Based on closed celery/celery issue #5138 and the linked fix, PR #5222.

@@ -1203,11 +1203,14 @@ def freeze(self, _id=None, group_id=None, chord=None,
         header_result = self.tasks.freeze(
             parent_id=parent_id, root_id=root_id, chord=self.body)
-        bodyres = self.body.freeze(_id, root_id=root_id)
+
+        body_result = self.body.freeze(

Why This Fix Works in Production

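  • Trigger: a chord placed in a chain together with groups (see the minimal reproduction). The only captured log line, .> celery exchange=celery(direct) key=celery, is the queue entry from the worker's startup banner, not an error message.
  • Mechanism: Chord information was handled incorrectly when a chord sat in a chain that also contained groups, so these workflows did not run correctly.
  • Why the fix works: PR #5222 corrects how chords in chains with groups are handled. It resolves issue #5138 only partially (first fixed release: 4.4.0rc5).
Production impact:
  • If left unfixed, the tasks after the chord callback do not run and the workflow ends incomplete without raising an error. This can cause silent data inconsistencies that propagate (bad cache entries, incorrect downstream decisions).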

Why This Breaks in Prod

  • Chord information was handled incorrectly for chords in chains with groups, so these workflows did not run correctly.
  • Production symptom (often without a traceback): tasks after the chord callback do not run. The captured signature is .> celery exchange=celery(direct) key=celery.

Proof / Evidence

  • GitHub issue: #5138
  • Fix PR: https://github.com/celery/celery/pull/5222
  • First fixed release: 4.4.0rc5
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-09
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“See #5222 for a partial improvement of your problem, however it will works with my MR if you explicitly set the chord in the second…”
@the-glu · 2018-12-05 · source
“It seems that, this bug has been fix in 4.2. Would you please try in the master branch?”
@tothegump · 2018-10-24 · source
“We use version 4.2.1 now, let me try the latest”
@vincenthcui · 2018-10-24 · source
“@tothegump however, tasks don't run in master branch too. and it looks like master is now in version 4.2.0, which is lower than version 4.2.1…”
@vincenthcui · 2018-10-24 · source

Failure Signature (Search String)

  • .> celery exchange=celery(direct) key=celery

Error Message

Signature-only (no traceback captured)
error.txt
Error Message
-------------
.> celery exchange=celery(direct) key=celery

Minimal Reproduction

repro.py
from celery import Celery, group, chain, chord

app = Celery(__name__,
             broker='redis://127.0.0.1:6379/0',
             backend='redis://127.0.0.1:6379/1')


@app.task(name='t')
def t(x):
    # print() call form works on both Python 2 and Python 3
    print(x)
    return sum(x) + 1 if isinstance(x, list) else x + 1


if __name__ == '__main__':
    tasks = group(
        chain(t.s(), t.s(), t.s()),
        chain(t.s(), t.s(), t.s()),
    )
    tasks = group(
        chain(t.s(), tasks, t.s()),
    )
    # tasks DONT run after chord callbacks
    tasks = chord(tasks, t.s())
    tasks = chain(t.s(1), tasks, t.s(), t.s(), t.s())
    tasks.apply_async()

What Broke

Tasks that follow the chord callback do not run, so the task graph executes only partially.

Why It Broke

Chord information was handled incorrectly for chords in chains with groups, so these workflows did not function correctly.

Fix Options (Details)

Option A: Upgrade to fixed release (safe default, recommended)

pip install celery==4.4.0rc5

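When NOT to use: Do not upgrade if the new release changes behavior your code depends on, or if you cannot reproduce the failure on your current version.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds. Note that 4.4.0rc5 is a release candidate, so test it in staging before deploying it to production.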
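Before and after upgrading, it can help to check programmatically whether an installed version is at or past the first fixed release. The following is a minimal sketch, assuming Celery version strings of the form X.Y.Z with an optional a/b/rc suffix; the names FIRST_FIXED, version_key, and has_chord_fix are illustrative and not part of Celery.

```python
import re

FIRST_FIXED = "4.4.0rc5"  # first fixed release named on this page

# X.Y.Z with an optional pre-release suffix such as rc5
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?")
# Pre-releases sort before the final release of the same X.Y.Z
_STAGE_ORDER = {"a": 0, "b": 1, "rc": 2, None: 3}


def version_key(version):
    """Turn a version string into a tuple that sorts in release order."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch, stage, stage_num = match.groups()
    return (int(major), int(minor), int(patch),
            _STAGE_ORDER[stage], int(stage_num or 0))


def has_chord_fix(installed):
    """True if `installed` is 4.4.0rc5 or any later release."""
    return version_key(installed) >= version_key(FIRST_FIXED)
```

In a deployment check you could pass celery.__version__ to has_chord_fix and fail fast when it returns False.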

Option D: Guard side-effects with OnceOnly (guardrail for side-effects)

Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.

  • Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
  • Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
  • Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
  • This does NOT fix data corruption; it only prevents duplicate side-effects.
onceonly.py
import os

from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)


def process_event(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Not a duplicate: run the side-effect.
    # handle_event is your own function; it is not defined here.
    handle_event(event_id)


event_id = "evt_..."  # replace
process_event(event_id)

When NOT to use: Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Fix reference: https://github.com/celery/celery/pull/5222

First fixed release: 4.4.0rc5

Last verified: 2026-02-09. Validate in your environment.

When NOT to Use This Fix

  • Do not upgrade if the new release changes behavior your code depends on, or if you cannot reproduce the failure on your current version.
  • Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Verify Fix

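Re-run the minimal reproduction on your broken version and confirm that the tasks after the chord callback do not run. Then apply the fix, re-run, and check that the three trailing t tasks now execute.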
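To judge whether a run is complete, it helps to know what a full run of repro.py should produce. The sketch below is a plain-Python model of the workflow, not Celery itself. It assumes that a chain feeds each result to the next step, that a group passes the same argument to every member and returns a list, and that a chord passes the header results to its body. The helpers chain_, group_, and chord_ are hypothetical stand-ins for the Celery primitives.

```python
calls = 0


def t(x):
    """Same arithmetic as the task in repro.py, plus a call counter."""
    global calls
    calls += 1
    return sum(x) + 1 if isinstance(x, list) else x + 1


def chain_(*steps):
    """Model of a chain: each step receives the previous step's result."""
    def run(arg):
        for step in steps:
            arg = step(arg)
        return arg
    return run


def group_(*members):
    """Model of a group: every member gets the same argument."""
    return lambda arg: [member(arg) for member in members]


def chord_(header, body):
    """Model of a chord: the body receives the list of header results."""
    return lambda arg: body(header(arg))


# Same shape as the workflow built in repro.py
inner = group_(chain_(t, t, t), chain_(t, t, t))
header = group_(chain_(t, inner, t))
workflow = chain_(t, chord_(header, t), t, t, t)

final = workflow(1)
print(final, calls)  # → 17 13
```

Under these assumptions a complete run executes t 13 times and ends with the value 17, so a worker log with fewer t executions suggests the tail of the chain was dropped. Treat the numbers as a guide and confirm them against a fixed Celery version in your environment.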

Prevention

  • Capture the exact failing error string in logs and tests so you can reproduce via a minimal script.
  • Pin production dependencies and upgrade only with a reproducible test that hits the failing path.
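The pinning advice above can be checked mechanically. This is a minimal sketch, assuming a simple requirements.txt with one requirement per line; the function name unpinned_requirements is illustrative, and the pattern deliberately ignores less common forms such as URL requirements.

```python
import re

# Package name, optional extras, then an exact '==' pin
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            loose.append(line)
    return loose


sample = """\
celery==4.4.0rc5
redis>=3.2  # broker client
kombu
"""
print(unpinned_requirements(sample))  # → ['redis>=3.2', 'kombu']
```

Running a check like this in CI makes an unpinned dependency visible before it reaches production.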

Version Compatibility Table

Version     Status
4.4.0rc5    Fixed

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.