The Fix

pip install celery==5.1.0b1

Based on closed celery/celery issue #6577; fix in PR https://github.com/celery/celery/pull/6578

Production note: This usually shows up under retries/timeouts. Treat it as a side-effect risk until you can verify behavior with a canary + real traffic.

@@ -928,7 +928,9 @@ def on_chord_part_return(self, request, state, result, **kwargs):
         try:
             with allow_join_result():
-                ret = j(timeout=3.0, propagate=True)
+                ret = j(
+                    timeout=app.conf.result_chord_join_timeout,
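
The change in the diff can be modeled outside Celery. The sketch below is a simplified illustration, not Celery's code: `join_before_fix`, `join_after_fix`, and `fake_join` are hypothetical names, and only the `timeout` argument is modeled. It shows why a literal `3.0` ignores whatever value the configuration holds.

```python
from types import SimpleNamespace


def join_before_fix(join, conf):
    # Pre-fix shape: the timeout is a literal, so `conf` is never consulted.
    return join(timeout=3.0)


def join_after_fix(join, conf):
    # Post-fix shape: the timeout is read from configuration.
    return join(timeout=conf.result_chord_join_timeout)


def fake_join(timeout):
    # Hypothetical stand-in for the real join call; it only echoes the timeout.
    return timeout


# 10.0 is an illustrative value, not a recommendation.
conf = SimpleNamespace(result_chord_join_timeout=10.0)
before = join_before_fix(fake_join, conf)
after = join_after_fix(fake_join, conf)
print(before, after)
```

With the configured value set to 10.0, the pre-fix shape still joins with 3.0, while the post-fix shape picks up the setting.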

Why This Fix Works in Production

  • Trigger: The chord join timeout was still hardcoded to `3.0` seconds in the `on_chord_part_return` methods of the base and redis result backends, so the `result_chord_join_timeout` setting was ignored there.
  • Mechanism: The fix replaces the hardcoded `3.0` with `app.conf.result_chord_join_timeout` in those methods.
  • Why the fix works: The join now waits for the configured timeout instead of a constant, so raising the setting gives slow chord joins more time (first fixed release: 5.1.0b1).
Production impact:
  • If left unfixed, changing `result_chord_join_timeout` has no effect in these code paths, so chord joins that take longer than 3 seconds can time out in production regardless of configuration.
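
On a release that contains the fix, the setting can be supplied through a configuration module. A minimal sketch, assuming a module named `celeryconfig` (hypothetical name) that the app loads with `app.config_from_object("celeryconfig")`; the value 10.0 is illustrative only:

```python
# celeryconfig.py (hypothetical module name)

# Seconds to wait when joining chord results.
# 10.0 is an illustrative value; choose one that matches your slowest chord.
result_chord_join_timeout = 10.0
```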

Why This Breaks in Prod

  • Production symptom (often without a traceback): Chord join timeout is still hardcoded somewhere

Proof / Evidence

  • GitHub issue: #6577
  • Fix PR: https://github.com/celery/celery/pull/6578
  • First fixed release: 5.1.0b1
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-09
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)

Failure Signature (Search String)

  • Chord join timeout is still hardcoded somewhere

Error Message

Signature-only (no traceback captured)

Minimal Reproduction

  1. this bug.

What Broke

Users experienced unexpected timeouts when joining chords in Celery tasks.

Fix Options (Details)

Option A: Upgrade to fixed release (safe default, recommended)

pip install celery==5.1.0b1

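Use when you can deploy the upstream fix; it is usually lower-risk than a long-lived workaround. Note that 5.1.0b1 is a beta pre-release, so test it before rolling it out widely.

When NOT to use: Do not apply this fix if the application relies on the chord join timeout staying fixed at 3.0 seconds regardless of the `result_chord_join_timeout` setting.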

Option D: Guard side-effects with OnceOnly (guardrail for side-effects)

Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.

  • Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
  • Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
  • Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
Example snippet (optional):
onceonly.py
import os

from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)


def process_event(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Not a duplicate: run the side-effect.
    handle_event(event_id)  # handle_event is your own handler, defined elsewhere
    return {"status": "processed"}


process_event("evt_...")  # replace with a real event id

When NOT to use: Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Verify Fix

Follow the reproduction steps, confirm the failure, apply the fix, and repeat the same steps to verify the behavior changes.
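
Before repeating the steps, it can help to confirm that the environment runs a release at or after 5.1.0b1. The helper below is a sketch, not part of Celery: `has_chord_timeout_fix` and `installed_celery_has_fix` are hypothetical names, and the parser only understands versions of the form `X.Y.Z` with an optional `aN`, `bN`, or `rcN` suffix (no `.post` or `.dev` segments).

```python
import re
from importlib import metadata

# Ordering of pre-release tags; None means a final release.
_PRE_RANK = {"a": 0, "b": 1, "rc": 2, None: 3}
_FIRST_FIXED = (5, 1, 0, _PRE_RANK["b"], 1)  # 5.1.0b1


def _version_key(version):
    """Turn 'X.Y.Z' or 'X.Y.Z{a,b,rc}N' into a comparable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    major, minor, patch, pre, pre_num = match.groups()
    return (int(major), int(minor), int(patch), _PRE_RANK[pre], int(pre_num or 0))


def has_chord_timeout_fix(version):
    """Return True if the given version string is 5.1.0b1 or later."""
    return _version_key(version) >= _FIRST_FIXED


def installed_celery_has_fix():
    """Check the installed celery package; None if celery is not installed."""
    try:
        return has_chord_timeout_fix(metadata.version("celery"))
    except metadata.PackageNotFoundError:
        return None
```

A version check only confirms that the fixed code is installed; the reproduction steps still need to be repeated to confirm the behavior change.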

Prevention

  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
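
The second point can be sketched in plain Python. This is a generic illustration, not Celery's retry API: `call_with_retries` and `flaky` are hypothetical names, and the logger name is arbitrary. Each failed attempt records its attempt number and reason, which is the data an alert on retry spikes would consume.

```python
import logging
import time

logger = logging.getLogger("retry_metrics")


def call_with_retries(func, max_attempts=3, delay_seconds=0.0, retry_on=(TimeoutError,)):
    """Call func, retrying on the given exceptions and recording each failure."""
    attempts = []  # (attempt_number, reason) for every failed attempt
    for attempt in range(1, max_attempts + 1):
        try:
            result = func()
        except retry_on as exc:
            reason = type(exc).__name__
            attempts.append((attempt, reason))
            logger.warning("attempt=%d failed reason=%s", attempt, reason)
            if attempt == max_attempts:
                raise  # retries exhausted: surface the last error
            time.sleep(delay_seconds)
        else:
            return result, attempts


# Simulated dependency that times out twice before succeeding.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated slow dependency")
    return "ok"


result, attempts = call_with_retries(flaky)
print(result, attempts)
```

Returning the attempt records alongside the result keeps the instrumentation testable without inspecting log output.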

Version Compatibility Table

Version    Status
5.1.0b1    Fixed

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.