The Fix
pip install celery==4.4.0rc5
Based on closed celery/celery issue #5483 · PR/commit linked
@@ -183,7 +183,7 @@ def close_database(self, **kwargs):
         for conn in self._db.connections.all():
             try:
-                conn.close()
+                conn.close_if_unusable_or_obsolete()
             except self.interface_errors:
import os

import celery.signals
import django.db


def fix_django_db(**kwargs):
    # Calling db.close() on some DB connections will cause the inherited DB
    # conn to also get broken in the parent process so we need to remove it
    # without triggering any network IO that close() might cause.
    for c in django.db.connections.all():
        if c and c.connection:
            try:
                os.close(c.connection.fileno())
            except (AttributeError, OSError, TypeError,
                    django.db.InterfaceError):
                pass
        try:
            c.close()
        except django.db.InterfaceError:
            pass
        except django.db.DatabaseError as exc:
            str_exc = str(exc)
            if 'closed' not in str_exc and 'not connected' not in str_exc:
                raise


# Run the handler in every freshly forked worker process.
celery.signals.worker_process_init.connect(fix_django_db)
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Option A: Upgrade to fixed release
pip install celery==4.4.0rc5
When NOT to use: This fix is not suitable if the application relies on persistent DB connections across forks.
Why This Fix Works in Production
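- Trigger: A worker process is forked while the parent holds open Django DB connections. The child then cleans up in two steps: it first closes the inherited connection's file descriptor without any network IO, then closes the Django connection object. The first step is necessary so the second step doesn't actually close the DB connections of the parent process that they were inherited from.
- Mechanism: The second step was changed by #4292 to call `close_if_unusable_or_obsolete()` instead of `close()`. That method only closes a connection that Django considers unusable or older than `CONN_MAX_AGE`, so it does not always close Django DB connections after a fork.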
- Why the fix works: Fixes the issue where Celery doesn't re-use DB connections with Django when `CONN_MAX_AGE` is set, leading to high database load. (first fixed release: 4.4.0rc5).
- If left unfixed, forked workers can keep Django DB connections in an invalid state, causing high database load and potential timeouts. The issue thread also mentions 'mysql server has gone away' errors.
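The difference between the two close calls can be sketched with a toy model. `FakeConnection` below is a hypothetical stand-in, not Django's real wrapper class, and its conditional close is a simplification (the real method also checks other conditions, such as whether the connection is still usable after an error). It only illustrates why a healthy, recently opened connection survives the conditional call.

```python
import time


class FakeConnection:
    """Toy stand-in for a Django DB wrapper; NOT Django's real class."""

    def __init__(self, max_age_seconds):
        self.connection = object()  # pretend socket inherited from the parent
        self.errors_occurred = False
        # Mirrors the idea of CONN_MAX_AGE: None means "keep forever".
        self.close_at = (
            None if max_age_seconds is None
            else time.monotonic() + max_age_seconds
        )

    def close(self):
        # Unconditional close: the wrapper forgets its connection.
        self.connection = None

    def close_if_unusable_or_obsolete(self):
        # Simplified: close only on recorded errors or an expired max age.
        if self.connection is None:
            return
        expired = self.close_at is not None and time.monotonic() >= self.close_at
        if self.errors_occurred or expired:
            self.close()


inherited = FakeConnection(max_age_seconds=600)
inherited.close_if_unusable_or_obsolete()
still_open_after_conditional = inherited.connection is not None

inherited.close()
open_after_forced = inherited.connection is not None

print(still_open_after_conditional, open_after_forced)  # True False
```

In this model the inherited connection is only dropped by the unconditional `close()`, which is the behaviour the post-fork cleanup relies on.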
Why This Breaks in Prod
- The method `close_if_unusable_or_obsolete` does not always close Django DB connections after a fork, so a worker can keep a connection object whose underlying socket was already closed.
- Production symptom (often without a traceback): high database load, potential timeouts, and, per the issue thread, 'mysql server has gone away' errors.
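Why the post-fork cleanup drops the file descriptor before closing the connection can be illustrated with a plain socket pair. This is a Unix-only sketch under stated assumptions: the socket pair stands in for a database connection, and `b"QUIT;"` is a made-up protocol message, not any real driver's wire format.

```python
import os
import socket


def run_child(action):
    """Fork a child that applies `action` to the inherited socket, then
    return what the server end receives afterwards."""
    client, server = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child process: it holds copies of the parent's descriptors.
        try:
            action(client)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    # Parent: send a query on its own, still valid, descriptor.
    client.sendall(b"SELECT 1;")
    data = server.recv(64)
    client.close()
    server.close()
    return data


def drop_fd_only(sock):
    # Release the child's copy of the descriptor, with no network IO.
    os.close(sock.fileno())


def graceful_close(sock):
    # What a protocol-level close might do: tell the server to end the session.
    sock.sendall(b"QUIT;")


quiet = run_child(drop_fd_only)
noisy = run_child(graceful_close)
print(quiet, noisy)
```

When the child only drops its descriptor, the server end sees nothing but the parent's query. When the child writes a goodbye message first, that message reaches the server on the session the parent is still using.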
Proof / Evidence
- GitHub issue: #5483
- Fix PR: https://github.com/celery/celery/pull/4292
- First fixed release: 4.4.0rc5
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.85
- Did this fix it?: Yes (upstream fix exists)
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“@Jokairui you need to undo the changes of #4116. Until this is fixed in celery, you can either set CONN_MAX_AGE to 0 to avoid triggering…”
“so, what exactly should we deal with this 'mysql server has gone away' bug? @Chronial @auvipy”
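The first excerpt names a stopgap: set `CONN_MAX_AGE` to 0. A minimal sketch of that Django settings fragment follows; the `ENGINE` and `NAME` values are placeholders, not taken from the issue.

```python
# settings.py (fragment); ENGINE and NAME are placeholder values.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "example_db",
        # 0 disables persistent connections (this is also Django's default).
        "CONN_MAX_AGE": 0,
    }
}
```

This trades connection reuse for safety, so expect more connection setup work on the database side while it is in place.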
Failure Signature (Search String)
- The first step is necessary so the second step doesn't actually close the db connections of the parent process that they where inherited from.
- Unfortunately, the second step was broken by #4292, with this change: https://github.com/celery/celery/pull/4292/files#diff-7da7bceb78d87096e818d34c80115d18L186.
Error Message
Signature-only (no traceback captured)
Minimal Reproduction
import os

import celery.signals
import django.db


def fix_django_db(**kwargs):
    # Calling db.close() on some DB connections will cause the inherited DB
    # conn to also get broken in the parent process so we need to remove it
    # without triggering any network IO that close() might cause.
    for c in django.db.connections.all():
        if c and c.connection:
            try:
                os.close(c.connection.fileno())
            except (AttributeError, OSError, TypeError,
                    django.db.InterfaceError):
                pass
        try:
            c.close()
        except django.db.InterfaceError:
            pass
        except django.db.DatabaseError as exc:
            str_exc = str(exc)
            if 'closed' not in str_exc and 'not connected' not in str_exc:
                raise


celery.signals.worker_process_init.connect(fix_django_db)
What Broke
Django DB connections remain in an invalid state, causing high database load and potential timeouts.
Why It Broke
The method `close_if_unusable_or_obsolete` does not always close Django DB connections after a fork: it only closes connections that Django considers unusable or older than `CONN_MAX_AGE`.
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install celery==4.4.0rc5
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
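Before relying on the upgrade, it can help to confirm which Celery version a deployment actually runs. The helper below is a hypothetical sketch, not part of Celery: it handles only plain `X.Y.Z` strings with an optional `a`/`b`/`rc` suffix and raises on anything else.

```python
import re

FIXED = "4.4.0rc5"  # first fixed release named on this page

# Pre-release stages sort before the final release of the same version.
_STAGE_ORDER = {"a": 0, "b": 1, "rc": 2, "": 3}
_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")


def parse(version):
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a sortable tuple."""
    match = _PATTERN.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, stage, number = match.groups()
    return (int(major), int(minor), int(patch),
            _STAGE_ORDER[stage or ""], int(number or 0))


def has_fix(installed):
    """Return True if `installed` is at or after the first fixed release."""
    return parse(installed) >= parse(FIXED)
```

In a real environment the installed version string can be read with `importlib.metadata.version("celery")`. For version strings beyond this simple pattern, `packaging.version.Version` is the more robust comparison tool.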
Option D: Guard side-effects with OnceOnly (guardrail for side-effects)
Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.
- Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
- Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
- Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
- This does NOT fix data corruption; it only prevents duplicate side-effects.
Example snippet (optional):
import os

from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)


def process_event(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Safe to execute the side-effect exactly once.
    # handle_event is your own handler; it is not defined in this snippet.
    handle_event(event_id)


process_event("evt_...")  # replace with a real event id
Fix reference: https://github.com/celery/celery/pull/4292
First fixed release: 4.4.0rc5
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- This fix is not suitable if the application relies on persistent DB connections across forks.
- Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
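One possible check for the re-run is to list which connection wrappers still hold a low-level connection once the worker's post-fork cleanup has finished. The sketch below uses stand-in objects; `open_aliases` is a hypothetical helper, and it assumes only that each wrapper exposes `alias` and `connection` attributes, as Django's connection wrappers do.

```python
from types import SimpleNamespace


def open_aliases(connections):
    """Return aliases of wrappers that still hold a low-level connection."""
    return [c.alias for c in connections if c.connection is not None]


# Stand-ins for Django connection wrappers (alias + connection attributes).
fake_connections = [
    SimpleNamespace(alias="default", connection=object()),
    SimpleNamespace(alias="replica", connection=None),
]
print(open_aliases(fake_connections))  # ['default']
```

In a real worker the same function could be passed `django.db.connections.all()` from a handler that runs after the cleanup. An empty list there would suggest the inherited connections were dropped, though this is a sanity check rather than proof.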
Prevention
- Capture the exact failing error string in logs and tests so you can reproduce via a minimal script.
- Pin production dependencies and upgrade only with a reproducible test that hits the failing path.
Version Compatibility Table
| Version | Status |
|---|---|
| 4.4.0rc5 | Fixed |
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.