The Fix
pip install celery==5.3.0b2
Based on the closed celery/celery issue #6819; the fixing PR/commit is linked below.
Production note: This usually shows up under retries/timeouts. Treat it as a side-effect risk until you can verify behavior with a canary + real traffic.
```diff
@@ -229,6 +229,7 @@ def __init__(self, main=None, loader=None, backend=None,
         self._local = threading.local()
+        self._backend_cache = None
         self.clock = LamportClock()
```
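If your backend is thread-safe, the fixed release lets you opt in to sharing one backend instance (and one connection pool) across threads. A minimal sketch, assuming the `result_backend_thread_safe` setting introduced by the fix (verify the name against your Celery version's docs):

```python
# Minimal sketch: opt in to cross-thread backend sharing.
# Assumes celery>=5.3.0b2 and a thread-safe result backend.
from celery import Celery

app = Celery(
    "proj",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1",
)

# Per the fix, this lets Celery reuse a single backend instance across
# threads instead of building one (with its own pool) per thread.
app.conf.result_backend_thread_safe = True
```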
Why This Fix Works in Production
- Trigger: Redis result backend connections leak
- Mechanism: the backend object is kept in thread-local storage, so each new thread in a threaded server builds its own backend instance and Redis connection pool, and those connections are never closed
- Why the fix works: it lets users opt in to sharing the backend object across threads when the backend is thread-safe, so one connection pool is reused instead of one leaking per thread (first fixed release: 5.3.0b2).
- If left unfixed, leaked connections accumulate until Redis hits its client limit or the host runs out of file descriptors, at which point task submission and result fetching start failing.
Why This Breaks in Prod
- The backend is created per thread via thread-local storage; short-lived request threads orphan their Redis connections when they exit (see the sketch below)
- Production symptom (often without a traceback): Redis result backend connections leak, visible as a steadily climbing connected-client count
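A minimal sketch of the failure pattern (illustrative stand-in names, not Celery's actual internals):

```python
import threading

class LeakyApp:
    """Illustrates why per-thread backend storage leaks connections."""

    def __init__(self):
        self._local = threading.local()

    @property
    def backend(self):
        # Every thread sees an empty thread-local slot, so every thread
        # builds its own backend (a stand-in object here; a RedisBackend
        # with its own connection pool in Celery). When a short-lived
        # request thread exits, its pool is orphaned with sockets open.
        if not hasattr(self._local, "backend"):
            self._local.backend = object()
        return self._local.backend

app = LeakyApp()
threads = [threading.Thread(target=lambda: app.backend) for _ in range(8)]
for t in threads:
    t.start()
    t.join()
# Eight threads -> eight backend instances; none is ever closed.
```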
Proof / Evidence
- GitHub issue: #6819
- Fix PR: https://github.com/celery/celery/pull/8058
- First fixed release: 5.3.0b2
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.75
- Did this fix it?: Yes (upstream fix exists)
- Own content ratio: 0.61
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“@matusvalo there is a repro of the leak available here: https://github.com/LivePreso/redis-leak”
“I've confirmed the repro at https://github.com/LivePreso/redis-leak still holds with the new 5.2.0 release.”
“An update: I tried making oid and backend properties of celery shared between all threads by using if-lock-if but got into trouble as I think…”
“G'day folks, we've been seeing what I think is the same issue”
Failure Signature (Search String)
- Redis result backend connections leak
Copy-friendly signature
Failure Signature
-----------------
Redis result backend connections leak
Error Message
Signature-only (no traceback captured)
Error Message
-------------
Redis result backend connections leak
Minimal Reproduction
```python
from django.http import HttpResponse
from redis_leak.celery import debug_task
import threading

_local = threading.local()

def run_task(request):
    # Dispatching the task and fetching its result is what touches the
    # backend; uncomment these two lines to trigger the leak:
    # res = debug_task.apply_async()
    # result = res.get()
    print("\n start request")
    print("get_ident", threading.get_ident())
    print("current_thread", threading.current_thread())
    # The prints show each request can land on a fresh thread, i.e. an
    # empty thread-local, which is exactly how the backend leaks.
    if hasattr(_local, "attr"):
        print("_local has attr", _local.attr)
    else:
        _local.attr = threading.get_ident()
        print("_local has no attr, store", _local.attr, "as attr on _local")
    return HttpResponse("foo")
```
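To exercise the view from many short-lived request threads, a hypothetical driver (assuming the Django dev server on localhost:8000, a matching URL route, and the task lines above uncommented):

```python
# Hypothetical load driver for the repro endpoint.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URL = "http://localhost:8000/run-task/"  # adjust to your URLconf

def hit(_):
    with urlopen(URL) as resp:
        return resp.status

with ThreadPoolExecutor(max_workers=16) as pool:
    statuses = list(pool.map(hit, range(200)))
print("responses received:", len(statuses))
```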
What Broke
Connections to Redis were not being closed, causing resource exhaustion.
Why It Broke
The result backend was kept in thread-local storage, so each request-handling thread created its own backend instance and Redis connection pool; those connections were never closed when the thread exited.
Fix Options (Details)
Option A — Upgrade to fixed release (safe default, recommended)
pip install celery==5.3.0b2
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Option C — Workaround (temporary)
Setting the `timeout` directive in redis.conf ([see here](https://redis.io/topics/clients#client-timeouts)) seems to make Redis eventually close the stale connections from its side; a sample setting follows below.
Use only if you cannot change versions today. Treat this as a stopgap and remove it once upgraded.
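For example (value in seconds; 0, the default, disables idle timeouts):

```
# redis.conf: close client connections idle for more than 5 minutes
timeout 300
```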
Option D — Guard side-effects with OnceOnly (guardrail for side-effects)
Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.
- Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
- Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
- Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
- This does NOT fix data corruption; it only prevents duplicate side-effects.
Example snippet:
```python
from onceonly import OnceOnly
import os

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)

def process_webhook(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Safe to execute the side-effect exactly once.
    return handle_event(event_id)
```
Fix reference: https://github.com/celery/celery/pull/8058
First fixed release: 5.3.0b2
Last verified: 2026-02-09. Validate in your environment.
When NOT to Use This Fix
- Do not enable cross-thread backend sharing if your result backend is not thread-safe.
- Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
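One way to verify, assuming redis-py and a local Redis instance, is to watch `connected_clients` while the repro runs; the count should climb on the broken version and stay flat after the fix:

```python
# Watch Redis's client count while the repro endpoint is being hit.
# Assumes redis-py is installed and Redis listens on localhost:6379.
import time

import redis

r = redis.Redis()
for _ in range(12):
    print("connected_clients:", r.info("clients")["connected_clients"])
    time.sleep(5)
```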
Prevention
- Capture the exact failing error string in logs and tests so you can reproduce via a minimal script.
- Pin production dependencies and upgrade only with a reproducible test that hits the failing path; a sketch follows below.
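A minimal regression-test sketch for the second point, assuming redis-py, a running worker, and the repro's `debug_task` (names are illustrative):

```python
# Regression guard: fail if result-backend connections grow without bound.
import redis

from redis_leak.celery import debug_task  # the repro project's task

def test_result_backend_does_not_leak_connections():
    r = redis.Redis()
    before = r.info("clients")["connected_clients"]
    for _ in range(50):
        debug_task.apply_async().get(timeout=10)
    after = r.info("clients")["connected_clients"]
    # A small fixed-size pool is fine; unbounded growth is the bug.
    assert after - before < 10
```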
Version Compatibility Table
| Version | Status |
|---|---|
| 5.2.x and earlier | Affected (leak reproduced on 5.2.0 per the issue thread) |
| 5.3.0b2 and later | Fixed |
Related Issues
No related fixes found.
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.