
The Fix

pip install redis==7.1.0

Based on closed redis/redis-py issue #3560; the fix PR and commit are linked below.

Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.
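The note above can be made concrete: with per-attempt socket timeouts and capped exponential backoff, the worst-case latency of one logical call is bounded and easy to compute. A minimal sketch, assuming redis-py's backoff formula is min(cap, base * 2**failures) (an assumption worth checking against your installed version):

```python
def worst_case_budget(retries: int, cap: float, base: float, op_timeout: float) -> float:
    """Upper bound in seconds for one logical call: every attempt may burn the
    full socket timeout, plus capped exponential backoff between attempts."""
    backoff = sum(min(cap, base * 2 ** i) for i in range(retries))
    return (retries + 1) * op_timeout + backoff

# The repro's settings: Retry(ExponentialBackoff(10, 0.5), 5) with 1s timeouts
print(worst_case_budget(retries=5, cap=10, base=0.5, op_timeout=1.0))  # 21.5
```

If p99 latency approaches this bound, every attempt is likely timing out, which is the retry-storm precursor the note warns about.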

PR diff (excerpt):
@@ -295,13 +295,18 @@ async def connect(self):
         await self.connect_check_health(check_health=True)

-    async def connect_check_health(self, check_health: bool = True):
+    async def connect_check_health(
+        self, check_health: bool = True, retry_socket_connect: bool = True
fix.md
Option A — Upgrade to fixed release
pip install redis==7.1.0
When NOT to use: Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.

Why This Fix Works in Production

  • Trigger: `SentinelManagedConnection` delays new-master detection after failover, risking data loss
  • Mechanism: due to retry-loop behavior, the connection logic fails to detect the new master after the old master goes down
  • Why the fix works: it makes `SentinelManagedConnection` detect the new master promptly after failover, closing the data-loss window (first fixed release: 7.1.0).
Production impact:
  • If left unfixed, this can cause silent data inconsistencies that propagate (bad cache entries, incorrect downstream decisions).

Why This Breaks in Prod

  • The connection logic fails to detect a new master after the old master goes down, due to retry-loop behavior.
  • Production symptom (often without a traceback): `SentinelManagedConnection` delays new-master detection and risks data loss.

Proof / Evidence

  • GitHub issue: #3560
  • Fix PR: https://github.com/redis/redis-py/pull/3601
  • First fixed release: 7.1.0
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-08
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)
  • Own content ratio: 0.50

Verified Execution

We executed the runnable minimal repro in a temporary environment and captured exit codes + logs.

  • Status: PASS
  • Ran: 2026-02-11T16:52:29Z
  • Package: redis
  • Fixed: 7.1.0
  • Mode: fixed_only
  • Outcome: ok
Logs
affected (exit=None)
fixed (exit=0)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“Hi @ManelCoutinhoSensei! We'll have a look at it shortly.”
@petyaslavova · 2025-03-17 · source
“Hi @petyaslavova, I opened a PR for this issue as well”
@ManelCoutinhoSensei · 2025-04-16 · source

Failure Signature (Search String)

  • `SentinelManagedConnection` delays new master detection and risks data loss
  • Beyond the unexpected delay, this behavior can cause data integrity issues:
Copy-friendly signature
signature.txt
Failure Signature
-----------------
`SentinelManagedConnection` delays new master detection and risks data loss
Beyond the unexpected delay, this behavior can cause data integrity issues:

Error Message

Signature-only (no traceback captured)
error.txt
Error Message
-------------
`SentinelManagedConnection` delays new master detection and risks data loss
Beyond the unexpected delay, this behavior can cause data integrity issues:

Minimal Reproduction

repro.py
import time
from threading import Thread

import docker
from redis.backoff import ExponentialBackoff
from redis.retry import Retry
from redis.sentinel import Sentinel


def restart_container(delay, container):
    time.sleep(delay)
    container.restart()


def problems():
    for el in docker.from_env().containers.list(all=True):
        if "redis-node-1" == el.name:
            container_to_be_stopped = el
    sentinel = Sentinel(
        [("localhost", 36380), ("localhost", 36381)],
        socket_timeout=1,
        socket_connect_timeout=1,
        health_check_interval=5,
        retry=Retry(ExponentialBackoff(10, 0.5), 5),
    )
    redis = sentinel.master_for("mymaster")
    pipeline = redis.pipeline()
    pipeline.hset("myhash", "name", "john doe")
    print("Shutdown master")
    container_to_be_stopped.stop()
    stop_thread = Thread(target=restart_container, args=(5, container_to_be_stopped))
    stop_thread.start()
    resp = pipeline.execute()
    stop_thread.join()
    print(resp)
    # hgetall("myhash") afterwards returns nothing, even though resp is [1],
    # i.e. the client reports one field-value pair written.

What Broke

During failover, a write can be acknowledged against the stale master and then silently lost, causing data integrity issues and potential data loss.

Why It Broke

The connection logic fails to detect a new master after the old master goes down, due to retry-loop behavior.
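In essence, the bug amounts to retrying against a stale master address. A toy sketch (not the actual redis-py code; all names here are illustrative) of the re-resolve-on-retry behavior the fix moves toward:

```python
def failover_aware_call(resolve_master, do_request, attempts=3):
    """Sketch of the fixed behavior: re-resolve the master address before each
    retry, instead of retrying the cached (possibly dead) address."""
    last_exc = None
    for _ in range(attempts):
        addr = resolve_master()          # ask Sentinel again on every attempt
        try:
            return do_request(addr)
        except ConnectionError as exc:
            last_exc = exc               # old master is gone; loop re-resolves
    raise last_exc

# Simulated failover: the master moves from node-1 to node-2 after one attempt
addresses = iter([("node-1", 6379), ("node-2", 6379), ("node-2", 6379)])

def resolve_master():
    return next(addresses)

def do_request(addr):
    if addr[0] == "node-1":
        raise ConnectionError("node-1 is down")
    return f"written via {addr[0]}"

print(failover_aware_call(resolve_master, do_request))  # written via node-2
```

A client that retries the cached address instead would exhaust all attempts against node-1 and never reach the promoted master.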

Fix Options (Details)

Option A — Upgrade to the fixed release (safe default, recommended)

pip install redis==7.1.0

When NOT to use: Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.

Fix reference: https://github.com/redis/redis-py/pull/3601

First fixed release: 7.1.0

Last verified: 2026-02-08. Validate in your environment.


When NOT to Use This Fix

  • Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.

Verify Fix

verify
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
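Before re-running the repro, confirm which release the broken environment is actually importing. A small helper (illustrative, not part of redis-py) that compares a version string against the first fixed release:

```python
def predates_fix(version: str, first_fixed: str = "7.1.0") -> bool:
    """True if `version` is older than the first release containing the fix."""
    def as_tuple(v):
        return tuple(int(part) for part in v.split(".")[:3])
    return as_tuple(version) < as_tuple(first_fixed)

# e.g. check the running client with: predates_fix(redis.__version__)
print(predates_fix("5.2.1"))  # True
print(predates_fix("7.1.0"))  # False
```

If this returns True for your environment, expect the repro to exhibit the failover bug; after `pip install redis==7.1.0` it should return False.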


Prevention

  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
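The second bullet can be sketched library-agnostically: wrap calls so every retry reports an attempt count through a hook that can feed metrics or alerting (all names here are illustrative, not a redis-py API):

```python
import time

def call_with_retries(fn, attempts=5, base_delay=0.5, on_retry=None):
    """Retry `fn` with linear backoff, reporting each failed attempt via `on_retry`."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                raise
            if on_retry:
                on_retry(attempt, exc)   # hook: increment a counter, log, alert
            time.sleep(base_delay * attempt)

# Usage: record attempts so a spike becomes visible on a dashboard
counts = []
outcomes = iter([RuntimeError("down"), RuntimeError("down"), "ok"])

def flaky_call():
    v = next(outcomes)
    if isinstance(v, Exception):
        raise v
    return v

result = call_with_retries(flaky_call, base_delay=0.01,
                           on_retry=lambda a, e: counts.append(a))
print(result, counts)  # ok [1, 2]
```

A sustained rise in the recorded attempt counts is exactly the early-warning signal the bullet describes: the dependency is slowing down before it fails outright.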

Version Compatibility Table

Version   Status
7.1.0     Fixed

Related Issues

No related fixes found.

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.