
The Fix

pip install redis==7.1.0

Based on closed redis/redis-py issue #3560; the fix PR and commit are linked below.

Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.
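The note above can be made concrete: with per-attempt socket timeouts and capped exponential backoff, the worst-case latency of one logical call is bounded and easy to compute. A minimal sketch, assuming redis-py's backoff formula is min(cap, base * 2**failures) (an assumption worth checking against your installed version):

```python
def worst_case_budget(retries: int, cap: float, base: float, op_timeout: float) -> float:
    """Upper bound in seconds for one logical call: every attempt may burn the
    full socket timeout, plus capped exponential backoff between attempts."""
    backoff = sum(min(cap, base * 2 ** i) for i in range(retries))
    return (retries + 1) * op_timeout + backoff

# The repro's settings: Retry(ExponentialBackoff(10, 0.5), 5) with 1s timeouts
print(worst_case_budget(retries=5, cap=10, base=0.5, op_timeout=1.0))  # 21.5
```

If p99 latency approaches this bound, every attempt is likely timing out, which is the retry-storm precursor the note warns about.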

PR diff (excerpt):
@@ -295,13 +295,18 @@ async def connect(self):
         await self.connect_check_health(check_health=True)

-    async def connect_check_health(self, check_health: bool = True):
+    async def connect_check_health(
+        self, check_health: bool = True, retry_socket_connect: bool = True
fix.md
Option A — Upgrade to fixed release
pip install redis==7.1.0
When NOT to use: Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.

Why This Fix Works in Production

  • Trigger: `SentinelManagedConnection` delays new-master detection after failover, risking data loss
  • Mechanism: due to retry-loop behavior, the connection logic fails to detect the new master after the old master goes down
  • Why the fix works: it makes `SentinelManagedConnection` detect the new master promptly after failover, closing the data-loss window (first fixed release: 7.1.0).
Production impact:
  • If left unfixed, this can cause silent data inconsistencies that propagate (bad cache entries, incorrect downstream decisions).

Why This Breaks in Prod

  • The connection logic fails to detect a new master after the old master goes down, due to retry-loop behavior.
  • Production symptom (often without a traceback): `SentinelManagedConnection` delays new-master detection and risks data loss.

Proof / Evidence

  • GitHub issue: #3560
  • Fix PR: https://github.com/redis/redis-py/pull/3601
  • First fixed release: 7.1.0
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-08
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)
  • Own content ratio: 0.50

Verified Execution

We executed the runnable minimal repro in a temporary environment and captured exit codes + logs.

  • Status: PASS
  • Ran: 2026-02-11T16:52:29Z
  • Package: redis
  • Fixed: 7.1.0
  • Mode: fixed_only
  • Outcome: ok
Logs
affected (exit=None)
fixed (exit=0)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“Hi @ManelCoutinhoSensei! We'll have a look at it shortly.”
@petyaslavova · 2025-03-17 · source
“Hi @petyaslavova, I opened a PR for this issue as well”
@ManelCoutinhoSensei · 2025-04-16 · source

Failure Signature (Search String)

  • `SentinelManagedConnection` delays new master detection and risks data loss
  • Beyond the unexpected delay, this behavior can cause data integrity issues:
Copy-friendly signature
signature.txt
Failure Signature
-----------------
`SentinelManagedConnection` delays new master detection and risks data loss
Beyond the unexpected delay, this behavior can cause data integrity issues:

Error Message

Signature-only (no traceback captured)
error.txt
Error Message
-------------
`SentinelManagedConnection` delays new master detection and risks data loss
Beyond the unexpected delay, this behavior can cause data integrity issues:

Minimal Reproduction

repro.py
import time
from threading import Thread

import docker
from redis.backoff import ExponentialBackoff
from redis.retry import Retry
from redis.sentinel import Sentinel


def restart_container(delay, container):
    time.sleep(delay)
    container.restart()


def problems():
    for el in docker.from_env().containers.list(all=True):
        if "redis-node-1" == el.name:
            container_to_be_stopped = el
    sentinel = Sentinel(
        [("localhost", 36380), ("localhost", 36381)],
        socket_timeout=1,
        socket_connect_timeout=1,
        health_check_interval=5,
        retry=Retry(ExponentialBackoff(10, 0.5), 5),
    )
    redis = sentinel.master_for("mymaster")
    pipeline = redis.pipeline()
    pipeline.hset("myhash", "name", "john doe")
    print("Shutdown master")
    container_to_be_stopped.stop()
    stop_thread = Thread(target=restart_container, args=(5, container_to_be_stopped))
    stop_thread.start()
    resp = pipeline.execute()
    stop_thread.join()
    print(resp)
    # hgetall("myhash") afterwards returns nothing, even though resp is [1],
    # i.e. the client reports one field-value pair written.

What Broke

During failover, a write can be acknowledged against the stale master and then silently lost, causing data integrity issues and potential data loss.

Why It Broke

The connection logic fails to detect a new master after the old master goes down, due to retry-loop behavior.
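In essence, the bug amounts to retrying against a stale master address. A toy sketch (not the actual redis-py code; all names here are illustrative) of the re-resolve-on-retry behavior the fix moves toward:

```python
def failover_aware_call(resolve_master, do_request, attempts=3):
    """Sketch of the fixed behavior: re-resolve the master address before each
    retry, instead of retrying the cached (possibly dead) address."""
    last_exc = None
    for _ in range(attempts):
        addr = resolve_master()          # ask Sentinel again on every attempt
        try:
            return do_request(addr)
        except ConnectionError as exc:
            last_exc = exc               # old master is gone; loop re-resolves
    raise last_exc

# Simulated failover: the master moves from node-1 to node-2 after one attempt
addresses = iter([("node-1", 6379), ("node-2", 6379), ("node-2", 6379)])

def resolve_master():
    return next(addresses)

def do_request(addr):
    if addr[0] == "node-1":
        raise ConnectionError("node-1 is down")
    return f"written via {addr[0]}"

print(failover_aware_call(resolve_master, do_request))  # written via node-2
```

A client that retries the cached address instead would exhaust all attempts against node-1 and never reach the promoted master.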

Fix Options (Details)

Option A — Upgrade to the fixed release (safe default, recommended)

pip install redis==7.1.0

When NOT to use: Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.

Fix reference: https://github.com/redis/redis-py/pull/3601

First fixed release: 7.1.0

Last verified: 2026-02-08. Validate in your environment.


When NOT to Use This Fix

  • Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.

Verify Fix

verify
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
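Before re-running the repro, confirm which release the broken environment is actually importing. A small helper (illustrative, not part of redis-py) that compares a version string against the first fixed release:

```python
def predates_fix(version: str, first_fixed: str = "7.1.0") -> bool:
    """True if `version` is older than the first release containing the fix."""
    def as_tuple(v):
        return tuple(int(part) for part in v.split(".")[:3])
    return as_tuple(version) < as_tuple(first_fixed)

# e.g. check the running client with: predates_fix(redis.__version__)
print(predates_fix("5.2.1"))  # True
print(predates_fix("7.1.0"))  # False
```

If this returns True for your environment, expect the repro to exhibit the failover bug; after `pip install redis==7.1.0` it should return False.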


Prevention

  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
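The second bullet can be sketched library-agnostically: wrap calls so every retry reports an attempt count through a hook that can feed metrics or alerting (all names here are illustrative, not a redis-py API):

```python
import time

def call_with_retries(fn, attempts=5, base_delay=0.5, on_retry=None):
    """Retry `fn` with linear backoff, reporting each failed attempt via `on_retry`."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                raise
            if on_retry:
                on_retry(attempt, exc)   # hook: increment a counter, log, alert
            time.sleep(base_delay * attempt)

# Usage: record attempts so a spike becomes visible on a dashboard
counts = []
outcomes = iter([RuntimeError("down"), RuntimeError("down"), "ok"])

def flaky_call():
    v = next(outcomes)
    if isinstance(v, Exception):
        raise v
    return v

result = call_with_retries(flaky_call, base_delay=0.01,
                           on_retry=lambda a, e: counts.append(a))
print(result, counts)  # ok [1, 2]
```

A sustained rise in the recorded attempt counts is exactly the early-warning signal the bullet describes: the dependency is slowing down before it fails outright.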

Version Compatibility Table

Version   Status
7.1.0     Fixed

Related Issues

No related fixes found.

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.