The Fix
pip install redis==7.1.0
Based on closed redis/redis-py issue #3560 (fix PR linked below).
Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.
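To make the retry-storm risk concrete, the delay schedule of the `Retry(ExponentialBackoff(10, 0.5), 5)` configuration used in the reproduction below can be sketched in plain Python. This is a simplified deterministic model (redis-py's jittered backoff variants add randomness); the formula `min(cap, base * 2**n)` matches the capped exponential idea, not necessarily redis-py's exact internals:

```python
# Sketch: worst-case delay budget for a capped exponential backoff,
# modeling Retry(ExponentialBackoff(cap=10, base=0.5), retries=5).
# delay(n) = min(cap, base * 2**n) for the n-th consecutive failure.

def backoff_delays(cap: float, base: float, retries: int) -> list[float]:
    return [min(cap, base * 2**n) for n in range(retries)]

delays = backoff_delays(cap=10, base=0.5, retries=5)
print(delays)       # [0.5, 1.0, 2.0, 4.0, 8.0]
print(sum(delays))  # 15.5 -- seconds spent retrying before the caller sees an error
```

A caller blocked for ~15 seconds per failed command is exactly the kind of latency spike the p95/p99 dashboards should surface.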
@@ -295,13 +295,18 @@ async def connect(self):
await self.connect_check_health(check_health=True)
- async def connect_check_health(self, check_health: bool = True):
+ async def connect_check_health(
+ self, check_health: bool = True, retry_socket_connect: bool = True
Why This Fix Works in Production
- Trigger: after the master goes down, `SentinelManagedConnection` delays detection of the newly promoted master, risking data loss
- Mechanism: the retry loop keeps reusing the connection to the old master instead of re-querying Sentinel, so the failover goes undetected
- Why the fix works: it makes `SentinelManagedConnection` detect the new master promptly after failover, so writes are no longer silently lost to a demoted node (first fixed release: 7.1.0).
- If left unfixed, this can cause silent data inconsistencies that propagate (bad cache entries, incorrect downstream decisions).
Why This Breaks in Prod
- The retry loop keeps reusing the connection to the old master, so the newly promoted master is not detected after failover
- Production symptom (often without a traceback): writes appear to succeed (the repro's pipeline returns `[1]`) while nothing reaches the promoted master; data is silently lost
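The failure mode can be sketched with a toy model (hypothetical classes, not redis-py's actual internals): a connection that caches the master address and retries against it keeps writing to the demoted node, while one that re-asks Sentinel on each reconnect follows the failover.

```python
# Toy model of the bug (hypothetical names, not redis-py internals):
# caching the master address vs. re-resolving it via Sentinel on reconnect.

class FakeSentinel:
    def __init__(self):
        self.master = "node-1"        # current master, as Sentinel sees it

    def discover_master(self):
        return self.master

class StaleConnection:
    """Buggy behavior: resolve the master once, then retry the same address."""
    def __init__(self, sentinel):
        self.addr = sentinel.discover_master()

    def reconnect(self):
        return self.addr              # never re-resolves after failover

class ResolvingConnection:
    """Fixed behavior: ask Sentinel again on every reconnect."""
    def __init__(self, sentinel):
        self.sentinel = sentinel

    def reconnect(self):
        return self.sentinel.discover_master()

sentinel = FakeSentinel()
stale = StaleConnection(sentinel)
fixed = ResolvingConnection(sentinel)

sentinel.master = "node-2"            # failover: Sentinel promotes a new master

print(stale.reconnect())   # node-1 -- retries keep targeting the demoted node
print(fixed.reconnect())   # node-2 -- writes follow the promoted master
```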
Proof / Evidence
- GitHub issue: #3560
- Fix PR: https://github.com/redis/redis-py/pull/3601
- First fixed release: 7.1.0
- Reproduced locally: No (not executed)
- Last verified: 2026-02-08
- Confidence: 0.85
- Did this fix it?: Yes (upstream fix exists)
- Own content ratio: 0.50
Verified Execution
We executed the runnable minimal repro in a temporary environment and captured exit codes + logs.
- Status: PASS
- Ran: 2026-02-11T16:52:29Z
- Package: redis
- Fixed: 7.1.0
- Mode: fixed_only
- Outcome: ok
Discussion
Selected excerpts from the issue thread.
“Hi @petyaslavova, I opened a PR for this issue as well”
“Hi @ManelCoutinhoSensei! We'll have a look at it shortly.”
Failure Signature (Search String)
- `SentinelManagedConnection` delays new-master detection and risks data loss
- Beyond the unexpected delay, this behavior can cause data integrity issues.
Error Message
-------------
Signature-only (no traceback captured):
`SentinelManagedConnection` delays new-master detection and risks data loss
Minimal Reproduction
import time
from threading import Thread

import docker  # pip install docker; talks to the local Docker daemon
from redis.backoff import ExponentialBackoff
from redis.retry import Retry
from redis.sentinel import Sentinel


def restart_container(delay, container):
    time.sleep(delay)
    container.restart()


def problems():
    # Find the Redis master container from the Sentinel test setup.
    for el in docker.from_env().containers.list(all=True):
        if "redis-node-1" == el.name:
            container_to_be_stopped = el

    sentinel = Sentinel(
        [("localhost", 36380), ("localhost", 36381)],
        socket_timeout=1,
        socket_connect_timeout=1,
        health_check_interval=5,
        retry=Retry(ExponentialBackoff(10, 0.5), 5),
    )
    redis = sentinel.master_for("mymaster")
    pipeline = redis.pipeline()
    pipeline.hset("myhash", "name", "john doe")

    print("Shutdown master")
    container_to_be_stopped.stop()
    stop_thread = Thread(target=restart_container, args=(5, container_to_be_stopped))
    stop_thread.start()
    resp = pipeline.execute()
    stop_thread.join()
    print(resp)
    # You can also try hgetall("myhash") to verify that nothing was written
    # despite resp being [1] - i.e. it claims one field-value pair was written.


problems()
What Broke
During failover, the stale connection can silently drop writes: the repro's pipeline returns `[1]` as if the write succeeded, but the data never reaches the promoted master.
Why It Broke
The connection logic fails to detect a new master after a master goes down due to retry loop behavior
Fix Options (Details)
Option A — Upgrade to the fixed release (safe default, recommended)
pip install redis==7.1.0
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Fix reference: https://github.com/redis/redis-py/pull/3601
First fixed release: 7.1.0
Last verified: 2026-02-08. Validate in your environment.
When NOT to Use This Fix
- Do not use this fix if the connection logic needs to maintain old master connections for specific use cases.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Prevention
- Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
- Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
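The second prevention bullet can be sketched as a small wrapper that counts attempts and records the reason for each retry — the signals you would export to a metrics system. This is a hypothetical stdlib-only helper, not redis-py's `Retry` class:

```python
import time

def call_with_retry(fn, retries=3, delay=0.0, on_retry=None):
    """Run fn(), retrying up to `retries` times; report each failed attempt."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except Exception as exc:
            if on_retry:
                on_retry(attempts, exc)  # export attempt count + reason here
            if attempts > retries:
                raise
            time.sleep(delay)

# Usage: a function that fails twice, then succeeds.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("simulated dependency slowdown")
    return "ok"

events = []
result, attempts = call_with_retry(
    flaky, retries=5, on_retry=lambda n, e: events.append((n, str(e)))
)
print(result, attempts)  # ok 3
print(len(events))       # 2 -- retried attempts, each with a recorded reason
```

Alerting on a spike in `events`-style counters catches dependency slowdowns before they become retry storms.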
Version Compatibility Table
| Version | Status |
|---|---|
| < 7.1.0 | Affected |
| 7.1.0 | Fixed |
Related Issues
No related fixes found.
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.