The Fix
pip install redis==7.1.1
Based on closed redis/redis-py issue #3555 · PR/commit linked
Production note: This usually shows up under retries/timeouts. Treat it as a side-effect risk until you can verify behavior with a canary + real traffic.
@@ -296,7 +296,14 @@ def set_parser(self, parser_class: Type[BaseParser]) -> None:
async def connect(self):
"""Connects to the Redis server if not already connected"""
- await self.connect_check_health(check_health=True)
+ # try once the socket connect with the handshake, retry the whole
+ # connect/handshake flow based on retry policy
ping_parts = self._command_packer.pack("PING")
for part in ping_parts:
    sock.sendall(part)
response = sock.recv(7)
if not str_if_bytes(response).startswith("+PONG"):
    raise OSError(f"Redis handshake failed: unexpected response {response!r}")
Option A: Upgrade to fixed release
pip install redis==7.1.1
When NOT to use: This fix is not suitable if the connection logic does not require retries.
Why This Fix Works in Production
- Trigger: the Redis container is paused (not stopped) while the client tries to connect.
- Mechanism: the retry mechanism gets stuck at the first attempt and repeats indefinitely instead of counting up to the specified maximum number of retries.
- Why the fix works: it adds retries for both the socket connection and the initial connection handshake, so the whole connect/handshake flow is retried according to the retry policy (first fixed release: 7.1.1).
- If left unfixed, retry loops can amplify load and turn a small outage into a cascade (thundering herd).
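The risk in the last bullet is usually contained by bounding the number of attempts and randomizing the delay between them. The following is a minimal sketch of that idea, not redis-py's own `Retry` implementation; `call_with_retries` and its parameters are hypothetical names chosen for illustration.

```python
import random
import time


def call_with_retries(func, retries=3, base=0.1, cap=2.0, sleep=time.sleep):
    """Call func(); on OSError, retry at most `retries` more times.

    Hypothetical helper for illustration only (not part of redis-py).
    """
    attempt = 0
    while True:
        try:
            return func()
        except OSError:
            if attempt >= retries:
                raise  # retry budget exhausted: surface the failure
            # Full jitter: a random delay in [0, min(cap, base * 2**attempt)]
            # so that many clients do not retry in lockstep.
            delay = random.uniform(0, min(cap, base * (2 ** attempt)))
            sleep(delay)
            attempt += 1
```

The `sleep` parameter is injectable so that the backoff schedule can be exercised in tests without real waiting.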
Why This Breaks in Prod
- Production symptom (often without a traceback): when the Redis container is paused, retries get stuck at the first attempt and repeat indefinitely.
Proof / Evidence
- GitHub issue: #3555
- Fix PR: https://github.com/redis/redis-py/pull/3863
- First fixed release: 7.1.1
- Reproduced locally: No (not executed)
- Last verified: 2026-02-08
- Confidence: 0.80
- Did this fix it?: Yes (upstream fix exists)
- Own content ratio: 0.55
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“Hi! We would have a look on this in meantime”
“Hey @vladvildanov , I opened a PR with a solution to problem mentioned in the comment. if you want I can add you as a…”
“Hi @ManelCoutinhoSensei, can you provide small example how you end up with the provided log of the performed retries? I tried to reproduce it, and…”
“Following up on this issue, I've noticed that it also occurs when a container is still booting up in the background, making the port available…”
Failure Signature (Search String)
- Retry Mechanism Fails When Redis Container is Paused
- When the Redis container is **paused** (not stopped), the connection attempt should fail, triggering the retry mechanism. The retry number should increase monotonically until the specified maximum number of retries is reached.
Error Message
Signature-only (no traceback captured)
Minimal Reproduction
ping_parts = self._command_packer.pack("PING")
for part in ping_parts:
    sock.sendall(part)
response = sock.recv(7)
if not str_if_bytes(response).startswith("+PONG"):
    raise OSError(f"Redis handshake failed: unexpected response {response!r}")
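The excerpt above relies on redis-py internals (`self._command_packer`, `sock`), so it cannot run on its own. As a rough standalone illustration of the same PING/PONG check, the sketch below uses only the standard library. It assumes that a paused container behaves like a port that accepts TCP connections but never replies; `ping_handshake` and `start_silent_listener` are hypothetical names, and the silent listener is a stand-in for pausing a real container, not a faithful reproduction of the issue.

```python
import socket


def ping_handshake(host, port, timeout=1.0):
    """Connect, send a RESP-encoded PING, and require a +PONG reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"*1\r\n$4\r\nPING\r\n")
        response = sock.recv(7)  # b"+PONG\r\n" is 7 bytes
        if not response.startswith(b"+PONG"):
            raise OSError(
                f"Redis handshake failed: unexpected response {response!r}"
            )


def start_silent_listener():
    """Assumed stand-in for a paused container: the port accepts TCP
    connections (queued in the listen backlog) but nothing ever replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    return server


if __name__ == "__main__":
    server = start_silent_listener()
    host, port = server.getsockname()
    try:
        ping_handshake(host, port, timeout=0.5)
    except OSError as exc:  # a socket timeout is an OSError subclass
        print(f"handshake failed: {exc!r}")
    finally:
        server.close()
```

The point of the sketch is that the TCP connect succeeds while the handshake does not, which is why a retry policy that only covers the socket connect can miss this failure mode.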
What Broke
Retry mechanism gets stuck at the first attempt, repeating indefinitely.
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
pip install redis==7.1.1
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Fix reference: https://github.com/redis/redis-py/pull/3863
First fixed release: 7.1.1
Last verified: 2026-02-08. Validate in your environment.
When NOT to Use This Fix
- This fix is not suitable if the connection logic does not require retries.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
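Before re-running, it can help to confirm which redis-py version is actually installed in the environment under test. A minimal sketch using `importlib.metadata` follows; `parse_version` and `has_fix` are hypothetical helpers, and the comparison assumes plain numeric release versions (a pre-release suffix is ignored, so a pre-release of the fixed version would be reported as fixed).

```python
import re
from importlib import metadata

FIRST_FIXED = (7, 1, 1)  # first fixed release, per the notes above


def parse_version(text):
    """Turn '7.1.1' into (7, 1, 1); a missing patch number counts as 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(group or 0) for group in match.groups())


def has_fix(version_text):
    """True if the version is at or after the first fixed release."""
    return parse_version(version_text) >= FIRST_FIXED


if __name__ == "__main__":
    try:
        installed = metadata.version("redis")
    except metadata.PackageNotFoundError:
        print("redis is not installed in this environment")
    else:
        status = "includes the fix" if has_fix(installed) else "predates the fix"
        print(f"redis {installed} {status}")
```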
Prevention
- Add a TLS smoke test that performs a real handshake in CI (include CA bundle validation and hostname checks).
- Alert on handshake failures by error string and endpoint to catch cert/CA changes quickly.
- Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
- Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
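The last bullet can be sketched as a small in-process recorder. `RetryStats` is a hypothetical class, not a redis-py API; in practice the counts would go to whatever metrics client is already in use. The monotonicity check mirrors the failure signature of this issue, where the retry number stops increasing.

```python
from collections import Counter


class RetryStats:
    """Hypothetical recorder for retry attempts and their reasons."""

    def __init__(self, alert_threshold=5):
        self.alert_threshold = alert_threshold
        self.by_reason = Counter()
        self.attempts = []

    def record(self, attempt, exc):
        """Store the attempt number and count the exception type."""
        self.attempts.append(attempt)
        self.by_reason[type(exc).__name__] += 1

    def should_alert(self):
        """True once the total number of retries reaches the threshold."""
        return sum(self.by_reason.values()) >= self.alert_threshold

    def attempts_increase(self):
        """True if every recorded attempt number is larger than the last.

        A False result matches the symptom of retries stuck at one attempt.
        """
        return all(b > a for a, b in zip(self.attempts, self.attempts[1:]))
```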
Version Compatibility Table
| Version | Status |
|---|---|
| 7.1.1 | Fixed |
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.