The Fix
Changes the exception handling in the Connection class to catch Exception instead of BaseException, so that BaseException subclasses no longer trigger unintended disconnections.
Based on closed redis/redis-py issue #2499 · PR/commit linked
Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.
@@ -502,8 +502,6 @@ async def read_from_socket(
             # return True to indicate that data was read.
             return True
-        except asyncio.CancelledError:
-            raise
         except (socket.timeout, asyncio.TimeoutError):

@pytest.fixture(scope="session")
def event_loop():
    policy = asyncio.get_event_loop_policy()
    loop = policy.new_event_loop()
    yield loop
    loop.close()
Option A: Apply the official fix
Changes the exception handling in the Connection class to catch Exception instead of BaseException, so that BaseException subclasses no longer trigger unintended disconnections.
When NOT to use: This fix should not be applied if the application relies on BaseExceptions for critical error handling.
Why This Fix Works in Production
- Trigger: result = Redis.parse_response(
- Mechanism: Exception handling in the Connection class was too broad (BaseException rather than Exception), which led to unintended disconnections.
- If left unfixed, this can cause silent data inconsistencies that propagate (bad cache entries, incorrect downstream decisions).
Why This Breaks in Prod
- Shows up under Python 3.9 in real deployments (not just unit tests).
- Exception handling in the Connection class was too broad (BaseException rather than Exception), which led to unintended disconnections.
- Surfaces as: File "site-packages/redis/client.py", line 3401, in parse_response
Proof / Evidence
- GitHub issue: #2499
- Fix PR: https://github.com/redis/redis-py/pull/2104
- Reproduced locally: No (not executed)
- Last verified: 2026-02-12
- Confidence: 0.60
- Did this fix it?: Yes (upstream fix exists)
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“Digging through history: - this was first reported in #360 in 2013 - then regressed in 3.0.0 - then again reported in #1128 and fixed…”
“> Yes, this is the reason why I removed the "can_read()" implementation from the Async code”
“> Feel free to take a look here for the impl that led to this error (to reproduce, simply uninstall hiredis when running pytest): Andrew-Chen-Wang/django-async-redis#5…”
“Ran the tests and didn't see the problem”
Failure Signature (Search String)
- result = Redis.parse_response(
Error Message
  File "site-packages/redis/client.py", line 3401, in parse_response
    result = Redis.parse_response(
  File "site-packages/redis/client.py", line 774, in parse_response
    return self.response_callbacks[command_name](response, **options)
  File "site-packages/redis/client.py", line 528, in <lambda>
    'SET': lambda r: r and nativestr(r) == 'OK',
  File "site-packages/redis/_compat.py", line 135, in nativestr
    return x if isinstance(x, str) else x.decode('utf-8', 'replace')
AttributeError: 'int' object has no attribute 'decode'
Stack trace
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File ".../lib/python3.9/site-packages/redis/commands/core.py", line 2512, in brpop
    return self.execute_command("BRPOP", *keys)
  File ".../lib/python3.9/site-packages/redis/client.py", line 1258, in execute_command
    return conn.retry.call_with_retry(
  File ".../lib/python3.9/site-packages/redis/retry.py", line 46, in call_with_retry
    return do()
  File ".../lib/python3.9/site-packages/redis/client.py", line 1259, in <lambda>
    lambda: self._send_command_parse_response(
  File ".../lib/python3.9/site-packages/redis/client.py", line 1235, in _send_command_parse_response
    return self.parse_response(conn, command_name, **options)
  File ".../lib/python3.9/site-packages/redis/client.py", line 1275, in parse_response
    response = connection.read_response()
  File ".../lib/python3.9/site-packages/redis/connection.py", line 812, in read_response
    response = self._parser.read_response(disable_decoding=disable_decoding)
  File ".../lib/python3.9/site-packages/redis/connection.py", line 318, in read_response
    raw = self._buffer.readline()
  File ".../lib/python3.9/site-packages/redis/connection.py", line 249, in readline
    self._read_from_socket()
  File ".../lib/python3.9/site-packages/redis/connection.py", line 192, in _read_from_socket
    data = self._sock.recv(socket_read_size)
Keyboa
... (truncated) ...
Minimal Reproduction
import asyncio

import pytest


@pytest.fixture(scope="session")
def event_loop():
    # Session-scoped event loop shared by all async tests.
    policy = asyncio.get_event_loop_policy()
    loop = policy.new_event_loop()
    yield loop
    loop.close()
Environment
- Python: 3.9
What Broke
Timeouts during socket I/O caused the connection to enter an undefined state, leading to parsing errors.
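To make the link between an abandoned read and the later parsing error concrete, here is a minimal, library-free sketch. `FakeConnection` and `parse_set_reply` are hypothetical stand-ins, not redis-py code; the sketch only assumes a request/reply stream in which replies come back in the order the commands were sent.

```python
from collections import deque


class FakeConnection:
    """Hypothetical stand-in for a request/reply socket (not redis-py code)."""

    def __init__(self):
        # Replies the fake server has sent but the client has not read yet.
        self._pending = deque()

    def send(self, command):
        # The fake server answers INCR with an integer and anything else with b"OK".
        self._pending.append(1 if command == "INCR" else b"OK")

    def read(self):
        return self._pending.popleft()


def parse_set_reply(reply):
    # Same shape as the SET callback in the trace: expects str or bytes.
    text = reply if isinstance(reply, str) else reply.decode("utf-8", "replace")
    return text == "OK"


conn = FakeConnection()

# 1. INCR is sent, but its reply is never read (timeout, cancellation, Ctrl-C).
conn.send("INCR")

# 2. The connection is reused without a reset, so the next read returns
#    the leftover INCR reply instead of the reply to SET.
conn.send("SET")
stale = conn.read()

try:
    parse_set_reply(stale)
    outcome = "parsed"
except AttributeError as exc:
    outcome = f"AttributeError: {exc}"

print(outcome)
```

In this toy model the SET parser receives an integer, which is the same failure shape as the `'int' object has no attribute 'decode'` error in the first trace.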
Why It Broke
Exception handling in the Connection class was too broad (BaseException rather than Exception), which led to unintended disconnections.
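The difference between the two handler styles can be sketched with plain Python. `ToyConnection`, `read_broad`, and `read_narrow` are hypothetical names used for illustration only; they are not the redis-py implementation. The sketch relies on two facts about the language: `KeyboardInterrupt` derives from `BaseException` rather than `Exception`, and so does `asyncio.CancelledError` on Python 3.8 and later.

```python
import asyncio


class ToyConnection:
    """Hypothetical connection; only tracks whether disconnect() was called."""

    def __init__(self):
        self.disconnected = False

    def disconnect(self):
        self.disconnected = True


def read_broad(conn, read):
    # Broad handler: intercepts every exception, including
    # KeyboardInterrupt and asyncio.CancelledError.
    try:
        return read()
    except BaseException:
        conn.disconnect()
        raise


def read_narrow(conn, read):
    # Narrow handler: only ordinary errors trigger the disconnect.
    try:
        return read()
    except Exception:
        conn.disconnect()
        raise


def interrupted_read():
    raise KeyboardInterrupt


def run(reader):
    conn = ToyConnection()
    try:
        reader(conn, interrupted_read)
    except KeyboardInterrupt:
        pass
    return conn.disconnected


broad_disconnected = run(read_broad)
narrow_disconnected = run(read_narrow)
print(broad_disconnected, narrow_disconnected)

# An `except Exception` clause never sees cancellation on Python 3.8+.
print(issubclass(asyncio.CancelledError, Exception))
```

With the broad handler the interrupt also tears down the connection; with the narrow handler the interrupt propagates and the connection object is left untouched. Which behavior is right for your application is the trade-off this fix changes.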
Fix Options (Details)
Option A: Apply the official fix
Changes the exception handling in the Connection class to catch Exception instead of BaseException, so that BaseException subclasses no longer trigger unintended disconnections.
Option D: Guard side-effects with the OnceOnly guardrail
Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.
- Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
- Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
- Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
- This does NOT fix data corruption; it only prevents duplicate side-effects.
Example snippet:
import os

from onceonly import OnceOnly

once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)


def process_webhook(event_id):
    # Stable idempotency key per real side-effect.
    # Use a request id / job id / webhook delivery id / Stripe event id, etc.
    key = f"stripe:webhook:{event_id}"
    res = once.check_lock(key=key, ttl=3600)
    if res.duplicate:
        return {"status": "already_processed"}
    # Safe to execute the side-effect exactly once.
    # handle_event is your application's own handler (not defined here).
    handle_event(event_id)
    return {"status": "processed"}
Fix reference: https://github.com/redis/redis-py/pull/2104
Last verified: 2026-02-12. Validate in your environment.
When NOT to Use This Fix
- This fix should not be applied if the application relies on BaseExceptions for critical error handling.
- Do not use OnceOnly (Option D) to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Prevention
- Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
- Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
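The two points above can be combined in a small retry wrapper. This is a minimal sketch using only the standard library; `retry_with_metrics`, `retry_reasons`, and `flaky` are hypothetical names, and the in-memory counter stands in for whatever metrics system you actually use. It is not redis-py's own retry module.

```python
import time
from collections import Counter

# reason (exception class name) -> number of failed attempts
retry_reasons = Counter()


def retry_with_metrics(func, attempts=3, base_delay=0.0, retry_on=(TimeoutError,)):
    """Call func, retrying on the given exceptions and recording each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as exc:
            retry_reasons[type(exc).__name__] += 1
            if attempt == attempts:
                raise  # retry budget exhausted: surface the error
            # Exponential backoff between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"n": 0}


def flaky():
    # Simulated dependency: times out twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = retry_with_metrics(flaky)
print(result, dict(retry_reasons))
```

Keeping the attempt limit explicit and counting failures by reason gives you a number to alert on before a slow dependency turns into a retry storm.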
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.