The Fix

pip install redis==7.1.0

Based on closed redis/redis-py issue #2579. Fix PR: https://github.com/redis/redis-py/pull/2641

Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.

@@ -1385,10 +1385,16 @@ async def execute(self, raise_on_error: bool = True):
         try:
-            return await conn.retry.call_with_retry(
-                lambda: execute(conn, stack, raise_on_error),
-                lambda error: self._disconnect_raise_reset(conn, error),

Why This Fix Works in Production

  • Trigger: await cast(Awaitable[List[str]], r.blpop('key_1', timeout=10))
  • Mechanism: Cancelling a blocking operation (such as blpop) in redis.asyncio does not work correctly because of an asyncio race condition.
  • Why the fix works: It fixes the asyncio race condition that made cancellation of blocking operations misbehave and left subsequent operations stuck (first fixed release: 7.1.0).
Production impact:
  • If left unfixed, tail latency can spike under load and surface as timeouts/retries (amplifying incident impact).

Why This Breaks in Prod

  • Reported on Python 3.9.16 with redis-py 4.4.2 (Arch Linux).
  • Cancelling a blocking operation (such as blpop) in redis.asyncio does not work correctly because of an asyncio race condition.
  • Production symptom (no traceback): after a call like await cast(Awaitable[List[str]], r.blpop('key_1', timeout=10)) is cancelled, the next operation gets stuck.

Proof / Evidence

  • GitHub issue: #2579
  • Fix PR: https://github.com/redis/redis-py/pull/2641
  • First fixed release: 7.1.0
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-07
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“**Version**: redis-py 4.4.2, redis 7.0.8-1 **Platform**: Python 3.9.16 on Arch Linux **Description**: Aioredis cancellation of blocking operations, like blpop, does not work correctly. Operation seems cancelled, but any next operation, like”
Issue thread · issue description · source

Failure Signature (Search String)

  • await cast(Awaitable[List[str]], r.blpop('key_1', timeout=10))

Error Message

Signature-only (no traceback captured)

Minimal Reproduction

repro.py
import asyncio
import time
from typing import Awaitable, List, TypeVar, cast

from redis import asyncio as aioredis

_T = TypeVar('_T')


async def _task_1(redis: aioredis.Redis) -> str:
    # Blocking pop on an empty list; this is the call that gets cancelled.
    async with redis as r:
        await cast(Awaitable[List[str]], r.blpop('key_1', timeout=10))
    return 'task_1_completed'


async def _task_2() -> str:
    await asyncio.sleep(1)
    return 'task_2_completed'


async def _first_completed(*tasks: Awaitable[_T]) -> _T:
    done, pending = await asyncio.wait(
        [asyncio.ensure_future(task) for task in tasks],
        return_when=asyncio.FIRST_COMPLETED
    )
    # Cancel whatever did not finish first (here: the blpop task).
    for task in pending:
        task.cancel()
    await asyncio.wait(pending)
    print(done)
    print(pending)
    return done.pop().result()


async def _redis_rpush(redis: aioredis.Redis) -> None:
    async with redis as r:
        await cast(Awaitable[int], r.rpush('key_2', ''))


async def _main() -> None:
    redis = aioredis.from_url('redis://localhost/3', encoding='utf-8', decode_responses=True)
    print(await _first_completed(_task_1(redis), _task_2()))
    # Time the first operation issued after the cancellation.
    rpush_start_time = time.time()
    await _redis_rpush(redis)
    rpush_end_time = time.time()
    print(f"rpush seconds: {rpush_end_time - rpush_start_time}")
    await cast(Awaitable[bool], redis.flushdb())


if __name__ == '__main__':
    asyncio.run(_main())

Environment

  • Python: 3.9.16
  • redis-py: 4.4.2
  • redis: 7.0.8-1
  • Platform: Arch Linux

What Broke

Subsequent operations get stuck after cancelling a blocking operation, leading to timeouts.

Why It Broke

Cancelling a blocking operation (such as blpop) in redis.asyncio does not work correctly because of an asyncio race condition.
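To make the symptom concrete, here is a toy model in plain asyncio. It is not redis-py's internal code: FakeServer, command, and demo are hypothetical names, and the model simply assumes that a cancelled request stays outstanding on a connection that answers strictly in order. Under that assumption, the command issued after the cancellation has to wait for the abandoned one.

```python
import asyncio
import time
from typing import Tuple


class FakeServer:
    """Hypothetical server that answers commands strictly in arrival order."""

    def __init__(self) -> None:
        self.requests = asyncio.Queue()
        self.replies = asyncio.Queue()

    async def serve(self) -> None:
        while True:
            name, duration = await self.requests.get()
            await asyncio.sleep(duration)  # stands in for a blocking pop waiting out its timeout
            self.replies.put_nowait(f"reply-to-{name}")


async def command(server: FakeServer, name: str, duration: float) -> str:
    server.requests.put_nowait((name, duration))
    # Cancelling here abandons the read, but the request itself is still outstanding.
    return await server.replies.get()


async def demo() -> Tuple[str, float]:
    server = FakeServer()
    worker = asyncio.ensure_future(server.serve())

    # Start a slow "blocking" command, then cancel the caller partway through.
    blocked = asyncio.ensure_future(command(server, "BLPOP", 0.3))
    await asyncio.sleep(0.05)
    blocked.cancel()
    await asyncio.wait([blocked])

    # The next command is instant on the server side, yet the caller has to wait.
    start = time.monotonic()
    reply = await command(server, "RPUSH", 0.0)
    elapsed = time.monotonic() - start

    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return reply, elapsed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this model the second caller is held up until the abandoned command finishes, and it then reads the reply that was meant for the cancelled call. The real library may differ in detail; the sketch only illustrates why a cancelled blocking call can stall whatever runs next.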

Fix Options (Details)

Option A: Upgrade to fixed release (safe default, recommended)

pip install redis==7.1.0

When NOT to use: This fix should not be used if the application relies on the previous blocking behavior.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.

Fix reference: https://github.com/redis/redis-py/pull/2641

First fixed release: 7.1.0

Last verified: 2026-02-07. Validate in your environment.

When NOT to Use This Fix

  • This fix should not be used if the application relies on the previous blocking behavior.

Verify Fix

Run the minimal reproduction on the affected version and note the printed "rpush seconds" value. Then apply the fix, re-run, and compare.
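Before re-running the reproduction, it can help to confirm which redis-py version is actually installed in the target environment. The following is a minimal sketch, assuming the first fixed release is 7.1.0 as stated on this page; parse_version, has_fix, and installed_redis_version are hypothetical helper names, and pre-release suffixes are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# First fixed release as stated in this write-up (an assumption to validate yourself).
FIRST_FIXED: Tuple[int, ...] = (7, 1, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Return the leading dotted-integer part, padded to three fields: '7.1' -> (7, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def has_fix(installed: str, first_fixed: Tuple[int, ...] = FIRST_FIXED) -> bool:
    """True if the installed version is at or above the first fixed release."""
    return parse_version(installed) >= first_fixed


def installed_redis_version() -> Optional[str]:
    """Version of the installed 'redis' distribution, or None if it is not installed."""
    try:
        return metadata.version("redis")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_redis_version()
    if version is None:
        print("redis is not installed in this environment")
    else:
        print(f"redis {version}: fix present = {has_fix(version)}")
```

A version check only tells you the package was upgraded; the reproduction run is still what shows whether the stall is gone.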

Prevention

  • Add a stress test that runs high-concurrency workloads and fails on thread dumps / blocked locks.
  • Enable watchdog dumps in prod (faulthandler, thread dump endpoint) to capture deadlocks quickly.
  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
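The last two points (explicit timeouts, instrumented retries) can be sketched in a few lines of asyncio. This is an illustrative pattern only, not part of redis-py: call_with_timeout and the failures counter are hypothetical names, and a real service would export the counts to its metrics system instead of keeping them in a Counter.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable, Tuple, TypeVar

T = TypeVar("T")


async def call_with_timeout(
    op: Callable[[], Awaitable[T]],
    timeout: float,
    failures: Counter,
    max_attempts: int = 3,
) -> T:
    """Run op() with an explicit per-attempt timeout and a bounded number of attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(op(), timeout=timeout)
        except asyncio.TimeoutError:
            # Record the reason for every failed attempt so spikes can be alerted on.
            failures["timeout"] += 1
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")


async def demo() -> Tuple[str, int, int]:
    failures: Counter = Counter()
    calls = 0

    async def flaky() -> str:
        # Slow on the first two calls, fast on the third.
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.2 if calls < 3 else 0)
        return "ok"

    result = await call_with_timeout(flaky, timeout=0.05, failures=failures)
    return result, calls, failures["timeout"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the attempt limit small and the timeout explicit bounds how much extra load a slow dependency can generate, which is the retry-storm risk mentioned in the production note.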

Version Compatibility Table

Version   Status
7.1.0     Fixed

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.