The Fix

The fix handles CancelledError explicitly, so a connection is no longer lost from the pool when a check operation is cancelled.

Based on closed psycopg/psycopg issue #1208 · PR/commit linked

Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.

@@ -7,6 +7,16 @@
 ==============================
+Future releases
+---------------
+

Why This Fix Works in Production

  • Trigger: AsyncConnectionPool: connections can leak on CancelledError or (gRPC task cancellation)
  • Mechanism: The async pool does not handle CancelledError properly, so a connection whose check or return is interrupted by a cancellation leaks from the pool
Production impact:
  • If left unfixed, each leaked connection reduces the pool's usable capacity until the pool stops handing out connections and callers time out.

Why This Breaks in Prod

  • The async pool does not handle CancelledError properly, so connections leak from the pool whenever it is raised
  • Surfaces as: AsyncConnectionPool: connections can leak on CancelledError or (gRPC task cancellation)

Proof / Evidence

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“Thank you for the repro, amazing! I'll play with it and see which holes it drills in the pool, and will try to make it…”
@dvarrazzo · 2025-11-11 · repro detail · source
“Thank you for the report, it might be a manifestation od other issues that have been reported, such as in #1123. Are you able to…”
@dvarrazzo · 2025-11-10 · repro detail · source
“> Thank you for the repro, amazing! I'll play with it and see which holes it drills in the pool, and will try to make…”
@medikos · 2025-11-11 · repro detail · source
“> Thank you for the report, it might be a manifestation od other issues that have been reported, such as in #1123”
@medikos · 2025-11-10 · repro detail · source

Failure Signature (Search String)

  • AsyncConnectionPool: connections can leak on CancelledError or (gRPC task cancellation)

Error Message

Stack trace
error.txt
Error Message
-------------
AsyncConnectionPool: connections can leak on CancelledError or (gRPC task cancellation)

Minimal Reproduction

repro.py
import asyncio
from asyncio import CancelledError
from unittest.mock import create_autospec, AsyncMock

import psycopg
import psycopg_pool
from psycopg_pool import PoolTimeout

DSN = ""
TIMEOUT_POOL = False


async def main():
    pool: psycopg_pool.AsyncConnectionPool = psycopg_pool.AsyncConnectionPool(
        DSN,
        open=False,
        min_size=1,
        max_size=1,
    )
    await pool.open()
    try:
        for i in range(2):
            orig_conn = await pool.getconn()
            await orig_conn.execute("select 1;")

            # fake connection based on the real one
            conn = create_autospec(psycopg.AsyncConnection)
            for attr, val in vars(orig_conn).items():
                setattr(conn, attr, val)

            # simulate cancellation during rollback
            if TIMEOUT_POOL:
                conn.rollback = AsyncMock(side_effect=[CancelledError()])

            try:
                await pool.putconn(conn)
            except BaseException as ex:
                print(f"CancelledError caught on iteration {i}: {ex!r}")
    except PoolTimeout as error:
        print(f"PoolTimeout caught: {error!r}")
    else:
        print("No PoolTimeout caught")
    finally:
        # await pool.close()
        pass


if __name__ == "__main__":
    asyncio.run(main())

What Broke

Under high load, the pool stops handing out connections, leading to timeouts.

Why It Broke

The async pool does not handle CancelledError properly: when it is raised while a connection is being checked or returned, that connection leaks from the pool.
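The leak mechanism can be illustrated without a database. The sketch below is a toy model, not psycopg_pool's code: ToyPool, cancelled_reset, and demo are hypothetical names, and the "connection" is just an integer. It assumes the leak comes from a reset step that raises CancelledError before the connection is put back, which is what the reproduction simulates with a mocked rollback.

```python
import asyncio


class ToyPool:
    """Hypothetical stand-in for a connection pool (not psycopg_pool)."""

    def __init__(self, size: int) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        for conn_id in range(size):
            self._queue.put_nowait(conn_id)

    async def getconn(self, timeout: float) -> int:
        # Raises asyncio.TimeoutError when nothing is available in time.
        return await asyncio.wait_for(self._queue.get(), timeout)

    async def putconn(self, conn_id: int, reset) -> None:
        # Leaky on purpose: if reset() raises CancelledError, the next
        # line never runs and the connection never returns to the queue.
        await reset()
        self._queue.put_nowait(conn_id)

    def available(self) -> int:
        return self._queue.qsize()


async def cancelled_reset() -> None:
    # Simulates a cancellation arriving during the rollback/check step.
    raise asyncio.CancelledError()


async def demo() -> tuple:
    pool = ToyPool(size=1)
    conn = await pool.getconn(timeout=0.1)
    try:
        await pool.putconn(conn, cancelled_reset)
    except asyncio.CancelledError:
        pass  # the caller sees the cancellation, but the slot is gone
    timed_out = False
    try:
        await pool.getconn(timeout=0.1)
    except asyncio.TimeoutError:
        timed_out = True
    return pool.available(), timed_out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a pool of size one, a single cancelled return is enough to make the next checkout time out, which matches the symptom described under What Broke.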

Fix Options (Details)

Option A — Apply the official fix

The official fix handles CancelledError explicitly, so a connection is no longer lost from the pool when a check operation is cancelled.

When NOT to use: This fix should not be used if handling BaseException is undesirable.
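As a rough illustration of what "handling CancelledError explicitly" can look like, the sketch below guards the return path of a toy pool. It is not the code from the linked PR: SaferToyPool and its methods are hypothetical, and replacing the connection with a fresh one is an assumption made for the example. The relevant detail is that CancelledError derives from BaseException, so an `except Exception` clause does not catch it.

```python
import asyncio


class SaferToyPool:
    """Hypothetical pool that keeps its size when a reset is cancelled."""

    def __init__(self, size: int) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._next_id = size
        for conn_id in range(size):
            self._queue.put_nowait(conn_id)

    async def getconn(self, timeout: float) -> int:
        return await asyncio.wait_for(self._queue.get(), timeout)

    def _new_conn(self) -> int:
        # Stand-in for opening a replacement connection.
        new_id = self._next_id
        self._next_id += 1
        return new_id

    async def putconn(self, conn_id: int, reset) -> None:
        try:
            await reset()
        except BaseException:
            # The old connection's state is unknown, so this sketch swaps
            # in a fresh one to keep the pool full, then re-raises so the
            # cancellation still propagates to the caller.
            self._queue.put_nowait(self._new_conn())
            raise
        self._queue.put_nowait(conn_id)


async def cancelled_reset() -> None:
    raise asyncio.CancelledError()


async def demo() -> tuple:
    pool = SaferToyPool(size=1)
    conn = await pool.getconn(timeout=0.1)
    cancel_seen = False
    try:
        await pool.putconn(conn, cancelled_reset)
    except asyncio.CancelledError:
        cancel_seen = True
    # The pool still has one usable connection (the replacement).
    replacement = await pool.getconn(timeout=0.1)
    return cancel_seen, replacement


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Re-raising after the cleanup keeps cancellation semantics intact; swallowing the exception instead would hide the cancellation from the caller.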

Fix reference: https://github.com/psycopg/psycopg/pull/1214

Last verified: 2026-02-09. Validate in your environment.

When NOT to Use This Fix

  • This fix should not be used if handling BaseException is undesirable.

Verify Fix

verify
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
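One way to make the before/after comparison repeatable is to classify the reproduction's output automatically. The helper below is a sketch that relies only on the two strings the reproduction prints ("PoolTimeout caught: ..." on failure, "No PoolTimeout caught" on success). The function names, the repro.py path, and the 120-second limit are assumptions. It also assumes TIMEOUT_POOL has been set to True in the script so that the cancellation is actually simulated, and that DSN points at a reachable database.

```python
import subprocess
import sys


def repro_passed(output: str) -> bool:
    """True when the repro output reports that no pool timeout occurred."""
    return "No PoolTimeout caught" in output and "PoolTimeout caught:" not in output


def run_repro(path: str = "repro.py", timeout_s: float = 120.0) -> bool:
    # Runs the script with the current interpreter and classifies its stdout.
    # The limit is generous because a leaked connection only shows up once
    # the pool's own checkout timeout expires.
    proc = subprocess.run(
        [sys.executable, path],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return proc.returncode == 0 and repro_passed(proc.stdout)
```

On a broken version run_repro() is expected to return False, and True once the fix is installed.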

Prevention

  • Track RSS + object counts after deployments; alert on monotonic growth and GC pressure.
  • Add a long-running test that repeats the failing call path and asserts stable memory.
  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
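The last two points can be combined in one small wrapper: an explicit timeout per attempt, plus a log line carrying the attempt count and reason. This is a minimal sketch using only asyncio and logging; call_with_retries, make_flaky, and the logger name are hypothetical. Note that asyncio.wait_for cancels the inner call when the timeout expires, which is the kind of cancellation that triggers the leak on unfixed versions.

```python
import asyncio
import logging

logger = logging.getLogger("retry_sketch")


async def call_with_retries(op, *, timeout: float, max_attempts: int) -> tuple:
    """Run op() with an explicit timeout; return (result, attempts_used)."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = await asyncio.wait_for(op(), timeout)
            return result, attempt
        except asyncio.TimeoutError as exc:
            last_error = exc
            # Attempt count and reason, ready to feed an alert on spikes.
            logger.warning("attempt=%d reason=timeout", attempt)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def make_flaky(slow_calls: int):
    """Build an operation whose first `slow_calls` calls exceed the timeout."""
    calls = {"n": 0}

    async def op() -> str:
        calls["n"] += 1
        if calls["n"] <= slow_calls:
            await asyncio.sleep(1.0)
        return "ok"

    return op


if __name__ == "__main__":
    print(asyncio.run(call_with_retries(make_flaky(2), timeout=0.05, max_attempts=5)))
```

Capping max_attempts keeps a slow dependency from turning into a retry storm.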

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.