
The Fix

pip install redis==7.1.0

Based on closed redis/redis-py issue #1764; the fixing PR/commit is linked below.

Production note: Watch p95/p99 latency and retry volume; timeouts can turn into retry storms and duplicate side-effects.

Relevant excerpt from the fix PR, which adds a retry_on_error parameter alongside the existing retry_on_timeout:

@@ -868,6 +868,7 @@ def __init__(
        decode_responses=False,
        retry_on_timeout=False,
+       retry_on_error=[],
        ssl=False,
        ssl_keyfile=None,
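For illustration, a minimal sketch of opting into the new behavior, assuming a redis-py release that ships retry_on_error per the PR above (the host is the placeholder from the repro, and the retry policy shown is our choice, not a recommended default):

import redis
from redis.backoff import ExponentialBackoff
from redis.exceptions import ConnectionError, ReadOnlyError
from redis.retry import Retry

# Retry ReadOnlyError (and dropped connections) with exponential backoff,
# so writes survive the window where ElastiCache promotes a new primary.
r = redis.Redis(
    host="xxx.xx.0001.xxxx.cache.amazonaws.com",  # placeholder endpoint
    retry=Retry(ExponentialBackoff(), retries=3),
    retry_on_error=[ConnectionError, ReadOnlyError],
)
r.set("test_key", "value")  # retried instead of raising during failover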

Why This Fix Works in Production

  • Trigger: ReadOnlyError raised when writing to AWS ElastiCache while the cluster is being vertically scaled
  • Mechanism: During an ElastiCache scaling event the Redis client keeps using a stale connection to the demoted (now read-only) node, so writes are rejected with ReadOnlyError
  • Why the fix works: PR #1817 adds a retry_on_error option, letting the client retry commands on specific error types such as ReadOnlyError; the retry reconnects and reaches the new primary instead of failing (first fixed release: 7.1.0).
Production impact:
  • If left unfixed, every scaling event produces a burst of failed writes; where broad exception handlers swallow them, that becomes silent data inconsistency (stale cache entries, incorrect downstream decisions).

Why This Breaks in Prod

  • During an ElastiCache scaling event the Redis client keeps using a stale connection to the demoted node, so writes are rejected with ReadOnlyError
  • Production symptom (often without a traceback): ReadOnly errors on writes to AWS ElastiCache while the cluster is being vertically scaled
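On client versions without retry_on_error, a common mitigation is to drop the pooled connections when the error appears so the next command reconnects to whatever the endpoint now resolves to. A minimal sketch (the one-retry policy and safe_set helper are our assumptions, not upstream behavior):

import redis
from redis.exceptions import ReadOnlyError

r = redis.Redis(host="xxx.xx.0001.xxxx.cache.amazonaws.com")  # placeholder

def safe_set(key, value):
    try:
        return r.set(key, value)
    except ReadOnlyError:
        # The pool is still pointing at the demoted (read-only) node.
        # Disconnect so the next attempt opens a fresh connection to
        # the endpoint's current target (the new primary).
        r.connection_pool.disconnect()
        return r.set(key, value)  # one retry; raise if it still fails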

Proof / Evidence

  • GitHub issue: #1764
  • Fix PR: https://github.com/redis/redis-py/pull/1817
  • First fixed release: 7.1.0
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-09
  • Confidence: 0.85
  • Did this fix it?: Yes (upstream fix exists)

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“@Akshit8 This looks like an issue in ElastiCache as opposed to redis-py”
— @chayim · 2021-12-02
“I'll try with redis-py 4. The above version of the script in Golang works fine, could it be related DNS resolution?”
— @Akshit8 · 2021-12-03
“I don't think so. Ultimately it's (Redis) ElastiCache returning that error. That phrase doesn't appear anywhere in the codebase, so we're surfacing the alert you…”
— @chayim · 2021-12-03
“Hey @Akshit8, I will try to reproduce it this week Could you share the golang script you're using too?”
— @barshaul · 2021-12-05

Failure Signature (Search String)

  • Getting ReadOnly error when writing data to AWS elastic cache while the cluster is vertically scaled
  • Here's an example script that simulates the functionality of my application. It also replicates the error that I have been facing.
Copy-friendly signature
signature.txt
Failure Signature
-----------------
Getting ReadOnly error when writing data to AWS elastic cache while the cluster is vertically scaled
Here's an example script that simulates the functionality of my application. It also replicates the error that I have been facing.

Error Message

Signature-only (no traceback captured)
error.txt
Error Message
-------------
Getting ReadOnly error when writing data to AWS elastic cache while the cluster is vertically scaled
Here's an example script that simulates the functionality of my application. It also replicates the error that I have been facing.

Minimal Reproduction

repro.py
from datetime import datetime
import redis
import time
import sys
import random


class Config:
    default_host = "localhost"
    master_host = "xxx.xx.0001.xxxx.cache.amazonaws.com"
    replica_host = "xxx.xx.0001.xxxx.cache.amazonaws.com"
    redis_db = 8
    socket_conn_timeout = 10
    request_delay_sec = 0.1


def get_redis_client():
    return redis.Redis(
        host=Config.master_host,
        db=Config.redis_db,
        socket_connect_timeout=Config.socket_conn_timeout,
    )


def get_random_key_value():
    val = time.time()
    key = "test_key_" + str(random.randint(0, 100))
    return key, val


r = get_redis_client()
r.flushdb()

flag = False
while True:
    try:
        if flag:
            print("beat:", time.time())
        r.set(*get_random_key_value())
        time.sleep(Config.request_delay_sec)
    except redis.RedisError as re:
        print(datetime.now(), "Error:", type(re), re)
        flag = True  # sys.exit()
    except KeyboardInterrupt:
        print("Stopping loop execution")
        sys.exit()

What Broke

Application experiences intermittent ReadOnlyError during AWS ElastiCache vertical scaling.

Why It Broke

The client holds a stale connection to the old (now read-only) node while ElastiCache scales, so write commands fail with ReadOnlyError.

Fix Options (Details)

Option A — Upgrade to fixed release (safe default, recommended)

pip install redis==7.1.0

When NOT to use: Do not use this fix if your application cannot tolerate connection retries.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.

Fix reference: https://github.com/redis/redis-py/pull/1817

First fixed release: 7.1.0

Last verified: 2026-02-09. Validate in your environment.
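If you pin dependencies, the upgrade is a one-line change; a sketch assuming a pip-style requirements file, mirroring the first-fixed release named above:

# requirements.txt — pin to the first fixed release
redis==7.1.0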


Verify Fix

verify
Re-run the minimal reproduction (repro.py above) on the broken client version while the cluster is being vertically scaled; the loop should start logging ReadOnlyError. Then upgrade, pass retry_on_error (the new parameter defaults to an empty list, so upgrading alone does not change behavior), and re-run during another scaling event: writes should be retried instead of surfacing the error.
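Before re-running, it is worth confirming the interpreter actually picked up the upgraded client; a quick sanity check:

import redis

# Confirm the upgrade took effect and isn't shadowed by a stale copy on sys.path.
print("redis-py version:", redis.__version__)
print("loaded from:", redis.__file__)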


Prevention

  • Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
  • Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns, as sketched below.
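A minimal sketch of that instrumentation; emit_metric and set_with_retry are hypothetical stand-ins for your metrics client and write path:

import time

import redis
from redis.exceptions import ConnectionError, ReadOnlyError, TimeoutError


def emit_metric(name, **tags):
    # Hypothetical stand-in: wire this to StatsD/Prometheus/CloudWatch.
    print("metric:", name, tags)


def set_with_retry(client, key, value, attempts=3, delay=0.1):
    for attempt in range(1, attempts + 1):
        try:
            return client.set(key, value)
        except (ReadOnlyError, ConnectionError, TimeoutError) as exc:
            emit_metric("redis.retry", attempt=attempt, reason=type(exc).__name__)
            if attempt == attempts:
                raise  # surface the failure once retries are exhausted
            time.sleep(delay * attempt)  # linear backoff between attempts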

Version Compatibility Table

Version  Status
-------  ------
7.1.0    Fixed

Related Issues

No related fixes found.

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.