The Fix

pip install celery==5.2.2

Based on closed celery/celery issue #5928; fix PR: https://github.com/celery/celery/pull/7130

Production note: In the stack trace below, the failure occurs when a worker writes a task result to the MongoDB backend (via mark_as_failure). Verify the fix with a canary and real traffic before a full rollout.

    @@ -265,16 +265,7 @@ def __reduce__(self, args=(), kwargs=None):
         def _get_database(self):
             conn = self._get_connection()
    -        db = conn[self.database_name]
    -        if self.user and self.password:
    -            source = self.options.get(

Why This Fix Works in Production

  • Trigger: KeyError: 'collection' at the top of the trace, ending in OperationFailure: Authentication failed.
  • Mechanism: The MongoDB backend was not passing the correct authentication source to the MongoClient.
  • Why the fix works: It raises the required pymongo version to 3.11.1 and removes the deprecated database-level authentication call, which resolves the authentication failure (first fixed release: 5.2.2).
Production impact:
  • If left unfixed, a configuration that works elsewhere can fail only in production because of environment differences, causing startup failures or partial feature outages.
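
Because the mechanism above concerns the authentication source, it can help to make that source explicit in the backend URL. The following is a minimal sketch, not the upstream patch: it builds a standard MongoDB connection string with percent-encoded credentials and an authSource option. The function name, host, user, and the choice of "admin" as the authentication database are hypothetical; check which database your deployment actually authenticates against.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_backend_url(user, password, host, database, auth_source, port=27017):
    """Return a MongoDB URI with encoded credentials and an explicit authSource.

    Hypothetical helper for illustration; it is not part of celery or pymongo.
    """
    # Credentials must be percent-encoded so characters such as '@' or '/'
    # do not break URI parsing.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    # authSource is a standard MongoDB connection-string option.
    query = urlencode({"authSource": auth_source})
    return f"mongodb://{credentials}@{host}:{port}/{database}?{query}"


if __name__ == "__main__":
    # All values below are placeholders.
    print(build_mongo_backend_url(
        "worker", "p@ss/word", "db.example.internal", "remains", "admin"))
```

The resulting string could be passed as the backend argument to Celery in place of the elided 'mongodb://...' value in the reproduction.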

Why This Breaks in Prod

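  • Reported in the issue thread on celery 4.4.4 (kombu 4.6.10, pymongo 3.10.1); 5.2.2 is the first fixed release. The thread mentions 5.3 only as a possible milestone for the fix.
  • The MongoDB backend was not passing the correct authentication source to the MongoClient.
  • Surfaces as: KeyError: 'collection', followed by OperationFailure: Authentication failed.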

Proof / Evidence

  • GitHub issue: #5928
  • Fix PR: https://github.com/celery/celery/pull/7130
  • First fixed release: 5.2.2
  • Affected versions: 4.4.4 reported in the thread (5.3 appears there only as a candidate milestone for the fix)
  • Reproduced locally: No (not executed)
  • Last verified: 2026-02-09
  • Confidence: 0.75
  • Did this fix it?: Yes (upstream fix exists)
  • Own content ratio: 0.33

Discussion

High-signal excerpts from the issue thread (symptoms, repros, edge-cases).

“This issue was fixed by @naomielst, Upgrade required pymongo version to 3.11.1 #7130 When will version 5.2.x be released?”
@idanmantin · 2021-12-02 · confirmation
“Also for celery 4.4.4, kombu 4.6.10, pymongo 3.10.1”
@tonal · 2020-06-06
“@naomielst this is related, you can have a look as well to get some idea”
@auvipy · 2021-11-04
“we can ofcourse defer to 5.3 but if we can make it to 5.2 that would be great”
@auvipy · 2021-11-04

Failure Signature (Search String)

  • KeyError: 'collection'
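
To make the search string actionable, a small scanner can flag it in captured worker logs. This is a rough sketch under the assumption that your logs contain the raw exception text; the function name is hypothetical, and the second signature is taken from the stack trace in this document rather than from the search string itself.

```python
# Strings to look for: the search string above, plus the terminal error
# from the stack trace in this document.
FAILURE_SIGNATURES = (
    "KeyError: 'collection'",
    "OperationFailure: Authentication failed",
)


def find_signatures(log_text):
    """Return the known failure signatures that occur in log_text, in order."""
    return [sig for sig in FAILURE_SIGNATURES if sig in log_text]


if __name__ == "__main__":
    sample = "worker: KeyError: 'collection'\nworker: OperationFailure: Authentication failed.\n"
    print(find_signatures(sample))
```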

Error Message

Stack trace
error.txt
    KeyError: 'collection'
      File "kombu/utils/objects.py", line 42, in __get__
        return obj.__dict__[self.__name__]
    KeyError: 'database'
      File "kombu/utils/objects.py", line 42, in __get__
        return obj.__dict__[self.__name__]
    OperationFailure: Authentication failed.
      File "billiard/pool.py", line 1791, in safe_apply_callback
        fun(*args, **kwargs)
      File "celery/worker/request.py", line 526, in on_failure
        self.task.backend.mark_as_failure(
      File "celery/backends/base.py", line 159, in mark_as_failure
        self.store_result(task_id, exc, state,
      File "celery/backends/base.py", line 406, in store_result
        self._store_result(task_id, result, state, traceback,
      File "celery/backends/mongodb.py", line 194, in _store_result
        self.collection.replace_one({'_id': task_id}, meta, upsert=True)
      File "kombu/utils/objects.py", line 44, in __get__
        value = obj.__dict__[self.__name__] = self.__get(obj)
      File "celery/backends/mongodb.py", line 291, in collection
        collection = self.database[self.taskmeta_collection]
      File "kombu/utils/objects.py", line 44, in __get__
        value = obj.__dict__[self.__name__] = self.__get(obj)
      File "celery/backends/mongodb.py", line 286, in database
        return self._get_database()
      File "celery/backends/mongodb.py", line 275, in _get_database
        if not db.authenticate(self.user, self.password, source=source):
      File "pymongo/database.p ... (truncated) ...

Minimal Reproduction

repro.py
    from functools import partial

    from bson import json_util
    # The three imports below are assumptions; the original snippet omits them.
    from bson.tz_util import utc
    from celery import Celery
    from kombu.serialization import register

    json_options = json_util.JSONOptions(
        json_mode=json_util.JSONMode.RELAXED, tz_aware=True, tzinfo=utc)
    json_dump = partial(
        json_util.dumps, ensure_ascii=False, json_options=json_options)
    json_load = partial(json_util.loads, json_options=json_options)

    # Register a custom serializer based on bson.json_util.
    register(
        'mongo_json',
        encoder=(lambda obj: json_dump(obj).encode('utf-8')),
        decoder=json_load,
        content_type='application/x-mongo-json')

    app = Celery(
        'remains_tasks',
        broker='pyamqp://...',
        backend='mongodb://...',
        include=[
            # many modules
        ],
    )

    app.conf.update(
        accept_content=[
            'mongo_json', 'application/x-mongo-json', 'json', 'application/json'],
        task_serializer='mongo_json',
        result_serializer='mongo_json',
        task_default_queue='remains_tasks',
        worker_prefetch_multiplier=1,
        worker_max_tasks_per_child=45,
        worker_hijack_root_logger=False,
        mongodb_backend_settings=dict(database='remains', compressors='snappy'),
    )

What Broke

Authentication failures occurred when attempting to write results to MongoDB, causing task failures.

Why It Broke

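The MongoDB backend was not passing the correct authentication source to the MongoClient. The stack trace shows _get_database calling db.authenticate(self.user, self.password, source=source), and that call is where authentication failed.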

Fix Options (Details)

Option A: Upgrade to fixed release (safe default, recommended)

pip install celery==5.2.2

When NOT to use: Do not use this fix if you are relying on deprecated authentication methods in pymongo.

Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.

Option D: Guard side effects with OnceOnly (guardrail for side effects)

Mitigate duplicate external side-effects under retries/timeouts/agent loops by gating the operation before calling external systems.

  • Place OnceOnly between your code/agent and real side-effects (Stripe, emails, CRM, APIs).
  • Use a stable key per side-effect (e.g., customer_id + action + idempotency_key).
  • Fail-safe: configure fail-open vs fail-closed based on blast radius and spend risk.
onceonly.py
    import os

    from onceonly import OnceOnly

    once = OnceOnly(api_key=os.environ["ONCEONLY_API_KEY"], fail_open=True)


    def process_event(event_id):
        # Stable idempotency key per real side-effect.
        # Use a request id / job id / webhook delivery id / Stripe event id, etc.
        key = f"stripe:webhook:{event_id}"
        res = once.check_lock(key=key, ttl=3600)
        if res.duplicate:
            return {"status": "already_processed"}
        # Not a duplicate: run the side-effect.
        # handle_event is your own handler; it is not defined here.
        handle_event(event_id)


    process_event("evt_...")  # replace with a real event id

When NOT to use: Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Fix reference: https://github.com/celery/celery/pull/7130

First fixed release: 5.2.2

Last verified: 2026-02-09. Validate in your environment.

When NOT to Use This Fix

  • Do not use this fix if you are relying on deprecated authentication methods in pymongo.
  • Do not use this to hide logic bugs or data corruption. Use it to block duplicate external side-effects and enforce tool permissions/spend caps.

Verify Fix

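Re-run the minimal reproduction on your broken version and confirm the failure signature. Then apply the fix and re-run; the worker should store results without raising OperationFailure: Authentication failed.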
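
Before re-running, it may be worth confirming that the environment actually picked up the fixed versions. The sketch below assumes the minimums stated in this document (celery 5.2.2, pymongo 3.11.1) and uses only the standard library. The helper names are hypothetical, and the version parser is deliberately simple: it ignores pre-release suffixes, so 5.2.2rc1 would count as 5.2.2.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from this document.
MINIMUMS = {"celery": (5, 2, 2), "pymongo": (3, 11, 1)}


def parse_version(text):
    """Parse the leading numeric 'X.Y.Z' part of a version string into a tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least the minimum tuple."""
    return parse_version(installed) >= minimum


def check_environment():
    """Map each package name to True/False; missing packages count as False."""
    results = {}
    for name, minimum in MINIMUMS.items():
        try:
            results[name] = meets_minimum(version(name), minimum)
        except PackageNotFoundError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'below minimum or not installed'}")
```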

Prevention

  • Capture the exact failing error string in logs and tests so you can reproduce via a minimal script.
  • Pin production dependencies and upgrade only with a reproducible test that hits the failing path.
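
As one way to act on the pinning advice, a check like the following could run in CI and list requirement lines that are not pinned with "==". It is a simplified sketch: the function name is hypothetical, it assumes a plain requirements.txt-style file, and it does not handle every requirement syntax (for example URL requirements or environment markers).

```python
def unpinned_requirements(requirements_text):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as '-r other.txt'.
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    example = "celery==5.2.2\npymongo>=3.11.1  # too loose\nkombu\n"
    print(unpinned_requirements(example))
```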

Version Compatibility Table

Version   Status
4.4.4     Broken (reported in the issue thread, with pymongo 3.10.1)
5.2.2     Fixed

Sources

We don’t republish the full GitHub discussion text. Use the links above for context.