The Fix
Upgrade uvicorn to version 0.15.0 or later.
Based on closed Kludex/uvicorn issue #1066 · PR/commit linked
Production note: Most teams hit this during upgrades or environment changes. Roll out with a canary and smoke-test critical endpoints (health, OpenAPI/docs) before shifting 100% of traffic.
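A canary smoke check along the lines of the production note could look like the sketch below. It assumes a FastAPI app that serves "/", "/docs" and "/openapi.json" (FastAPI's default docs routes); the helper names `http_status`, `smoke` and `all_ok` are hypothetical, and a dedicated health path would need to be added for your own service.

```python
from typing import Callable, Dict, Iterable
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET to url, or 0 if the request fails outright."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as exc:
        # 4xx/5xx responses arrive as exceptions; keep their status code.
        return exc.code
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def smoke(base_url: str, paths: Iterable[str],
          fetch: Callable[[str], int] = http_status) -> Dict[str, int]:
    """Fetch each path under base_url and map path -> status code."""
    root = base_url.rstrip("/")
    return {path: fetch(root + path) for path in paths}


def all_ok(results: Dict[str, int]) -> bool:
    """True when every endpoint answered with a 2xx or 3xx status."""
    return all(200 <= status < 400 for status in results.values())
```

Against a canary this might be called as `all_ok(smoke("http://127.0.0.1:8000", ["/", "/docs", "/openapi.json"]))`, with promotion gated on the result. The `fetch` parameter exists only so the check can be exercised without a live server.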
@@ -1,8 +1,10 @@
 import logging
 import signal
+import sys
 from typing import Any
 from fastapi import FastAPI
 app = FastAPI()
 @app.on_event("startup")
 async def startup():
     raise Exception("error")
 @app.get("/")
 async def hello():
     return {"msg": "hello world"}
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
Option A: Upgrade to fixed release
Upgrade uvicorn to version 0.15.0 or later.
When NOT to use: This fix is not applicable if you require non-zero exit codes for other error handling.
Why This Fix Works in Production
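- Trigger: an exception raised in an application startup handler while running under Gunicorn with uvicorn.workers.UvicornWorker. The configuration logged in the report begins: {"loglevel": "info", "bind": "0.0.0.0:8000", "graceful_timeout": 120, "timeout": 120, "keepalive": 5, "errorlog": "-", "accesslog": "-", "host": "0.0.0.0",…
- Mechanism: the worker exits with status 0 after a startup failure, so Gunicorn sees a normal exit and keeps booting replacement workers.
- Why the fix works: uvicorn now exits the worker with status 3 when a startup event raises. Gunicorn treats that status as a boot failure and shuts down the master ("Worker failed to boot.") instead of spawning another worker. First fixed release: 0.15.0.
- If left unfixed, a startup failure that appears only in production (for example, from environment differences) leaves Gunicorn in a restart loop instead of failing fast, causing startup failures or partial feature outages.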
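The mechanism in the list above can be modeled with a toy supervisor. This is a sketch under stated assumptions, not Gunicorn's arbiter code: `run_worker` and `supervise` are hypothetical names, the constant name is chosen to mirror the "Worker failed to boot." reason in the logs, and the only rule modeled is that exit status 3 halts the supervisor while any other status leads to a replacement worker.

```python
import subprocess
import sys

# Exit status that the supervisor treats as "worker failed to boot".
# The fixed uvicorn release exits with this status on startup failure.
WORKER_BOOT_ERROR = 3


def run_worker(exit_code: int) -> int:
    """Spawn a throwaway child process that 'fails startup' with exit_code."""
    child = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    )
    return child.returncode


def supervise(worker_exit_code: int, max_boots: int = 5) -> str:
    """Boot workers until one reports a boot error or max_boots is reached."""
    for boot in range(1, max_boots + 1):
        status = run_worker(worker_exit_code)
        if status == WORKER_BOOT_ERROR:
            # Fail fast: stop the whole supervisor instead of respawning.
            return f"halted after {boot} boot(s): worker failed to boot"
        # Any other status (including 0) looks like an ordinary exit,
        # so the loop continues and a replacement worker is booted.
    return f"still rebooting after {max_boots} boots"
```

With a worker that exits 0, as in the broken versions, `supervise(0)` runs out its boot budget; with status 3 it stops after the first worker. The `max_boots` cap exists only to keep the toy from looping forever.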
Why This Breaks in Prod
- Shows up under Python 3.7 in real deployments (not just unit tests).
- Gunicorn keeps booting workers because the failed worker exits with status 0.
- Surfaces as: {"loglevel": "info", "bind": "0.0.0.0:8000", "graceful_timeout": 120, "timeout": 120, "keepalive": 5, "errorlog": "-", "accesslog": "-", "host": "0.0.0.0", "port": "8000"}
Proof / Evidence
- GitHub issue: #1066
- Fix PR: https://github.com/kludex/uvicorn/pull/1077
- First fixed release: 0.15.0
- Reproduced locally: No (not executed)
- Last verified: 2026-02-09
- Confidence: 0.75
- Did this fix it?: Yes (upstream fix exists)
- Own content ratio: 0.25
Discussion
High-signal excerpts from the issue thread (symptoms, repros, edge-cases).
“@vackosar Do you have a minimal reproducible example?”
“Oh”
“this one is tricky to solve, I dont know gunicorn enough to have a clear view on what it needs to not restart failed workers…”
“ok thanks for the layout here, very helpful would it be possible if you have time to check if this PR https://github.com/encode/uvicorn/pull/835 that exits the…”
Failure Signature (Search String)
- {"loglevel": "info", "bind": "0.0.0.0:8000", "graceful_timeout": 120, "timeout": 120, "keepalive": 5, "errorlog": "-", "accesslog": "-", "host": "0.0.0.0", "port": "8000"}
Error Message
Stack trace (Gunicorn 20.0.4; truncated)
{"loglevel": "info", "bind": "0.0.0.0:8000", "graceful_timeout": 120, "timeout": 120, "keepalive": 5, "errorlog": "-", "accesslog": "-", "host": "0.0.0.0", "port": "8000"}
[2021-06-01 21:13:05 +0800] [40946] [INFO] Starting gunicorn 20.0.4
[2021-06-01 21:13:05 +0800] [40946] [INFO] Listening at: http://0.0.0.0:8000 (40946)
[2021-06-01 21:13:05 +0800] [40946] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2021-06-01 21:13:05 +0800] [40949] [INFO] Booting worker with pid: 40949
[2021-06-01 21:13:05 +0800] [40949] [INFO] Started server process [40949]
[2021-06-01 21:13:05 +0800] [40949] [INFO] Waiting for application startup.
[2021-06-01 21:13:05 +0800] [40949] [ERROR] Traceback (most recent call last):
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 526, in lifespan
    async for item in self.lifespan_context(app):
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 467, in default_lifespan
    await self.startup()
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 502, in startup
    await handler()
  File "./app/main.py", line 16, in startup
    raise Exception("error")
Exception: error
[2021-06-01 21:13:05 +0800] [40949] [ERROR] Application startup failed. Exiting.
[2021-06-01 21:13:05 +0800] [40949] [INFO] Worker exiting (pid: 40949)
[2021-06-01 21:13:05 +0800] [40950] [INFO] Booting worker wit
... (truncated) ...
Stack trace (Gunicorn 20.1.0; master shuts down after the worker fails to boot)
[2021-06-08 23:44:11 +0800] [44375] [INFO] Starting gunicorn 20.1.0
[2021-06-08 23:44:11 +0800] [44375] [INFO] Listening at: http://0.0.0.0:8000 (44375)
[2021-06-08 23:44:11 +0800] [44375] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2021-06-08 23:44:11 +0800] [44378] [INFO] Booting worker with pid: 44378
[2021-06-08 23:44:11 +0800] [44378] [INFO] Started server process [44378]
[2021-06-08 23:44:11 +0800] [44378] [INFO] Waiting for application startup.
[2021-06-08 23:44:11 +0800] [44378] [ERROR] Traceback (most recent call last):
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 526, in lifespan
    async for item in self.lifespan_context(app):
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 467, in default_lifespan
    await self.startup()
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 502, in startup
    await handler()
  File "./app/main.py", line 16, in startup
    raise Exception("error")
Exception: error
[2021-06-08 23:44:11 +0800] [44378] [ERROR] Application startup failed. Exiting.
[2021-06-08 23:44:11 +0800] [44378] [INFO] Worker exiting (pid: 44378)
[2021-06-08 23:44:11 +0800] [44375] [INFO] Shutting down: Master
[2021-06-08 23:44:11 +0800] [44375] [INFO] Reason: Worker failed to boot.
Stack trace (Gunicorn 20.1.0; worker exit status 0, workers keep rebooting; truncated)
[2021-06-09 13:53:03 +0800] [49141] [INFO] Starting gunicorn 20.1.0
[2021-06-09 13:53:03 +0800] [49141] [INFO] Listening at: http://0.0.0.0:8000 (49141)
[2021-06-09 13:53:03 +0800] [49141] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2021-06-09 13:53:03 +0800] [49144] [INFO] Booting worker with pid: 49144
[2021-06-09 13:53:04 +0800] [49144] [INFO] Started server process [49144]
[2021-06-09 13:53:04 +0800] [49144] [INFO] Waiting for application startup.
[2021-06-09 13:53:04 +0800] [49144] [ERROR] Traceback (most recent call last):
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 526, in lifespan
    async for item in self.lifespan_context(app):
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 467, in default_lifespan
    await self.startup()
  File "./envs/test/lib/python3.7/site-packages/starlette/routing.py", line 502, in startup
    await handler()
  File "./app/main.py", line 16, in startup
    raise Exception("error")
Exception: error
[2021-06-09 13:53:04 +0800] [49144] [ERROR] Application startup failed. Exiting.
[2021-06-09 13:53:04 +0800] [49144] [INFO] Worker exiting (pid: 49144)
wpid: 49144, status: 0, reexec_pid: 0
[2021-06-09 13:53:04 +0800] [49145] [INFO] Booting worker with pid: 49145
[2021-06-09 13:53:04 +0800] [49145] [INFO] Started server process [49145]
[2021-06-09 13:53:04 +0800] [49145] [INFO] Wa
... (truncated) ...
Minimal Reproduction
from fastapi import FastAPI

app = FastAPI()


@app.on_event("startup")
async def startup():
    # Simulate a startup hook that fails before the app can serve traffic.
    raise Exception("error")


@app.get("/")
async def hello():
    return {"msg": "hello world"}
Environment
- Python: 3.7
What Broke
When a startup handler raises, the worker exits and Gunicorn immediately boots a replacement that fails the same way. The restart loop never ends on its own and leads to resource exhaustion.
Why It Broke
The failed worker exits with status 0, so Gunicorn does not recognize the startup failure and keeps booting workers.
Fix Options (Details)
Option A: Upgrade to fixed release (safe default, recommended)
Upgrade uvicorn to version 0.15.0 or later.
Use when you can deploy the upstream fix. It is usually lower-risk than long-lived workarounds.
Fix reference: https://github.com/kludex/uvicorn/pull/1077
First fixed release: 0.15.0
Last verified: 2026-02-09. Validate in your environment.
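A deploy-time guard can confirm that the installed uvicorn is at or above the first fixed release. The sketch below is a minimal version comparison; `parse_version` and `has_startup_exit_fix` are hypothetical helper names, and the parser deliberately ignores pre-release suffixes, which is a simplification.

```python
from typing import Tuple

# First fixed uvicorn release according to this page.
FIRST_FIXED = (0, 15, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.15.0' into (0, 15, 0), reading leading digits of each part.

    Parsing stops at the first part with no leading digits, and the result
    is padded with zeros to three components so '0.15' compares as 0.15.0.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_startup_exit_fix(version: str) -> bool:
    """True when version is the first fixed release or newer."""
    return parse_version(version) >= FIRST_FIXED
```

In a deployment check this could be called as `has_startup_exit_fix(uvicorn.__version__)` after `import uvicorn`, failing the rollout when it returns False.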
When NOT to Use This Fix
- This fix is not applicable if you require non-zero exit codes for other error handling.
Verify Fix
Re-run the minimal reproduction on your broken version, then apply the fix and re-run.
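To compare the two runs, the captured Gunicorn output can be classified by the log lines quoted on this page. The sketch below is a rough heuristic; `classify_gunicorn_log` and its return labels are hypothetical, and it only looks for the three message strings that appear in the logs above.

```python
def classify_gunicorn_log(log_text: str) -> str:
    """Classify captured Gunicorn output from the minimal reproduction.

    Returns one of:
      'no-startup-failure' - the startup failure message never appears
      'fixed'              - the master halted with 'Worker failed to boot.'
      'restart-loop'       - startup failed and more than one worker booted
      'inconclusive'       - startup failed, but the capture ends too early
    """
    boots = log_text.count("Booting worker with pid")
    startup_failed = "Application startup failed. Exiting." in log_text
    halted = "Worker failed to boot." in log_text

    if not startup_failed:
        return "no-startup-failure"
    if halted:
        return "fixed"
    if boots > 1:
        return "restart-loop"
    return "inconclusive"
```

On the broken version the expected label is 'restart-loop'; after upgrading, the same reproduction should yield 'fixed'.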
Prevention
- Add a TLS smoke test that performs a real handshake in CI (include CA bundle validation and hostname checks).
- Alert on handshake failures by error string and endpoint to catch cert/CA changes quickly.
- Make timeouts explicit and test them (unit + integration) to avoid silent behavior changes.
- Instrument retries (attempt count + reason) and alert on spikes to catch dependency slowdowns.
Version Compatibility Table
| Version | Status |
|---|---|
| 0.15.0 | Fixed |
Sources
We don’t republish the full GitHub discussion text. Use the links above for context.