Graceful Shutdown

Stopping servers without dropping requests

Key Takeaways

✓ SIGTERM is the standard signal for "please stop". Handle it to shut down cleanly instead of exiting mid-request
✓ server.close() stops accepting new connections but lets in-flight requests finish
✓ The shutdown order matters: stop server, wait for requests, close database, then exit
✓ Health checks should return 503 during shutdown so load balancers stop sending traffic

What is Graceful Shutdown?

Graceful shutdown is the process of stopping a server cleanly — finishing work in progress, releasing resources, and exiting without dropping any requests or corrupting data. It's triggered by a termination signal (typically SIGTERM) sent by the operating system, container orchestrator, or deployment tool.

The opposite — an ungraceful shutdown — is what happens when a process is killed with SIGKILL or crashes: in-flight requests get dropped, database transactions are left half-complete, and connections leak.

Why It Matters

Servers are restarted constantly in production:

Deployments: Every code deploy rolls out new containers and terminates old ones
Scaling: Autoscalers add and remove instances based on load
Maintenance: Infrastructure updates require node draining
Failures: Health check failures trigger automatic restarts

If your server doesn't handle SIGTERM, every restart risks dropping whatever requests are in flight at that moment. At scale, with dozens of deploys per day, those dropped requests add up to a steady stream of user-facing errors and lost writes.

How It Works

The shutdown sequence has a clear order:

1. Receive SIGTERM: register a handler with process.on('SIGTERM', ...)
2. Mark as unhealthy: return 503 from health checks so load balancers stop routing traffic. Do this before closing the listener, because a closed listener refuses new health-check connections instead of answering them
3. Stop accepting connections: call server.close() to stop the TCP listener
4. Wait for in-flight requests: server.close() takes a callback that fires once every open connection has ended
5. Close resources: shut down database pools, Redis connections, message queues
6. Exit cleanly: call process.exit(0)
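
The steps above can be sketched as a single function. This is a minimal illustration, not a definitive implementation: the names state, createApp, gracefulShutdown, and dbPool, the /health path, and port 3000 are all assumptions made for the example. The health flag is set and server.close() is called in the same tick, so any health check still being served reports 503 from the moment shutdown begins.

```javascript
const http = require('node:http');

// Shared flag read by the health endpoint (hypothetical name).
const state = { shuttingDown: false };

// Builds a server with an assumed /health route that reflects the flag.
function createApp() {
  return http.createServer((req, res) => {
    if (req.url === '/health') {
      res.writeHead(state.shuttingDown ? 503 : 200);
      res.end();
      return;
    }
    res.end('hello\n');
  });
}

// Runs the shutdown steps in order. `resources` is a list of objects with an
// async close() method (for example a database pool). `exit` is injectable so
// the function can be exercised without ending the process.
async function gracefulShutdown(server, resources, exit = process.exit) {
  // Mark as unhealthy so health checks start returning 503.
  state.shuttingDown = true;

  // Stop the listener, then wait until every open connection has ended.
  await new Promise((resolve, reject) => {
    server.close((err) => (err ? reject(err) : resolve()));
  });

  // Only now release shared resources; no request can still need them.
  for (const resource of resources) {
    await resource.close();
  }

  exit(0);
}

// Wiring, left as a comment so loading this file has no side effects:
//   const server = createApp().listen(3000);
//   process.on('SIGTERM', () => gracefulShutdown(server, [dbPool]));
```

On Node.js versions before 19, idle keep-alive connections can keep the server.close() callback from firing. Calling server.closeIdleConnections() (available since 18.2) after server.close() is one way to release them.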

The Grace Period

Container orchestrators give you a grace period between SIGTERM and SIGKILL: Kubernetes defaults to 30 seconds, and docker stop defaults to 10. If your shutdown takes longer than that, the process gets force-killed, so design it to complete well within this window.
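
One common way to stay inside the window is a deadline timer that forces an exit if cleanup overruns. The sketch below assumes a helper named startShutdownDeadline (hypothetical) and leaves the budget up to the caller. A value such as 10 seconds would be an assumption chosen to sit well inside a 30-second grace period.

```javascript
// Starts a timer that force-exits with a non-zero code if shutdown overruns
// its budget. `exit` is injectable so the helper can be tested safely.
function startShutdownDeadline(ms, exit = process.exit) {
  const timer = setTimeout(() => {
    console.error(`shutdown exceeded ${ms} ms, forcing exit`);
    exit(1);
  }, ms);

  // unref() stops this timer from keeping the event loop alive on its own,
  // so a shutdown that finishes early is not held open by the deadline.
  timer.unref();
  return timer;
}

// Possible usage inside a SIGTERM handler (budget value is an assumption):
//   process.on('SIGTERM', () => {
//     startShutdownDeadline(10_000);
//     // ...stop the server, close resources, then process.exit(0)
//   });
```

Exiting with code 1 on the deadline path makes an overrun visible in logs and exit-code metrics, while a clean shutdown still exits with 0.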

Common Mistakes

Not handling SIGTERM at all: The default Node.js behavior on SIGTERM is to exit immediately, dropping all in-flight requests.
Closing the database before the server: If you close the DB pool while requests are still in flight, those requests fail with connection errors.
Not updating health checks: If the health endpoint keeps returning 200 during shutdown, the load balancer keeps sending new requests to a dying server.
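
To make the second mistake concrete, here is a small sketch in which a slow request needs the database partway through. The pool is a fake written for this example (createFakePool, createServer, and closeServerThenPool are hypothetical names), and its query() fails once close() has been called. Closing the pool inside the server.close() callback means the in-flight request still finds it open.

```javascript
const http = require('node:http');

// Fake pool (hypothetical): queries fail once close() has been called.
function createFakePool() {
  let open = true;
  return {
    query: async () => {
      if (!open) throw new Error('pool is closed');
      return 'row';
    },
    close: async () => {
      open = false;
    },
  };
}

// Each request waits delayMs and then queries the pool, simulating work
// that is still in flight when shutdown begins.
function createServer(pool, delayMs) {
  return http.createServer((req, res) => {
    setTimeout(async () => {
      try {
        res.end(await pool.query());
      } catch (err) {
        res.writeHead(500).end(err.message);
      }
    }, delayMs);
  });
}

// Correct order: the pool is closed only after the server has drained.
function closeServerThenPool(server, pool) {
  return new Promise((resolve) => {
    server.close(async () => {
      await pool.close();
      resolve();
    });
  });
}
```

Swapping the order, so that pool.close() runs before server.close() has drained, would make the delayed query() throw and the request return a 500.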
