Graceful Shutdown

Stopping servers without dropping requests

Key Takeaways

  • SIGTERM is the standard signal for "please stop" — handle it to shut down cleanly instead of being force-killed
  • server.close() stops accepting new connections but lets in-flight requests finish
  • The shutdown order matters: stop server, wait for requests, close database, then exit
  • Health checks should return 503 during shutdown so load balancers stop sending traffic

What is Graceful Shutdown?

Graceful shutdown is the process of stopping a server cleanly — finishing work in progress, releasing resources, and exiting without dropping any requests or corrupting data. It's triggered by a termination signal (typically SIGTERM) sent by the operating system, container orchestrator, or deployment tool.

The opposite — an ungraceful shutdown — is what happens when a process is killed with SIGKILL or crashes: in-flight requests get dropped, database transactions are left half-complete, and connections leak.

Why It Matters

Servers are restarted constantly in production:

  • Deployments: Every code deploy rolls out new containers and terminates old ones
  • Scaling: Autoscalers add and remove instances based on load
  • Maintenance: Infrastructure updates require node draining
  • Failures: Health check failures trigger automatic restarts

If your server doesn't handle SIGTERM, every restart risks dropping user requests. At scale, with dozens of deploys per day, this adds up to significant data loss and user-facing errors.

How It Works

The shutdown sequence has a clear order:

  • Receive SIGTERM: register a handler with process.on('SIGTERM', ...)
  • Stop accepting connections: call server.close() to stop the TCP listener
  • Mark as unhealthy: return 503 from health checks so load balancers stop routing traffic
  • Wait for in-flight requests: server.close() takes a callback that fires when all connections drain
  • Close resources: shut down database pools, Redis connections, message queues
  • Exit cleanly: call process.exit(0)
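
The steps above can be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: the `state` flag, the `db` object, and the `shutdown` helper are hypothetical names introduced here, and `db` merely stands in for a real database pool or client with an async `close()` method.

```javascript
const http = require('node:http');

// Hypothetical flag that a health endpoint would read to decide between 200 and 503.
const state = { shuttingDown: false };

// Hypothetical stand-in for a database pool, Redis client, or queue consumer.
const db = {
  closed: false,
  async close() {
    this.closed = true;
  },
};

const server = http.createServer((req, res) => {
  res.end('hello');
});

// Runs the shutdown steps in order and resolves with the exit code to use.
function shutdown(srv, resources) {
  return new Promise((resolve) => {
    // Stop the listener; the callback fires once open connections have closed.
    srv.close(async (err) => {
      let code = err ? 1 : 0;
      // Close resources only after in-flight requests have finished.
      for (const resource of resources) {
        try {
          await resource.close();
        } catch (closeErr) {
          console.error('resource close failed:', closeErr);
          code = 1;
        }
      }
      resolve(code);
    });
    // Mark the instance unhealthy so load balancers stop routing to it.
    state.shuttingDown = true;
  });
}

// Receive SIGTERM, run the sequence, then exit with the resulting code.
process.once('SIGTERM', () => {
  shutdown(server, [db]).then((code) => process.exit(code));
});

// server.listen(3000); // uncomment to run as a real server (port is an assumption)
```

One caveat worth checking for your runtime: on Node.js versions before 19, idle keep-alive connections can delay the close callback, and server.closeIdleConnections() (available since Node.js 18.2) is one way to release them.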

The Grace Period

Container orchestrators give you a grace period between SIGTERM and SIGKILL (30 seconds by default in Kubernetes, 10 seconds by default for docker stop). If your shutdown takes longer than that, the process gets force-killed. Design your shutdown to complete well within this window.
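
One way to stay inside the grace period is to put your own deadline on the shutdown work and exit with a failure code if it is missed. The sketch below assumes a hypothetical `withDeadline` helper and a placeholder `closeEverything` function; the 25-second value is only an example chosen to sit under a 30-second grace period.

```javascript
// Runs shutdownFn, but stops waiting after timeoutMs.
// Resolves with the exit code to use and whether the deadline was hit.
function withDeadline(shutdownFn, timeoutMs) {
  return new Promise((resolve) => {
    const timer = setTimeout(
      () => resolve({ code: 1, timedOut: true }),
      timeoutMs,
    );
    // Do not let this timer alone keep the process alive.
    timer.unref();

    Promise.resolve()
      .then(shutdownFn)
      .then(
        () => {
          clearTimeout(timer);
          resolve({ code: 0, timedOut: false });
        },
        () => {
          clearTimeout(timer);
          resolve({ code: 1, timedOut: false });
        },
      );
  });
}

// Hypothetical placeholder for the real work: close the server, then the database.
async function closeEverything() {}

process.once('SIGTERM', async () => {
  const { code, timedOut } = await withDeadline(closeEverything, 25_000);
  if (timedOut) console.error('shutdown deadline exceeded, exiting anyway');
  process.exit(code);
});
```

Exiting on your own deadline does not save the requests that were still running, but it gives you a log line and a non-zero exit code instead of a silent SIGKILL.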

Common Mistakes

  • Not handling SIGTERM at all: The default Node.js behavior on SIGTERM is to exit immediately, dropping all in-flight requests.
  • Closing the database before the server: If you close the DB pool while requests are still in flight, those requests fail with connection errors.
  • Not updating health checks: If the health endpoint keeps returning 200 during shutdown, the load balancer keeps sending new requests to a dying server.
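
To illustrate the health-check point, here is a small sketch of an endpoint that flips from 200 to 503 once shutdown begins. The `/health` path, the `state` flag, and the `healthHandler` name are assumptions made for this example; your load balancer or orchestrator may expect a different path or response body.

```javascript
const http = require('node:http');

// Hypothetical shared flag; the SIGTERM handler would set it to true
// as part of the shutdown sequence.
const state = { shuttingDown: false };

// Health endpoint: 200 while serving normally, 503 once shutdown has begun.
function healthHandler(req, res) {
  if (state.shuttingDown) {
    res.writeHead(503, { 'Content-Type': 'text/plain' });
    res.end('shutting down');
  } else {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok');
  }
}

const server = http.createServer((req, res) => {
  if (req.url === '/health') {
    healthHandler(req, res);
    return;
  }
  res.end('hello');
});

// server.listen(3000); // uncomment to run as a real server (port is an assumption)
```

Keeping the flag separate from the handler means the shutdown code only has to set `state.shuttingDown = true`, and every later health probe reports the instance as unavailable.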
