Rate Limiting

Protecting services from excessive traffic

Key Takeaways

✓Rate limiting protects your service from abuse and ensures fair resource allocation across clients
✓Use atomic Redis operations (INCR, or a Lua script that runs INCR and EXPIRE together) to count requests without race conditions
✓Return the conventional rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) so clients can self-throttle
✓The 429 Too Many Requests status code with a Retry-After header tells clients exactly when to try again

What is Rate Limiting?

Rate limiting controls how many requests a client can make to your API within a time window. When a client exceeds the limit, the server rejects subsequent requests with a 429 status code until the window resets.

It's one of the most fundamental protection mechanisms in production APIs — without it, a single misbehaving client (or attacker) can overwhelm your service and degrade it for everyone.

Why It Matters

Without rate limiting, your API is vulnerable to:

Denial of service: Intentional or accidental traffic spikes that exhaust server resources
Unfair usage: One client consuming a disproportionate share of capacity
Cost overruns: Uncontrolled API calls driving up infrastructure costs
Cascading failures: Overloaded services failing and taking down dependent systems

Major public APIs, including GitHub, Stripe, and Twitter, enforce rate limits. Clients should treat a limit as expected behavior, not an inconvenience.

How It Works

One of the simplest and most common approaches is the fixed window counter:

Define a window (e.g., 60 seconds) and a limit (e.g., 10 requests)
For each request, increment a counter keyed by the client identifier
If the counter exceeds the limit, reject with 429
When the window expires, the counter resets
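
The four steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production implementation: the class name FixedWindowLimiter, its parameters, and the injectable clock are all assumptions made for the example. It keeps state in a single process and is not thread-safe.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """In-memory fixed window counter (single process, not thread-safe)."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the example can be tested without sleeping
        # Maps client_id -> (window index, requests counted in that window).
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, client_id: str) -> bool:
        """Count one request and report whether it is within the limit."""
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self._counters.get(client_id, (window, 0))
        if stored_window != window:
            count = 0  # the previous window expired, so the counter resets
        count += 1
        self._counters[client_id] = (window, count)
        return count <= self.limit


limiter = FixedWindowLimiter(limit=10, window_seconds=60)
if not limiter.allow("client-123"):
    print("reject with 429")
```

Deriving the window index from the clock means no background job is needed to reset counters: a request that arrives in a new window simply starts from zero.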

The Concurrency Problem

A naive implementation with separate "read count, then increment" steps has a race condition. Suppose the counter stands at 9 and ten concurrent requests all read it before any of them writes. Every one of them sees "count = 9" and passes the check, so 19 requests get through a limit of 10.
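
The arithmetic can be replayed deterministically. The sketch below does not use real threads; it walks through the worst-case interleaving by hand, where every request reads the counter before any request writes. The function name simulate_stale_reads is hypothetical.

```python
from typing import Tuple

LIMIT = 10


def simulate_stale_reads(start_count: int, concurrent_requests: int) -> Tuple[int, int]:
    """Replay the worst-case interleaving: every request reads before any writes."""
    # Step 1: all requests read the same stale counter value.
    reads = [start_count] * concurrent_requests
    # Step 2: each request checks the limit against its own stale read.
    admitted = sum(1 for seen in reads if seen < LIMIT)
    # Step 3: each admitted request then performs its increment.
    final_count = start_count + admitted
    return admitted, final_count


admitted, final_count = simulate_stale_reads(start_count=9, concurrent_requests=10)
print(admitted, final_count)  # prints: 10 19
```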

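The fix: use atomic operations. In Redis, INCR atomically increments and returns the new value in a single operation, so no two requests ever see the same count. Pair it with EXPIRE to reset the window automatically. Because INCR and EXPIRE are separate commands, a client that fails between them can leave a counter with no expiry; a Lua script that runs both makes the pair atomic.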
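
One way to combine INCR and EXPIRE atomically is a short Lua script, sketched here in Python. It assumes redis_client is a redis-py client (for example redis.Redis()), whose eval(script, numkeys, *keys_and_args) method runs a Lua script on the server. The function name allow_request, the "ratelimit:" key prefix, and the default limit and window are illustrative choices.

```python
# Redis runs a Lua script as one atomic unit: no other command can interleave.
FIXED_WINDOW_SCRIPT = """
local count = redis.call('INCR', KEYS[1])
if count == 1 then
    -- First request in this window: start the expiry timer.
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
"""


def allow_request(redis_client, client_id: str, limit: int = 10,
                  window_seconds: int = 60) -> bool:
    """Return True if this request fits within the client's current window."""
    key = f"ratelimit:{client_id}"  # hypothetical key naming scheme
    # One key (numkeys=1), followed by the window length as ARGV[1].
    count = redis_client.eval(FIXED_WINDOW_SCRIPT, 1, key, window_seconds)
    return int(count) <= limit


# Usage, assuming a reachable Redis server and the redis-py package:
#   import redis
#   client = redis.Redis()
#   if not allow_request(client, "client-123"):
#       ...  # reject with 429
```

In this variant the window starts at a client's first request rather than on a wall-clock boundary, because the expiry timer is set when the counter is created.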

Response Headers

These widely used headers (a convention rather than a formal standard) help clients self-regulate:

X-RateLimit-Limit: The maximum number of requests allowed per window
X-RateLimit-Remaining: How many requests the client has left
X-RateLimit-Reset: Unix timestamp when the window resets
Retry-After: Seconds until the client should retry (on 429 responses)
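
As a rough sketch, the headers can be derived from the limit, the current count, and the time the window resets. The helper below is hypothetical and framework-agnostic: it returns a status code and a plain dictionary, and it assumes the caller already knows the counter value and the reset time. The 200 status stands in for whatever the real handler would return.

```python
import math
from typing import Dict, Tuple


def rate_limit_response(limit: int, count: int, reset_at: float,
                        now: float) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, headers) for a request that raised the counter to `count`."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - count)),
        "X-RateLimit-Reset": str(int(reset_at)),  # Unix timestamp
    }
    if count > limit:
        # Round up so the client never retries before the window has reset.
        headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
        return 429, headers
    return 200, headers


status, headers = rate_limit_response(limit=10, count=11, reset_at=1060.0, now=1017.5)
print(status, headers["Retry-After"])  # prints: 429 43
```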

Common Mistakes

Non-atomic counting: Reading and incrementing in separate operations allows concurrent requests to bypass the limit.
Missing headers: Without rate limit headers, clients can't adjust their behavior proactively.
Per-server counting: If you store counts in memory, each server has its own counter. Use a shared store like Redis.
