Cache Stampede
Also called thundering herd. When a popular cached item expires, hundreds or thousands of simultaneous requests hit the origin at once because every request sees a miss. Can overwhelm the origin server and cascade into a full outage.
Full Explanation
A cache stampede happens at the worst possible moment. Your most popular asset expires, and suddenly every request that would have been a cache hit becomes a cache miss. If your site gets 10,000 requests per second for that asset, your origin just got hit with 10,000 simultaneous requests. Most origins can't handle that.
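The failure mode is easy to reproduce. The sketch below (a simulation, not real CDN code) uses a naive cache with no miss coordination: 100 concurrent requests arrive just after an item expired, every one of them sees a miss, and every one of them hits the origin.

```python
import threading
import time

cache = {}                 # naive cache with no miss coordination
origin_hits = 0

def fetch_origin(key):
    global origin_hits
    origin_hits += 1       # every caller that saw a miss reaches the origin
    time.sleep(0.1)        # simulate a slow origin response
    return f"content-for-{key}"

def handle_request(key):
    if key not in cache:                 # all concurrent requests see the miss...
        cache[key] = fetch_origin(key)   # ...and all of them hit the origin
    return cache[key]

# 100 requests arrive just after "/popular.js" expired from cache
threads = [threading.Thread(target=handle_request, args=("/popular.js",))
           for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# origin_hits is typically close to 100: one origin fetch per request
```

At 10,000 requests per second, the same race produces 10,000 origin fetches instead of 100.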
The fix is request coalescing (also called request collapsing). When the first request triggers a cache miss, the CDN holds all subsequent requests for the same resource and serves them all from the single origin response. Varnish, Nginx, and most commercial CDNs support this.
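To illustrate the mechanism (not any particular CDN's implementation), here is a minimal coalescing sketch in Python: the first request to miss becomes the "leader" and fetches from the origin; concurrent requests for the same key block on an event and are served from the leader's single response.

```python
import threading
import time

cache = {}
inflight = {}              # key -> Event set when the leader's fetch completes
meta_lock = threading.Lock()
origin_hits = 0

def fetch_origin(key):
    global origin_hits
    origin_hits += 1
    time.sleep(0.1)        # simulate a slow origin
    return f"content-for-{key}"

def handle_request(key):
    leader = False
    with meta_lock:
        if key in cache:
            return cache[key]
        if key not in inflight:       # first miss: become the leader
            inflight[key] = threading.Event()
            leader = True
        event = inflight[key]
    if leader:
        value = fetch_origin(key)     # only the leader contacts the origin
        with meta_lock:
            cache[key] = value
            del inflight[key]
        event.set()                   # release the waiting followers
    else:
        event.wait()                  # followers block until the leader is done
    return cache[key]

threads = [threading.Thread(target=handle_request, args=("/popular.js",))
           for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# origin_hits == 1: a single origin fetch served all 100 requests
```

The same 100 concurrent requests now produce exactly one origin fetch.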
Another defense is stale-while-revalidate: serve the expired content while fetching a fresh copy in the background. Users get a fast (slightly stale) response, and the origin handles only a single revalidation request. Combining both techniques makes a stampede far less likely.
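The same idea can be sketched in Python (an illustration of the pattern, with made-up TTL and grace values): an expired entry inside its grace window is served immediately, and at most one background thread revalidates it.

```python
import threading
import time

TTL, GRACE = 60, 3600      # seconds of freshness, then seconds of stale grace
cache = {}                 # key -> (value, expires_at)
refreshing = set()         # keys with a background revalidation in flight
lock = threading.Lock()

def fetch_origin(key):
    return f"fresh-{key}"

def background_refresh(key):
    value = fetch_origin(key)
    with lock:
        cache[key] = (value, time.monotonic() + TTL)
        refreshing.discard(key)

def handle_request(key):
    with lock:
        entry = cache.get(key)
        if entry:
            value, expires = entry
            past_expiry = time.monotonic() - expires
            if past_expiry <= 0:
                return value              # fresh hit
            if past_expiry <= GRACE:
                if key not in refreshing:  # at most one revalidation per key
                    refreshing.add(key)
                    threading.Thread(target=background_refresh,
                                     args=(key,)).start()
                return value              # serve the stale copy immediately
    value = fetch_origin(key)             # true miss (or past grace)
    with lock:
        cache[key] = (value, time.monotonic() + TTL)
    return value

# Seed an entry that expired 1 second ago, then request it
cache["/popular.js"] = ("stale-copy", time.monotonic() - 1)
served = handle_request("/popular.js")   # served instantly from the stale copy
```

The caller gets the stale copy with no origin latency, while the fresh copy lands in the cache moments later.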
Try the interactive Cache Invalidation & Purging animation in the course to see what happens when a popular cached object expires simultaneously across multiple edge nodes.
Examples
Varnish request coalescing (built-in, but tunable):
# varnish default.vcl - coalescing is automatic,
# but you can control the grace period for stale serving
sub vcl_backend_response {
    # Keep a stale copy for 1 hour after the TTL expires
    set beresp.grace = 1h;
}

sub vcl_hit {
    # Serve stale while revalidating in the background
    if (obj.ttl <= 0s && obj.grace > 0s) {
        return (deliver);
    }
}
Nginx proxy cache lock (request coalescing):
# Inside the http, server, or location block where proxy_cache is enabled
proxy_cache_lock on;              # only one request populates a given cache entry
proxy_cache_lock_timeout 5s;
proxy_cache_lock_age 5s;

# Combined with stale-while-revalidate
proxy_cache_use_stale updating;   # serve stale while an update is in progress
proxy_cache_background_update on;
Related CDN concepts include:
- Origin Shield — A mid-tier cache layer that sits between edge servers and the origin. Aggregates cache misses …
- Request Coalescing — A CDN feature that batches multiple simultaneous requests for the same uncached content into a …
- stale-while-revalidate — Allows caches to serve stale content immediately while fetching a fresh copy in the background. …