Request Coalescing
A CDN feature that batches multiple simultaneous requests for the same uncached content into a single origin request. Also called request collapsing. Prevents origin overload when popular content expires or during cache warming.
Full Explanation
Imagine a popular URL expires from cache and 1,000 users request it at the same time. Without request coalescing, the edge server sends all 1,000 requests to your origin simultaneously. Your origin gets slammed, response times spike, and you might even trigger rate limiting or a crash. This is the thundering herd problem, also called a cache stampede.
Request coalescing solves this by having the edge server send only ONE request to origin and hold the other 999 in a queue. When the origin responds, the edge caches the content and serves all 999 waiting requests from that single cached response. One origin hit instead of a thousand. Your origin barely notices anything happened.
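The mechanism is easy to sketch in code. Below is a minimal single-flight cache in Python (names like fetch_origin and CoalescingCache are illustrative, not any CDN's actual API): the first request for a missing key becomes the leader and fetches from origin, while concurrent requests for the same key block on an event and reuse the leader's result.

```python
import threading
import time

origin_hits = 0  # counts requests that actually reach the origin

def fetch_origin(url):
    """Stand-in for a slow origin fetch (illustrative only)."""
    global origin_hits
    origin_hits += 1
    time.sleep(0.05)  # simulate origin latency
    return f"content of {url}"

class CoalescingCache:
    """Minimal single-flight cache: concurrent misses on the same key
    trigger exactly one origin fetch; the rest wait and reuse it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}     # key -> cached value
        self._inflight = {}  # key -> Event set when the fetch completes

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]       # cache hit
            if key in self._inflight:
                event, leader = self._inflight[key], False  # join the queue
            else:
                event, leader = threading.Event(), True     # become leader
                self._inflight[key] = event
        if leader:
            value = fetch_origin(key)         # the single upstream fetch
            with self._lock:
                self._cache[key] = value
                del self._inflight[key]
            event.set()                       # release all waiters
            return value
        event.wait()                          # queued request: just wait
        with self._lock:
            return self._cache[key]

cache = CoalescingCache()
threads = [threading.Thread(target=cache.get, args=("/popular-page",))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(origin_hits)  # prints 1: one origin hit for 50 concurrent requests
```

Real CDN implementations add per-key lock timeouts and error handling, but the core idea is exactly this check-and-wait under a single lock.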
Different CDN platforms use different names for this feature. Varnish calls it "request coalescing." Nginx implements it with the proxy_cache_lock directive. Fastly and Cloudflare enable it by default. Apache Traffic Server calls it "read-while-writer." The concept is identical across all of them: serialize concurrent requests for the same uncached resource into a single upstream fetch.
# Nginx request coalescing configuration
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cdn:10m;

server {
    location / {
        proxy_cache cdn;
        proxy_cache_lock on;              # Enable request coalescing
        proxy_cache_lock_timeout 5s;      # Max time a queued request waits before sending its own
        proxy_cache_lock_age 5s;          # Max age of the lock before another request may pass
        proxy_cache_use_stale updating;   # Serve stale while revalidating
        proxy_pass http://origin;
    }
}
One thing to watch out for: the lock timeout. If the origin is slow and the first request takes longer than the timeout, the queued requests give up waiting and each send their own request to origin (and in Nginx those responses are not cached), defeating the purpose. Set proxy_cache_lock_timeout to comfortably exceed your origin's expected response time. Combining coalescing with stale-while-revalidate gives the best protection: everyone is served the old cached copy while a single request fetches the fresh one in the background.
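That combination can also be sketched in Python (again with an illustrative fetch_origin stand-in, not a real CDN API): expired entries are served immediately, while at most one background thread per key refreshes the cache.

```python
import threading
import time

origin_hits = 0  # counts requests that actually reach the origin

def fetch_origin(url):
    """Stand-in for a slow origin fetch (illustrative only)."""
    global origin_hits
    origin_hits += 1
    time.sleep(0.05)  # simulate origin latency
    return f"copy {origin_hits} of {url}"

class StaleWhileRevalidateCache:
    """Expired entries are served immediately (stale) while a single
    coalesced background refresh fetches the new copy."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._lock = threading.Lock()
        self._cache = {}          # key -> (value, fetched_at)
        self._refreshing = set()  # keys with a refresh in flight

    def _refresh(self, key):
        value = fetch_origin(key)              # the one background fetch
        with self._lock:
            self._cache[key] = (value, time.monotonic())
            self._refreshing.discard(key)

    def get(self, key):
        with self._lock:
            entry = self._cache.get(key)
            if entry is not None:
                value, fetched_at = entry
                expired = time.monotonic() - fetched_at >= self.ttl
                if expired and key not in self._refreshing:
                    self._refreshing.add(key)  # coalesce: one refresher only
                    threading.Thread(target=self._refresh, args=(key,)).start()
                return value                   # fresh or stale, serve it now
        # Cold miss: nothing cached yet, so fetch synchronously.
        value = fetch_origin(key)
        with self._lock:
            self._cache[key] = (value, time.monotonic())
        return value

cache = StaleWhileRevalidateCache(ttl=0.01)
cache.get("/page")           # cold miss: synchronous origin fetch
time.sleep(0.02)             # let the entry expire
stale = cache.get("/page")   # returns instantly; refresh runs in background
print(stale)                 # prints "copy 1 of /page" (the stale copy)
```

The key property: after the cold miss, no client ever waits on the origin again; they all get an instant (possibly stale) answer, and origin load stays at one request per refresh cycle.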
Examples
Varnish VCL configuration for request coalescing:
sub vcl_backend_fetch {
    # Varnish enables coalescing by default: when multiple clients
    # request the same uncached URL, only the first request goes to
    # the backend; the others wait for its response.
    # Waiting behavior is controlled with backend timeouts:
    set bereq.first_byte_timeout = 15s;
    set bereq.between_bytes_timeout = 10s;
}
# Test coalescing with concurrent requests
# Send 50 simultaneous requests for the same URL
seq 50 | xargs -P50 -I{} curl -s -o /dev/null -w "%{http_code}\n" \
https://cdn.example.com/popular-page
# With coalescing: origin sees 1 request
# Without coalescing: origin sees 50 requests