Web Cache Poisoning: Detection, Analysis & Prevention Guide
1. Introduction to Web Cache Poisoning
Web cache poisoning is an attack where an adversary manipulates a web cache into serving malicious content to unsuspecting users. The attacker exploits the gap between what a cache uses as a key (typically the URL and a subset of headers) and what the backend uses to build the response. If the backend incorporates unkeyed inputs — headers, cookies, or query parameters that the cache ignores — the attacker can inject a payload that gets stored in the cache and served to every subsequent visitor.
Why Cache Poisoning Is Particularly Dangerous
Unlike most web attacks that target a single user, cache poisoning is a one-to-many amplification attack. A single poisoned response can be served to thousands or millions of users until the cache entry expires. The attacker doesn't need to maintain a persistent connection — the cache does the work of distributing the payload. This makes it ideal for mass XSS, credential harvesting, and defacement. High-profile targets include CDN-backed sites (Cloudflare, Akamai, Fastly), API gateways, and any application with reverse proxy caching.
In this guide, you'll learn how caches decide what to store and when, how attackers discover unkeyed inputs, the difference between cache poisoning and cache deception, how to review code for vulnerable patterns, and how to defend your application at every layer — from backend code to CDN configuration.
Web Cache Poisoning Attack Flow
What makes web cache poisoning different from most other web attacks?
2. Real-World Scenario
The Scenario: You're reviewing a web application served behind Cloudflare. The app uses a reverse proxy (Nginx) and includes dynamic elements in its HTML based on request headers. The marketing team recently added A/B testing that reads the X-Forwarded-Host header to construct asset URLs.
❌ Vulnerable: Unkeyed Header Reflected in Cached Response
// Express middleware that sets asset URLs based on X-Forwarded-Host
app.use((req, res, next) => {
  // ❌ Trusts X-Forwarded-Host without validation
  // This header is NOT part of the cache key
  const host = req.headers['x-forwarded-host'] || req.headers.host;
  res.locals.assetHost = `https://${host}/assets`;
  next();
});

app.get('/', (req, res) => {
  // ❌ Unkeyed header value is embedded directly in the HTML
  res.send(`
    <!DOCTYPE html>
    <html>
      <head>
        <link rel="stylesheet" href="${res.locals.assetHost}/main.css">
        <script src="${res.locals.assetHost}/app.js"></script>
      </head>
      <body>
        <h1>Welcome to our site</h1>
      </body>
    </html>
  `);
});

// Nginx cache config (upstream):
// proxy_cache_key "$scheme$request_method$host$request_uri";
// ❌ Cache key does NOT include X-Forwarded-Host
// The poisoned response will be served to ALL users

The Attack: The attacker sends a request with X-Forwarded-Host: evil.com. The backend generates HTML with <script src="https://evil.com/assets/app.js">. The cache stores this response keyed on the URL alone. Now every visitor to the homepage loads JavaScript from the attacker's server.
Impact
This is a stored XSS via cache poisoning. The attacker controls the JavaScript loaded by every visitor. They can: steal session cookies and authentication tokens, redirect users to phishing pages, inject cryptocurrency miners, deface the site, exfiltrate form data (including payment information). The payload persists until the cache entry expires or is purged — potentially hours or days.
In the scenario above, why does the X-Forwarded-Host header enable cache poisoning?
3. Cache Poisoning Mechanics
To understand cache poisoning, you need to understand how caches work. A cache stores responses indexed by a cache key. When a new request arrives, the cache computes the key and checks for a match. If found (cache hit), the stored response is returned without contacting the origin. If not (cache miss), the request is forwarded to the origin, and the response may be stored for future requests.
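The hit/miss flow described above can be sketched as a toy in-memory cache. This is a minimal illustration, not how any particular CDN is implemented: `cacheKey`, `origin`, `createCache`, and the host `evil.example` are all hypothetical names chosen for the example, and the origin is assumed to reflect X-Forwarded-Host the way the scenario in Section 2 does.

```javascript
// Toy model of a shared cache in front of an origin (illustrative only).
// The key deliberately ignores every header except Host.
function cacheKey(req) {
  return `${req.method} ${req.scheme}://${req.headers.host}${req.path}`;
}

// Hypothetical origin that reflects an unkeyed header into the response body.
function origin(req) {
  const host = req.headers['x-forwarded-host'] || req.headers.host;
  return `<script src="https://${host}/app.js"></script>`;
}

function createCache(originFn) {
  const store = new Map();
  return function handle(req) {
    const key = cacheKey(req);
    if (store.has(key)) {
      return { status: 'HIT', body: store.get(key) };
    }
    const body = originFn(req); // cache miss: forward to the origin
    store.set(key, body);       // store for later requests with the same key
    return { status: 'MISS', body };
  };
}

const handle = createCache(origin);
const base = {
  method: 'GET',
  scheme: 'https',
  path: '/',
  headers: { host: 'example.com' },
};

// Attacker request: same cache key as a normal request, different response.
const attacker = handle({
  ...base,
  headers: { ...base.headers, 'x-forwarded-host': 'evil.example' },
});

// Victim request: no special headers, but it hits the poisoned entry.
const victim = handle(base);

console.log(attacker.status, victim.status); // MISS HIT
console.log(victim.body); // <script src="https://evil.example/app.js"></script>
```

The victim never sends the malicious header; the stored entry is returned purely because the two requests produce the same key.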
Cache Key Components
| Component | Typically Keyed? | Risk if Unkeyed |
|---|---|---|
| URL path | Yes | Low — always part of the key |
| Query string | Usually yes | Medium — some CDNs strip or ignore query parameters |
| Host header | Yes | Low — standard part of the key |
| X-Forwarded-Host | No | Critical — often reflected in responses for asset URLs, redirects, canonical tags |
| X-Forwarded-Scheme / X-Forwarded-Proto | No | High — can force HTTP→HTTPS redirect loops or inject mixed content |
| X-Original-URL / X-Rewrite-URL | No | Critical — can override the URL parsed by the backend while the cache keys on the original URL |
| Accept-Language | Varies | Medium — if response varies by language but cache ignores this header |
| Cookie | Usually no | High — if response varies by auth state but cache ignores cookies |
| User-Agent | Sometimes | Medium — mobile vs desktop response variations |
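The query-string row is easy to underestimate. The sketch below assumes a hypothetical CDN rule that drops `utm_*` and `cb` parameters before computing the key (real providers configure this differently), and shows that a clean URL and a payload-carrying URL then collapse to the same cache entry. `EXCLUDED` and `cacheKeyFor` are illustrative names.

```javascript
// Assumed CDN rule: drop utm_* and "cb" parameters before computing the key.
const EXCLUDED = [/^utm_/, /^cb$/];

function cacheKeyFor(rawUrl) {
  const url = new URL(rawUrl);
  const kept = [...url.searchParams.entries()]
    .filter(([name]) => !EXCLUDED.some((pattern) => pattern.test(name)))
    .sort(([a], [b]) => a.localeCompare(b));
  const query = new URLSearchParams(kept).toString();
  return `${url.protocol}//${url.host}${url.pathname}${query ? `?${query}` : ''}`;
}

const clean = cacheKeyFor('https://example.com/search?q=shoes');
const poisoned = cacheKeyFor('https://example.com/search?q=shoes&cb=%22;alert(1)//');

console.log(clean);              // https://example.com/search?q=shoes
console.log(clean === poisoned); // true
```

If the backend reflects `cb` anywhere in the response, the excluded parameter becomes an unkeyed input exactly like an unkeyed header.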
Cache Key vs. Response Influence — The Fundamental Mismatch
# What the cache sees (the key):
Cache Key = GET + https + example.com + /page

# What the backend sees (the full request):
GET /page HTTP/1.1
Host: example.com
X-Forwarded-Host: evil.com      ← NOT in cache key, BUT used in response
X-Forwarded-Proto: http         ← NOT in cache key, BUT triggers redirect
Cookie: session=abc123          ← NOT in cache key, BUT changes response content
Accept-Language: en-US          ← NOT in cache key, BUT selects response language

# The gap between these two views IS the attack surface

❌ Vulnerable: Multiple Unkeyed Inputs in a Django Application
# settings.py
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')
USE_X_FORWARDED_HOST = True  # ❌ Trusts X-Forwarded-Host from any source
ALLOWED_HOSTS = ['*']  # ❌ Disables host validation; with a strict list, get_host() rejects evil.com

# views.py
from django.shortcuts import render

def homepage(request):
    # ❌ X-Forwarded-Host controls the base URL for all assets
    # Django's request.build_absolute_uri() uses it internally
    context = {
        'canonical_url': request.build_absolute_uri(),
        'og_url': request.build_absolute_uri(),
        'base_url': f"https://{request.get_host()}",
    }
    return render(request, 'home.html', context)

# home.html template
# <link rel="canonical" href="{{ canonical_url }}">
# <meta property="og:url" content="{{ og_url }}">
# <script src="{{ base_url }}/static/app.js"></script>
#
# If X-Forwarded-Host: evil.com is sent:
# canonical_url = https://evil.com/
# og_url = https://evil.com/
# base_url = https://evil.com
# script src = https://evil.com/static/app.js ← XSS via cache poisoning

A CDN caches responses using the key: scheme + host + path + query string. The backend reads the Accept-Language header to serve localized content. A German-speaking attacker poisons the cache. What happens?
4. Unkeyed Input Exploitation
The core technique in cache poisoning is finding unkeyed inputs — request components that influence the response but are not part of the cache key. Attackers use tools like Param Miner (a Burp Suite extension) to systematically discover these inputs by sending requests with unique canary values in various headers and observing whether those values appear in the cached response.
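The canary technique can be sketched in a few lines. This is not how Param Miner itself is implemented; it is a minimal illustration of the idea. `exampleBackend` is a hypothetical stand-in for an HTTP round trip (in practice this would be a client call against a staging URL, ideally with a cache-busting parameter so real users are never served a probe response), and `findReflectedHeaders` is an illustrative helper.

```javascript
// Hypothetical stand-in for an HTTP round trip: request headers in, body out.
function exampleBackend(headers) {
  const host = headers['x-forwarded-host'] || 'www.example.com';
  return `<link rel="canonical" href="https://${host}/">`;
}

// Send one request per candidate header with a unique canary value and
// report which headers are reflected in the response body.
function findReflectedHeaders(sendRequest, candidateHeaders) {
  const reflected = [];
  candidateHeaders.forEach((name, index) => {
    const canary = `canary${index}x${Date.now().toString(36)}`;
    const body = sendRequest({ [name]: canary });
    if (body.includes(canary)) {
      reflected.push(name);
    }
  });
  return reflected;
}

const candidates = [
  'x-forwarded-host',
  'x-forwarded-proto',
  'x-original-url',
  'x-host',
];

const reflected = findReflectedHeaders(exampleBackend, candidates);
console.log(reflected); // [ 'x-forwarded-host' ]
```

Reflection alone is not proof of a vulnerability: the header must also be absent from the cache key, which is confirmed by repeating the request without the header and checking whether the canary is still served.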
❌ Vulnerable: Common Unkeyed Input Patterns
// --- Pattern 1: X-Forwarded-Host in link/script tags ---
app.get('/page', (req, res) => {
  // ❌ X-Forwarded-Host is unkeyed but controls script source
  const host = req.headers['x-forwarded-host'] || 'cdn.example.com';
  res.send(`<script src="https://${host}/bundle.js"></script>`);
});

// --- Pattern 2: X-Forwarded-Scheme forcing redirect ---
app.use((req, res, next) => {
  // ❌ Unkeyed header triggers a redirect that gets cached
  const proto = req.headers['x-forwarded-proto'] || req.headers['x-forwarded-scheme'];
  if (proto === 'http') {
    // Cache stores a 301 redirect, so all users get redirected
    return res.redirect(301, `https://${req.headers.host}${req.url}`);
  }
  next();
});

// --- Pattern 3: X-Original-URL path override ---
// Some frameworks (e.g., older Symfony, IIS) support this header
app.use((req, res, next) => {
  // ❌ X-Original-URL overrides the parsed path
  // Cache keys on /public-page but backend serves /admin/secret
  if (req.headers['x-original-url']) {
    req.url = req.headers['x-original-url'];
  }
  next();
});

// --- Pattern 4: Unkeyed query parameter ---
// Some CDNs exclude certain query params from the cache key
app.get('/search', (req, res) => {
  // CDN config: cache key ignores "utm_*" and "cb" parameters
  // ❌ But the app reflects "cb" (cache buster) in the response
  const cb = req.query.cb || '';
  res.send(`<script>var cacheBust = "${cb}";</script>`);
  // Attacker: /search?cb=";alert(document.cookie)//
  // CDN caches this for /search (cb is excluded from key)
});

✅ Secure: Validating and Restricting Header Usage
const ALLOWED_HOSTS = new Set([
  'www.example.com',
  'cdn.example.com',
  'static.example.com',
]);

// ✅ Validate X-Forwarded-Host against an allow-list
function getValidatedHost(req) {
  const forwardedHost = req.headers['x-forwarded-host'];
  if (forwardedHost && ALLOWED_HOSTS.has(forwardedHost)) {
    return forwardedHost;
  }
  return 'www.example.com';
}

// ✅ Never reflect unvalidated headers in responses
app.get('/page', (req, res) => {
  const host = getValidatedHost(req);
  res.send(`<script src="https://${host}/bundle.js"></script>`);
});

// ✅ Use Vary header to tell caches about response-influencing headers
app.get('/localized', (req, res) => {
  const lang = req.headers['accept-language']?.split(',')[0] || 'en';
  res.set('Vary', 'Accept-Language');
  res.send(getLocalizedContent(lang));
});

// ✅ Reject unexpected override headers entirely
app.use((req, res, next) => {
  const dangerousHeaders = [
    'x-original-url',
    'x-rewrite-url',
    'x-forwarded-prefix',
  ];
  for (const header of dangerousHeaders) {
    if (req.headers[header]) {
      delete req.headers[header];
    }
  }
  next();
});

An application sets the Vary: Accept-Language header on its responses. How does this help prevent cache poisoning via the Accept-Language header?
5. Web Cache Deception
Web cache deception is the mirror image of cache poisoning. In poisoning, the attacker plants a malicious response in the cache so that all users receive it. In deception, the attacker tricks the cache into storing a victim's private response and then retrieves it. No payload is injected: the attacker lures a logged-in victim into requesting a URL that the cache treats as cacheable, and the stored response contains the victim's private data.
Cache Poisoning vs. Cache Deception
| Property | Cache Poisoning | Cache Deception |
|---|---|---|
| Attacker goal | Serve malicious content to all users | Steal a specific victim's private data |
| Who is affected? | All users who request the cached URL | The specific victim whose private response is cached (and later retrieved by the attacker) |
| Attack direction | Attacker → Cache → All victims | Victim → Cache → Attacker |
| Payload location | In the request (unkeyed header) | In the response (victim's session data) |
| Cache behavior exploited | Cache stores attacker-influenced response | Cache stores private response as if it were public |
| Typical prerequisite | Unkeyed input reflected in response | Path confusion between origin and cache (e.g., path normalization differences) |
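The path-confusion prerequisite in the last row can be modeled as two functions that disagree about the same path. Both rules below are assumptions made for illustration: a cache that decides cacheability from the file extension alone, and an origin that routes private pages by prefix. `STATIC_EXTENSIONS`, `PRIVATE_PREFIXES`, and the helper names are hypothetical.

```javascript
// Assumed cache rule: any path ending in a "static" extension is cacheable.
const STATIC_EXTENSIONS = ['.css', '.js', '.png', '.jpg', '.svg'];

function cacheTreatsAsStatic(path) {
  return STATIC_EXTENSIONS.some((ext) => path.toLowerCase().endsWith(ext));
}

// Assumed origin routing: private routes match by prefix, so extra
// path segments are ignored and the private page is still served.
const PRIVATE_PREFIXES = ['/account', '/dashboard'];

function originServesPrivatePage(path) {
  return PRIVATE_PREFIXES.some(
    (prefix) => path === prefix || path.startsWith(`${prefix}/`)
  );
}

// A path is a deception candidate when the cache considers it static
// while the origin answers it with private content.
function isDeceptionCandidate(path) {
  return cacheTreatsAsStatic(path) && originServesPrivatePage(path);
}

console.log(isDeceptionCandidate('/account/anything.css')); // true
console.log(isDeceptionCandidate('/account'));              // false
console.log(isDeceptionCandidate('/assets/main.css'));      // false
```

A check like this could be run over a list of known private routes during review to spot where the two layers interpret a URL differently.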
❌ Vulnerable: Web Cache Deception Attack
# Step 1: Attacker crafts a URL that tricks the cache
# The origin server serves /account (private, requires auth)
# But the cache sees the .css extension and treats it as static/cacheable

# Attacker sends this link to the victim (e.g., via email, chat):
https://example.com/account/anything.css

# Step 2: Victim clicks the link (they are logged in)
# Origin: path normalization ignores "/anything.css" → serves /account
# Cache: sees .css extension → caches the response as static content
#
# Origin response (private data):
# HTTP/1.1 200 OK
# Content-Type: text/html
# <h1>Welcome, Alice</h1>
# <p>Email: alice@company.com</p>
# <p>API Key: sk-abc123...</p>
#
# Cache stores this keyed on: /account/anything.css

# Step 3: Attacker requests the same URL (unauthenticated)
GET /account/anything.css
# Cache hit! Returns Alice's private account page
# Attacker now has Alice's email, API key, session tokens, etc.

✅ Secure: Preventing Cache Deception
// ✅ Defense 1: Always set Cache-Control on private responses
app.get('/account', requireAuth, (req, res) => {
  res.set({
    'Cache-Control': 'no-store, no-cache, must-revalidate, private',
    'Pragma': 'no-cache',
    'Vary': 'Cookie',
  });
  res.send(renderAccountPage(req.user));
});

// ✅ Defense 2: Strict path matching (reject unexpected path suffixes)
app.get('/account', requireAuth, (req, res) => {
  // Express matches '/account' exactly for this route; the explicit check
  // guards against the route later being widened to a prefix or wildcard
  if (req.path !== '/account' && req.path !== '/account/') {
    return res.status(404).send('Not Found');
  }
  res.send(renderAccountPage(req.user));
});

// ✅ Defense 3: CDN configuration (only cache specific paths)
// Cloudflare Page Rule / Cache Rule example:
// Match: *.example.com/assets/*
// Cache Level: Cache Everything
// Edge TTL: 1 day
//
// Everything else: default behavior (respect Cache-Control from origin)
//
// ❌ Do NOT use: "Cache Everything" on wildcard paths
// ❌ Do NOT cache based on file extension alone

// ✅ Defense 4: Validate Content-Type matches the extension
app.use((req, res, next) => {
  const originalSend = res.send.bind(res);
  res.send = (body) => {
    const ext = req.path.split('.').pop()?.toLowerCase();
    // A Content-Type that has not been set explicitly reads as '' here,
    // so such responses are rejected for the extensions listed below
    const contentType = res.get('Content-Type') || '';

    const extensionTypeMap = {
      css: 'text/css',
      js: 'application/javascript',
      json: 'application/json',
      png: 'image/png',
      jpg: 'image/jpeg',
    };

    // Object.hasOwn avoids matching inherited keys such as "constructor"
    if (ext && Object.hasOwn(extensionTypeMap, ext)) {
      if (!contentType.includes(extensionTypeMap[ext])) {
        res.status(404);
        return originalSend('Not Found');
      }
    }
    return originalSend(body);
  };
  next();
});

An attacker sends a victim a link to https://app.com/account/profile.css. The origin serves the /account page (ignoring the suffix), and the CDN caches it because of the .css extension. What type of attack is this?
6. Vulnerable Patterns in Code
During code review, look for patterns where request headers, cookies, or other inputs that are not part of the cache key influence the response. These are the injection points for cache poisoning.
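A first pass over a codebase can be automated with a plain text search. The sketch below is a deliberately naive illustration, not a substitute for a real static-analysis tool: `RISKY_HEADERS` and `findRiskyHeaderReads` are hypothetical names, and a match only means the line deserves a manual look.

```javascript
// Header names that commonly act as unkeyed inputs (illustrative list).
const RISKY_HEADERS = [
  'x-forwarded-host',
  'x-forwarded-proto',
  'x-forwarded-scheme',
  'x-original-url',
  'x-rewrite-url',
  'x-host',
];

// Return every source line that mentions one of the risky header names.
function findRiskyHeaderReads(source) {
  const findings = [];
  source.split('\n').forEach((line, index) => {
    const lower = line.toLowerCase();
    for (const header of RISKY_HEADERS) {
      if (lower.includes(header)) {
        findings.push({ line: index + 1, header, text: line.trim() });
      }
    }
  });
  return findings;
}

// Sample input: a small handler that reflects X-Forwarded-Host.
const sample = [
  "app.get('/', (req, res) => {",
  "  const host = req.headers['x-forwarded-host'] || req.headers.host;",
  "  res.send(`<script src=\"https://${host}/app.js\"></script>`);",
  "});",
].join('\n');

const findings = findRiskyHeaderReads(sample);
console.log(findings.length);    // 1
console.log(findings[0].line);   // 2
console.log(findings[0].header); // x-forwarded-host
```

Each finding still needs a human to trace whether the value reaches a cacheable response, which is what the patterns below illustrate.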
❌ Vulnerable: Framework-Level Patterns to Watch For
// --- Next.js: headers() in cached pages ---
// Calling headers() opts the route into dynamic rendering in Next.js,
// but a CDN or reverse proxy in front of the app may still cache the HTML

// app/page.tsx
import { headers } from 'next/headers';

export default async function Page() {
  const headersList = await headers();
  // ❌ Reading unkeyed header in a potentially cached page
  const host = headersList.get('x-forwarded-host') || 'example.com';
  const theme = headersList.get('x-theme') || 'light';

  return (
    <html data-theme={theme}>
      <head>
        {/* ❌ Unkeyed header reflected in a link tag */}
        <link rel="canonical" href={`https://${host}/`} />
      </head>
      <body>
        <script src={`https://${host}/analytics.js`} />
      </body>
    </html>
  );
}

# --- Ruby on Rails: request.host in cached views ---
# ❌ Rails uses X-Forwarded-Host to determine request.host
# config/environments/production.rb
# config.action_dispatch.trusted_proxies = :all # ❌ Trusts all proxies
#
# app/views/layouts/application.html.erb
# <script src="<%= "https://#{request.host}/packs/app.js" %>"></script>
#
# If the view is fragment-cached or page-cached, the host value
# from the first request is served to all subsequent users

❌ Vulnerable: Poisoning via Error Pages and Redirects
// --- Pattern: Cached error pages with reflected input ---
app.use((err, req, res, next) => {
  // ❌ Error page reflects the requested URL
  // If error pages are cached (common with CDN "always online" features)
  res.status(500).send(`
    <h1>Something went wrong</h1>
    <p>Error processing: ${req.url}</p>
    <p>Host: ${req.headers.host}</p>
  `);
});

// --- Pattern: Open redirect that gets cached ---
app.get('/redirect', (req, res) => {
  const target = req.query.url || '/';
  // ❌ If the CDN caches 301/302 redirects and excludes the query string
  // from the cache key, this redirect is served to every user of /redirect
  res.redirect(301, target);
});

// --- Pattern: JSONP callback in cached API response ---
app.get('/api/data', (req, res) => {
  const callback = req.query.callback;
  const data = JSON.stringify({ users: 100, revenue: 50000 });

  if (callback) {
    // ❌ If CDN caches /api/data without the callback param in the key
    res.type('application/javascript');
    res.send(`${callback}(${data})`);
    // Attacker: /api/data?callback=stealData
    // Cached JSONP response calls attacker's function
  } else {
    res.json(JSON.parse(data));
  }
});

✅ Secure: Safe Patterns for Cached Responses
// ✅ Pattern 1: Take the canonical host from configuration; never trust headers
const CANONICAL_HOST = process.env.CANONICAL_HOST || 'www.example.com';
const CDN_HOST = process.env.CDN_HOST || 'cdn.example.com';

app.get('/page', (req, res) => {
  res.send(`
    <link rel="canonical" href="https://${CANONICAL_HOST}/page">
    <script src="https://${CDN_HOST}/bundle.js"></script>
  `);
});

// ✅ Pattern 2: Separate dynamic and static cache rules
// Cache-Control for static assets (safe to cache)
app.use('/assets', express.static('public', {
  maxAge: '1y',
  immutable: true,
  setHeaders: (res) => {
    res.set('Cache-Control', 'public, max-age=31536000, immutable');
  },
}));

// Cache-Control for dynamic pages (only cache if safe)
app.get('/page', (req, res) => {
  res.set('Cache-Control', 'public, max-age=300, s-maxage=600');
  res.set('Vary', 'Accept-Encoding');
  // Only vary on headers that are actually keyed by the CDN
  res.send(renderPage());
});

// Cache-Control for private pages (never cache in shared caches)
app.get('/dashboard', requireAuth, (req, res) => {
  res.set('Cache-Control', 'private, no-store');
  res.send(renderDashboard(req.user));
});

// ✅ Pattern 3: Validate JSONP callbacks
const SAFE_CALLBACK = /^[a-zA-Z_$][a-zA-Z0-9_$]*$/;
app.get('/api/data', (req, res) => {
  const callback = req.query.callback;
  if (callback && !SAFE_CALLBACK.test(callback)) {
    return res.status(400).json({ error: 'Invalid callback name' });
  }
  // Vary only covers request headers, not query parameters, so either make
  // sure the CDN keys on "callback" or keep JSONP responses out of the cache
  if (callback) {
    res.set('Cache-Control', 'private, no-store');
  }
  // ...
});

A Next.js application reads X-Forwarded-Host from headers() to build canonical URLs in a statically cached page. What is the risk?
7. Code Review Defenses
Cache Poisoning Code Review Principles
1) Never trust unvalidated headers when constructing URLs, redirects, or script sources in cached responses.
2) Take canonical hosts from configuration such as environment variables; do not derive them from request headers.
3) Set appropriate Cache-Control headers: private or no-store for authenticated pages, and explicit public directives only for truly static content.
4) Use the Vary header to declare every response-influencing header so that caches key on it.
5) Strip or reject dangerous override headers (X-Original-URL, X-Rewrite-URL, X-Forwarded-Prefix) at the edge.
6) Configure CDN cache rules based on explicit paths, not file extensions.
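To make the Vary principle concrete, here is a minimal sketch of how a cache could fold the headers named in Vary into its lookup key. It is a simplification under stated assumptions: `varyKey` is a hypothetical helper, header values are compared verbatim without normalization, and some CDNs ignore or only partially honor Vary, so behavior should be confirmed against the provider's documentation.

```javascript
// Build a secondary cache key from the request headers named in Vary.
// requestHeaders is assumed to use lowercase header names.
function varyKey(baseKey, varyHeader, requestHeaders) {
  if (!varyHeader) return baseKey;
  const names = varyHeader
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);
  // "Vary: *" means no stored response can be reused for a later request.
  if (names.includes('*')) return null;
  const parts = names
    .sort()
    .map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return `${baseKey}|${parts.join('&')}`;
}

const base = 'GET https://example.com/localized';
const en = varyKey(base, 'Accept-Language', { 'accept-language': 'en-US' });
const de = varyKey(base, 'Accept-Language', { 'accept-language': 'de-DE' });

console.log(en);        // GET https://example.com/localized|accept-language=en-US
console.log(de);        // GET https://example.com/localized|accept-language=de-DE
console.log(en === de); // false
```

Because the two requests now map to different entries, a response generated for one Accept-Language value is not served to a client that sent another.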
✅ Nginx: Secure Cache Configuration
# ✅ Include all response-influencing headers in the cache key
proxy_cache_key "$scheme$request_method$host$request_uri$http_accept_language";

# ✅ Strip dangerous headers before forwarding to the origin
proxy_set_header X-Original-URL "";
proxy_set_header X-Rewrite-URL "";
proxy_set_header X-Forwarded-Prefix "";

# ✅ Only allow trusted values for X-Forwarded-Host
# (the map block belongs in the http context)
map $http_x_forwarded_host $validated_forwarded_host {
    "www.example.com" "www.example.com";
    "api.example.com" "api.example.com";
    default           "www.example.com";
}
proxy_set_header X-Forwarded-Host $validated_forwarded_host;

# ✅ Never cache responses with Set-Cookie (indicates a session)
proxy_no_cache $upstream_http_set_cookie;
# proxy_cache_bypass is evaluated before the upstream responds, so
# $upstream_http_set_cookie is still empty at that point; it is omitted here

# ✅ Fallback TTL, used only when the origin sends no Cache-Control or Expires
proxy_cache_valid 200 301 302 10m;
proxy_cache_use_stale error timeout updating;

# ✅ Add a header so you can verify caching behavior
add_header X-Cache-Status $upstream_cache_status always;

✅ Express Middleware: Cache Safety Headers
// ✅ Middleware that enforces safe cache headers
function cacheControlMiddleware(req, res, next) {
  // Default: don't cache anything
  res.set('Cache-Control', 'no-store');

  // Override per-route:
  // - Static assets: public, long TTL
  // - Public pages: short TTL with validation
  // - Private pages: no-store (default)
  next();
}

// ✅ Middleware to strip dangerous headers before they reach the app
function stripDangerousHeaders(req, res, next) {
  const headersToStrip = [
    'x-original-url',
    'x-rewrite-url',
    'x-forwarded-prefix',
    'x-host',
    'x-forwarded-server',
  ];

  for (const header of headersToStrip) {
    delete req.headers[header];
  }

  // ✅ Validate X-Forwarded-Host if present
  const forwardedHost = req.headers['x-forwarded-host'];
  if (forwardedHost) {
    const allowedHosts = process.env.ALLOWED_HOSTS?.split(',') || [];
    if (!allowedHosts.includes(forwardedHost)) {
      delete req.headers['x-forwarded-host'];
    }
  }

  next();
}

// ✅ Apply to all routes
app.use(stripDangerousHeaders);
app.use(cacheControlMiddleware);

✅ Automated Cache Poisoning Tests
const request = require('supertest');
const app = require('../app'); // assumed path to the Express app under test

describe('Cache poisoning prevention', () => {
  // Values are unique canaries so that ordinary page content
  // (for example, the "http" inside an https:// URL) cannot cause false failures
  const UNKEYED_HEADERS = [
    ['X-Forwarded-Host', 'evil.com'],
    ['X-Forwarded-Scheme', 'nothttps'],
    ['X-Original-URL', '/canary-original-url'],
    ['X-Rewrite-URL', '/canary-rewrite-url'],
    ['X-Forwarded-Proto', 'canaryproto'],
    ['X-Forwarded-Prefix', '/canary-prefix'],
    ['X-Host', 'evil.com'],
  ];

  for (const [header, value] of UNKEYED_HEADERS) {
    it(`should not reflect ${header} in the response body`, async () => {
      const response = await request(app)
        .get('/')
        .set(header, value);

      // ✅ Verify the header value does NOT appear in the response
      expect(response.text).not.toContain(value);
      expect(response.text).not.toContain('evil.com');
    });
  }

  it('should set Cache-Control: no-store on authenticated pages', async () => {
    const response = await request(app)
      .get('/dashboard')
      .set('Authorization', 'Bearer valid_token');

    expect(response.headers['cache-control']).toMatch(/no-store|private/);
  });

  it('should not cache responses with Set-Cookie', async () => {
    const response = await request(app)
      .post('/login')
      .send({ username: 'user', password: 'pass' });

    if (response.headers['set-cookie']) {
      expect(response.headers['cache-control']).toMatch(/no-store|private/);
    }
  });

  it('should not serve private content at path with static extension', async () => {
    const response = await request(app)
      .get('/account/anything.css')
      .set('Authorization', 'Bearer valid_token');

    // Should be 404, not the account page
    expect(response.status).toBe(404);
  });
});

Which of the following is the most effective defense against both cache poisoning and cache deception?