DOM Manipulation Attacks Code Review Guide
Introduction
DOM manipulation attacks exploit client-side JavaScript that reads untrusted data and feeds it to a DOM API that interprets markup or executes code. The vulnerability lives entirely in the browser — the server may never see the payload at all — which is why server-side input validation is structurally incapable of stopping these bugs. If the sink is reachable from a source, the page is exploitable.
Traditional cross-site scripting is usually described as reflected (server echoes a parameter) or stored (server persists and later renders a payload). DOM-based attacks are a separate class: the unsafe transformation happens in front-end code. The payload can arrive via location.hash, a postMessage event, window.name, or localStorage, and never touch a server log, a WAF, or a backend template engine.
Once JavaScript runs in the victim's origin, the attacker effectively owns the session. They can read any cookie not marked HttpOnly, exfiltrate tokens from localStorage, fire privileged XHRs as the user, modify the DOM to phish credentials, and forge authenticated state-changing requests that bypass CSRF defenses. The blast radius is identical to stored XSS; the attack vector is just harder to see because it hides inside the client bundle.
The browser is an evaluator
Every time you hand a string to innerHTML, document.write, eval, setTimeout(string), or location.href, you are asking the browser to parse or execute that string. These APIs are eval-shaped. A reviewer should treat every unsafe sink like a call into a language interpreter and demand proof that the input is trusted.
The DOM Attack Surface
Every DOM attack fits a source→sink shape. A source is any browser API that returns attacker-controllable data. A sink is any API that interprets that data as markup, URL, or code. Your code review job is to find unbroken paths from one to the other.
Figure: DOM Attack Surface: Sources Flow Into Sinks
Sources: location.hash, location.search, document.referrer, window.name, postMessage event.data, localStorage, form inputs
Sinks: el.innerHTML =, el.outerHTML =, document.write(), eval(), new Function(), setTimeout(str), location.href =
Reviewer heuristic: grep for sinks first, then walk upstream through assignments and function arguments until you reach a source. Every unbroken path is a finding.
Sources are easy to enumerate — they are browser-provided globals and properties. Anything the attacker can influence without authentication is in scope: the URL (hash, search, pathname), referrer, window name, message events, storage APIs, cookies, and form inputs. Even seemingly safe sources like the page's own document.title are tainted if they were ever derived from another source.
Common DOM Sinks and Their Interpretation
| Sink | Context | What gets interpreted |
|---|---|---|
| element.innerHTML = x | HTML | Full HTML parser incl. event handlers |
| element.outerHTML = x | HTML | Same as innerHTML, replaces element |
| element.insertAdjacentHTML(pos, x) | HTML | Parsed as HTML, inserted as siblings |
| document.write(x) | HTML | Pre-load: full HTML; post-load: replaces document |
| el.setAttribute("href", x) | URL | javascript: URLs execute on click |
| el.setAttribute("srcdoc", x) | HTML | Iframe document body |
| eval(x) | JavaScript | Full JS execution |
| new Function(x) | JavaScript | Full JS execution |
| setTimeout(x, n) / setInterval(x, n) | JavaScript | String arg is evaled |
| location.href = x / location.assign(x) | URL | javascript: URLs execute |
| el.innerHTML += x | HTML + mXSS | Re-serializes and re-parses — extra risk |
| jQuery $(x) / .html(x) | HTML | Calls innerHTML internally |
| React dangerouslySetInnerHTML | HTML | React escape hatch that bypasses JSX escaping |
| Vue v-html | HTML | Directive that calls innerHTML |
The reviewer heuristic is simple: grep the codebase for every sink on this list, then for each hit, trace the value back through variable assignments, function calls, template literals, and property reads until you either reach a constant (safe) or a source (finding). Static analysis tools automate this trace, but understanding the method is what lets you review the paths the tools miss.
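The grep step can be sketched as a small script. This is a minimal first-pass scanner, not a taint analysis: the helper name findSinkCandidates and the pattern list are assumptions made for illustration, and the regexes only flag lines that mention a sink so a reviewer knows where to start tracing.

```javascript
// Hypothetical first-pass scanner: flags lines that mention a known DOM sink.
// The patterns mirror the sink table; they do not prove that a value is tainted.
const SINK_PATTERNS = [
  { name: 'innerHTML/outerHTML assignment', regex: /\.(?:inner|outer)HTML\s*\+?=(?!=)/ },
  { name: 'insertAdjacentHTML', regex: /\.insertAdjacentHTML\s*\(/ },
  { name: 'document.write', regex: /\bdocument\.write(?:ln)?\s*\(/ },
  { name: 'eval', regex: /\beval\s*\(/ },
  { name: 'new Function', regex: /\bnew\s+Function\s*\(/ },
  // Only catches string literals; a variable holding a string needs a manual trace.
  { name: 'timer with string literal', regex: /\bset(?:Timeout|Interval)\s*\(\s*['"`]/ },
  { name: 'location assignment', regex: /\blocation(?:\.href)?\s*=(?!=)|\blocation\.assign\s*\(/ },
  { name: 'framework escape hatch', regex: /dangerouslySetInnerHTML|v-html|\{@html|bypassSecurityTrustHtml/ },
];

function findSinkCandidates(sourceText) {
  const findings = [];
  sourceText.split('\n').forEach((line, index) => {
    for (const { name, regex } of SINK_PATTERNS) {
      if (regex.test(line)) {
        findings.push({ line: index + 1, sink: name, text: line.trim() });
      }
    }
  });
  return findings;
}
```

Each hit is only a starting point: the reviewer still has to walk upstream from the reported line to decide whether the value is a constant or comes from a source.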
DOM Clobbering
DOM clobbering is an attack in which an attacker who can inject even sanitized HTML (no <script>, no event handlers) uses id and name attributes to shadow JavaScript variables and object properties. Browsers expose every element with an id as a named property of window, and <form>, <img>, <embed>, and <object> elements with a name are exposed by that name on both window and document. When several elements share an id, the lookup returns an HTMLCollection whose entries can in turn be reached by their own id or name. Injected HTML can therefore hijack lookups that the application assumed were its own globals.
Consider a common pattern: front-end code checks whether a config object is present before applying defaults.
Vulnerable: lookup assumes a JS global
// The author expects `config` to be set by a previous <script> tag.
// If it's missing, fall back to a default.
const config = window.config || { apiBase: '/api', trackingPixel: null };

// Later the trackingPixel URL is written to the DOM:
const img = document.createElement('img');
img.src = config.trackingPixel || '/default.png';
document.body.appendChild(img);

If any user-controlled HTML ends up in the page, even through a sanitizer that allows <a> tags, an attacker can make window.config resolve to DOM elements by clobbering it:
Attack payload — no script tags, no event handlers
<!-- id, name, and an https: href pass typical HTML sanitizers -->
<a id="config"></a>
<a id="config" name="trackingPixel" href="https://attacker.example/pixel.png"></a>

When the page reads window.config, it no longer gets undefined; it gets an HTMLCollection of the two anchors. Property access config.trackingPixel returns the second anchor by its name attribute, and assigning that element to img.src stringifies it to its href, so the page now loads an attacker-chosen URL. An <img> will not execute a javascript: URL, so with this sink the impact is an attacker-controlled request; if the same clobbered value reached script.src or location.href, it would become code execution. The sanitizer never saw an event handler; the attack works entirely through identifier shadowing.
Safer: look up config explicitly on a namespaced object
// A namespace narrows the attack surface but does not remove it:
// <form id="__APP__"><input name="config"></form> still makes
// window.__APP__.config resolve to a DOM element.
// Accept only plain objects, never DOM nodes or collections.
const isPlainObject = (value) =>
  value !== null &&
  typeof value === 'object' &&
  Object.getPrototypeOf(value) === Object.prototype;

const app = window.__APP__;
const config =
  isPlainObject(app) && isPlainObject(app.config)
    ? app.config
    : { apiBase: '/api', trackingPixel: null };

// Always validate types before use:
const src = typeof config.trackingPixel === 'string' ? config.trackingPixel : '/default.png';
// And always validate URL schemes (see Prevention section).

Allowlist sanitizers are not enough
Most HTML sanitizers strip <script> and on* handlers but freely pass <a id>, <form name>, <img id>, and similar. Those are the entire attack surface for DOM clobbering. If your app ever reads a global by bare name (window.foo, document.foo, or implicit lookups of foo in an inline script), treat it as clobberable.
HTML Injection via DOM Sinks
The single most common DOM bug is untrusted data reaching an HTML-parsing sink. The canonical sink is innerHTML, but every API that parses markup is equally dangerous.
Vulnerable: search term rendered with innerHTML
// A "search results" heading that echoes the user's query.
const query = new URLSearchParams(location.search).get('q') ?? '';
document.getElementById('results-header').innerHTML =
  'Results for: ' + query;

// Attacker visits /search?q=<img src=x onerror=alert(1)>
// Browser parses the string, creates an <img>, fires onerror, runs JS.

Anything that parses HTML belongs in the same category: outerHTML, insertAdjacentHTML, document.write, and every framework wrapper that delegates to them.
Framework escape hatches to review with suspicion
// jQuery: the $(html) and .html() shortcuts parse their argument as HTML.
$('#header').html(user.displayName); // XSS if displayName is attacker-controlled

// React: the name dangerouslySetInnerHTML is itself the warning.
<div dangerouslySetInnerHTML={{ __html: comment.bodyHtml }} /> // Trust boundary

// Vue: v-html is identical in effect.
<div v-html="comment.bodyHtml"></div>

// Angular: bypassSecurityTrustHtml disables the sanitizer.
this.trustedHtml = this.sanitizer.bypassSecurityTrustHtml(raw);

// Svelte: {@html raw} is Svelte's version of the same escape hatch.
{@html user.bio}

These APIs exist because sometimes you really do need to render HTML (e.g., the output of a markdown renderer). That is fine, but only when the string has been sanitized by a library whose only job is to sanitize, and whose rules you have audited. Never sanitize with a regex; regular expressions cannot parse HTML the way a browser does.
Safe: textContent when you want text, createElement when you want structure
import DOMPurify from 'dompurify';

// Case 1: You just want to show a string. Use textContent.
const header = document.getElementById('results-header');
header.textContent = 'Results for: ' + query; // cannot inject markup

// Case 2: You need structure and attributes. Use the DOM builder API.
const img = document.createElement('img');
img.src = '/thumbs/' + encodeURIComponent(thumbId); // percent-encoded under a fixed path, not sanitized
img.alt = userProvidedAlt; // setAttribute would also work
document.body.appendChild(img);

// Case 3: You must render HTML from untrusted markdown/rich text.
const clean = DOMPurify.sanitize(comment.bodyHtml, {
  USE_PROFILES: { html: true },
  FORBID_TAGS: ['style'],
  FORBID_ATTR: ['style'],
});
el.innerHTML = clean; // clean was produced by a sanitizer, not by string concat

Attributes are a sink too
setAttribute("href", userInput) and setAttribute("src", userInput) are not safe just because they skip the HTML parser. URL attributes accept javascript: URLs, which execute when the link is clicked or the iframe loads. Do not rely on a string-prefix check, which leading whitespace or mixed case can defeat: parse user-supplied URLs with new URL() and confirm that the protocol is http: or https: before assignment.
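A scheme check is more robust when it runs on a parsed URL than on the raw string. The sketch below is one way to do that; the helper name toSafeUrl and the two-entry allowlist are assumptions for illustration, and an app that needs mailto: or other schemes would widen the set deliberately.

```javascript
// Assumed allowlist; adjust to the schemes your app actually needs.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

// Hypothetical helper: returns a normalized absolute URL string, or null if rejected.
function toSafeUrl(untrusted, baseUrl) {
  let parsed;
  try {
    // The URL parser strips leading whitespace and embedded tabs/newlines and
    // lowercases the scheme, so obfuscated "javascript:" variants are normalized first.
    parsed = new URL(String(untrusted), baseUrl);
  } catch {
    return null; // not parseable as a URL at all
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.href : null;
}
```

Typical use is `const href = toSafeUrl(userHref, location.href); if (href !== null) a.href = href;`. Note that this validates the scheme only: a protocol-relative value such as //evil.example/x still resolves to an https: URL on another host, so host restrictions, if needed, are a separate check.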
DOM-Based XSS: Source-to-Sink Tracing
The most reliable way to find DOM-based XSS in a review is the sink-first trace. You don't try to enumerate every possible attacker input — there are too many, and the framework may rename them. Instead, you locate every dangerous sink and walk upstream, asking for each read: "could this value contain attacker-controlled data?" You stop when you hit a literal constant (safe) or a source (finding).
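The upstream walk can be modeled as a small recursive search. This is a toy sketch under stated assumptions: the flow map is written by hand here (a real tool derives it from the AST), and the function name traceToSources and the three node kinds are invented for illustration.

```javascript
// Hypothetical data-flow model: each name maps to how its value is produced.
// kind is 'literal' (constant), 'source' (attacker-controllable),
// or 'derived' (built from the names listed in `from`).
function traceToSources(flows, name, path = [], seen = new Set()) {
  if (seen.has(name)) return [];                   // cyclic assignment: stop
  const here = [...path, name];
  const node = flows.get(name);
  if (!node) return [[...here, '(unresolved)']];   // cannot prove it safe: report it
  if (node.kind === 'source') return [here];       // unbroken path: finding
  if (node.kind === 'literal') return [];          // constant: safe, stop walking
  const nextSeen = new Set(seen).add(name);
  return node.from.flatMap((parent) => traceToSources(flows, parent, here, nextSeen));
}

// Flow map for a heading built as 'Results for: ' + query.
const flows = new Map([
  ['innerHTML value', { kind: 'derived', from: ['prefix', 'query'] }],
  ['prefix', { kind: 'literal' }],
  ['query', { kind: 'derived', from: ['location.search'] }],
  ['location.search', { kind: 'source' }],
]);

const findings = traceToSources(flows, 'innerHTML value');
// findings: [['innerHTML value', 'query', 'location.search']]
```

Every returned path reads from sink to source, which is the same order a reviewer writes the trace in.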
Here is a realistic example that looks harmless at first glance.
A hash-based tab router
// Single-page app uses the URL fragment to select a tab.
function renderTab() {
  const tabId = decodeURIComponent(location.hash.slice(1)) || 'overview';
  const template = tabTemplates[tabId] ?? tabTemplates.overview;
  document.getElementById('tab-content').innerHTML = template(tabId);
}
window.addEventListener('hashchange', renderTab);
renderTab();

// ...elsewhere in the codebase:
const tabTemplates = {
  overview: (id) => '<h2>' + id + '</h2><p>Welcome.</p>',
  details: (id) => '<h2>' + id + '</h2><p>Details view.</p>',
};

Work the trace backwards from the sink. Write it out like a proof: every step is either "safe because..." or "tainted because..."
- Sink: document.getElementById(...).innerHTML = template(tabId). Parses HTML. Need to prove the RHS is trusted.
- RHS: template(tabId). The template concatenates strings into <h2>...</h2>. Any reachable id value is spliced into HTML context unescaped.
- template argument: tabId. Where does it come from? It is defined at the top of renderTab.
- tabId definition: decodeURIComponent(location.hash.slice(1)) || 'overview'. Right side of the || is a literal. Left side starts with location.hash, a source.
- Finding: attacker-controlled location.hash flows through decodeURIComponent (which is not a sanitizer; it decodes) into an HTML-context string concat, which reaches innerHTML. An unknown tabId falls back to the overview template but is still passed to it as id.
- Proof-of-concept: https://app.example.com/#<img src=x onerror=alert(1)>.
- Remediation: use textContent for the heading, or DOMPurify if HTML is required. Validate tabId against the known keys before any interpolation.
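The "validate against the known keys" step might look like the sketch below. The function name resolveTabId and the fallback parameter are assumptions for illustration; the important detail is the own-property check, which keeps inherited names such as constructor from being accepted as tab ids.

```javascript
// Hypothetical fix for the lookup step: only ids that are own keys of the
// template map are accepted; everything else collapses to a fixed default.
function resolveTabId(rawHash, templates, fallback = 'overview') {
  let candidate;
  try {
    candidate = decodeURIComponent(rawHash.replace(/^#/, ''));
  } catch {
    return fallback; // malformed percent-encoding throws URIError
  }
  return Object.prototype.hasOwnProperty.call(templates, candidate)
    ? candidate
    : fallback;
}
```

After this check the id is one of a fixed set of constants, so the trace from innerHTML ends at literals instead of at location.hash. Setting the heading with textContent is still the more direct fix.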
The subtle lesson: decodeURIComponent looks like it does something security-relevant because it has a complicated name — but it is a decoder, not a sanitizer. Many reviewers mentally discount a value once they see "some transformation" applied, even when the transformation doesn't remove the dangerous characters.
Common almost-safe functions that are NOT sanitizers
decodeURIComponent, encodeURIComponent (a URL-component encoder: it percent-encodes < and >, but leaves ' ( ) ! * untouched and is the wrong encoding for HTML attribute or JavaScript contexts), JSON.parse, atob / btoa, String(x), and .trim(). None of them is a contextual HTML sanitizer, and none makes a value safe for innerHTML.
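A quick experiment makes the point concrete. The snippet below is only a demonstration harness (the names transforms, stillContainsMarkup, and survivors are invented here): it pushes an attacker-shaped string through each transformation and checks whether a tag opener survives.

```javascript
const payload = '<img src=x onerror=alert(1)>';

// Each entry applies one "almost-safe" transformation to attacker-shaped input.
const transforms = {
  decodeURIComponent: () => decodeURIComponent('%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E'),
  'JSON.parse': () => JSON.parse(JSON.stringify(payload)),
  atob: () => atob(btoa(payload)),
  String: () => String(payload),
  trim: () => `  ${payload}  `.trim(),
};

// Crude check used only for this demonstration: is there still a tag opener?
function stillContainsMarkup(text) {
  return /<[a-z]/i.test(text);
}

const survivors = Object.entries(transforms)
  .filter(([, run]) => stillContainsMarkup(run()))
  .map(([name]) => name);
// survivors lists all five transformations: the markup passes through each one.

// encodeURIComponent does encode < and >, but quotes and parentheses pass through,
// which matters in single-quoted attribute or JavaScript string contexts.
const encoded = encodeURIComponent("');alert(1);//");
// encoded === "')%3Balert(1)%3B%2F%2F"
```

None of these outputs has been through an HTML sanitizer, so seeing one of these calls in a trace should not end the trace.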
Prevention Techniques
Prevention is layered. The outer layer is API choice: prefer DOM APIs that do not parse HTML. The middle layer is sanitization for the rare cases where you truly must render HTML. The inner, browser-enforced layer is Trusted Types plus CSP, which turns a forgotten innerHTML = somewhere in a 500-file app from a shipped vulnerability into a runtime exception.
Layer 1: prefer APIs that do not parse
// Text content? Always textContent.
el.textContent = userProvidedString;

// Building structured DOM? Use createElement + setAttribute.
const a = document.createElement('a');
a.textContent = linkLabel; // safe
// For URL attributes, validate scheme:
const url = new URL(userHref, location.origin);
if (url.protocol === 'http:' || url.protocol === 'https:') {
  a.href = url.href;
}

// Setting classes or data? Dedicated APIs exist.
el.classList.add('active');
el.dataset.userId = String(userId);

Layer 2: DOMPurify with a strict configuration
import DOMPurify from 'dompurify';

// Start with a narrow allowlist. Widen only with justification.
const clean = DOMPurify.sanitize(dirty, {
  ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p', 'br', 'ul', 'ol', 'li', 'code', 'pre'],
  ALLOWED_ATTR: ['href', 'title'],
  ALLOW_DATA_ATTR: false,
  FORBID_TAGS: ['style', 'form', 'svg', 'math'], // mXSS vectors
  FORBID_ATTR: ['style', 'onerror', 'onload'],
  ALLOWED_URI_REGEXP: /^(?:(?:https?|mailto):|\/)/i,
});
el.innerHTML = clean;

// Keep DOMPurify up to date. Its version history is a history of mXSS bypasses.

Layer 3: Trusted Types (the browser refuses strings at sinks)
// Define a single sanitizer policy. Once enforcement is on, innerHTML
// assignments must receive a TrustedHTML object produced by an allowed
// policy, or the browser throws a TypeError at the sink.
if (window.trustedTypes && trustedTypes.createPolicy) {
  const policy = trustedTypes.createPolicy('app-html', {
    // createHTML returns a plain string; the policy wraps it in TrustedHTML.
    createHTML: (dirty) => DOMPurify.sanitize(dirty),
  });

  el.innerHTML = policy.createHTML(userMarkdown);
} else {
  // Browsers without Trusted Types: still sanitize before the sink.
  el.innerHTML = DOMPurify.sanitize(userMarkdown);
}

Layer 3 (cont.): enforce via Content-Security-Policy header
Content-Security-Policy: require-trusted-types-for 'script'; trusted-types app-html dompurify default;

# Effect:
# - innerHTML, outerHTML, insertAdjacentHTML, document.write, eval, etc.
#   reject raw strings (DOM sinks throw a TypeError).
# - Only TrustedHTML / TrustedScript / TrustedScriptURL values are accepted.
# - DOMPurify registers its own policy for internal parsing, named
#   "dompurify" in current releases; confirm the name against the version
#   you ship, or pass your own policy via its TRUSTED_TYPES_POLICY option.
# - This catches every third-party widget, legacy code path, and
#   forgotten template helper that was still calling innerHTML with user data.

URL attributes need their own allowlist
Trusted Types guards HTML and script sinks; assigning a string to an href, an iframe src, or a form action is not one of them, so a javascript: URL can still be written into the DOM. You still need scheme validation: check new URL(value).protocol against an explicit list of http:, https:, and mailto: (or just the ones your app needs). Reject or strip everything else before assignment.
Frameworks each have a preferred-safe and a dangerous path. Know both for every framework in the repo.
- React: JSX auto-escapes text children and attribute values. Never use dangerouslySetInnerHTML without sanitization; never build href with string concat. Compute the URL with new URL() and validate the protocol.
- Vue: {{ mustache }} and v-bind escape by default. v-html is the escape hatch; treat it like dangerouslySetInnerHTML.
- Angular: the DomSanitizer sanitizes by default when binding to [innerHTML]. bypassSecurityTrustHtml and friends disable the sanitizer; audit every call.
- Svelte: {value} escapes. {@html value} is the escape hatch.
- Solid / Lit / Preact: same pattern, with default-safe template syntax and one named dangerous API. Grep for the dangerous API across the whole repo on every security review.
Prevention checklist
1) Default to textContent, never innerHTML, for plain strings.
2) For rich text, sanitize with DOMPurify using a narrow allowlist, and keep the library current.
3) Validate all URL attributes against an explicit scheme allowlist.
4) Turn on Trusted Types with require-trusted-types-for 'script' in CSP.
5) Namespace your globals (window.__APP__.config) and accept them only after a type check, so DOM clobbering cannot substitute an element for them.
6) Ban inline event handlers and inline scripts with CSP.
7) Lint for unsafe sinks in CI.
8) Audit every framework escape hatch (dangerouslySetInnerHTML, v-html, {@html}, bypassSecurityTrustHtml) in every PR.
Security Tools & Linters
DOM sink review is something static tools do well. Layer a static check that fails the build with a runtime check that blocks the unsafe assignment in the browser.
Tooling by layer
| Tool | Layer | What it catches |
|---|---|---|
| eslint-plugin-no-unsanitized | Lint (dev) | innerHTML, outerHTML, insertAdjacentHTML, document.write with non-literal arguments (the checked sinks are configurable) |
| @microsoft/eslint-plugin-sdl | Lint (dev) | no-inner-html, no-document-write, no-html-method, no-insecure-url |
| Semgrep (javascript.browser.security) | CI | Source→sink taint flows, framework-specific dangerous APIs |
| CodeQL (js/xss-through-dom) | CI | Full dataflow analysis across files, catches multi-step taint paths |
| DOMPurify | Runtime | HTML sanitization for cases where markup is required |
| Trusted Types polyfill + CSP | Runtime (browser) | Forces every sink to receive a TrustedHTML; throws on raw strings |
| Browser devtools (Issues tab) | Dev-time | Reports Trusted Types and CSP violations during development |
Here is a drop-in Semgrep rule for catching the most common sink:
semgrep-rules/dom-innerhtml.yaml
rules:
  - id: no-user-input-to-innerhtml
    message: >-
      Assignment to .innerHTML from a non-literal value. Use textContent for
      plain strings, or DOMPurify.sanitize() for rich HTML.
    severity: ERROR
    languages: [javascript, typescript]
    patterns:
      - pattern-either:
          - pattern: $EL.innerHTML = $VAL
          - pattern: $EL.outerHTML = $VAL
          - pattern: $EL.insertAdjacentHTML($POS, $VAL)
      - pattern-not: $EL.innerHTML = "..."
      - pattern-not: $EL.innerHTML = `...`
      - pattern-not: $EL.innerHTML = DOMPurify.sanitize(...)
      - pattern-not: $EL.innerHTML = $POLICY.createHTML(...)

ESLint config for a React app: the lines that matter are the few sink-related rules. Add them to every frontend repo, not just the ones you think are security-sensitive.
.eslintrc (excerpt)
{
  "plugins": ["no-unsanitized", "@microsoft/sdl", "react"],
  "rules": {
    "no-unsanitized/method": "error",
    "no-unsanitized/property": "error",
    "@microsoft/sdl/no-inner-html": "error",
    "@microsoft/sdl/no-document-write": "error",
    "@microsoft/sdl/no-insecure-url": "error",
    "react/no-danger": "warn"
  }
}