New Attack Technique ‘Sleepy Pickle’ Targets Machine Learning Models
The security risks posed by the Pickle format have once again come to the fore with the discovery of a new “hybrid machine learning (ML) model exploitation technique” dubbed Sleepy Pickle.
The attack method, per Trail of Bits, weaponizes the ubiquitous format used to package and distribute machine learning (ML) models to corrupt the model itself, posing a severe supply chain risk to an organization’s downstream customers.
“Sleepy Pickle is a stealthy and novel attack technique that targets the ML model itself rather than the underlying system,” security researcher Boyan Milanov said.
While pickle is a widely used serialization format by ML libraries like PyTorch, it can be used to carry out arbitrary code execution attacks simply by loading a pickle file (i.e., during deserialization).
“We suggest loading models from users and organizations you trust, relying on signed commits, and/or loading models from [TensorFlow] or Jax formats with the from_tf=True auto-conversion mechanism,” Hugging Face points out in its documentation.
Sleepy Pickle works by inserting a payload into a pickle file using open-source tools like Fickling, and then delivering it to a target host by using one of the four techniques such as an adversary-in-the-middle (AitM) attack, phishing, supply chain compromise, or the exploitation of a system weakness.
“When the file is deserialized on the victim’s system, the payload is executed and modifies the contained model in-place to insert backdoors, control outputs, or tamper with processed data before returning it to the user,” Milanov said.
Put differently, the payload injected into the pickle file containing the serialized ML model can be abused to alter model behavior by tampering with the model weights, or tampering with the input and output data processed by the model.
In a hypothetical attack scenario, the approach could be used to generate harmful outputs or misinformation that can have disastrous consequences to user safety (e.g., drink bleach to cure flu), steal user data when certain conditions are met, and attack users indirectly by generating manipulated summaries of news articles with links pointing to a phishing page.
Trail of Bits said that Sleepy Pickle can be weaponized by threat actors to maintain surreptitious access on ML systems in a manner that evades detection, given that the model is compromised when the pickle file is loaded in the Python process.
This is also more effective than directly uploading a malicious model to Hugging Face, as it can modify model behavior or output dynamically without having to entice their targets into downloading and running them.
“With Sleepy Pickle attackers can create pickle files that aren’t ML models but can still corrupt local models if loaded together,” Milanov said. “The attack surface is thus much broader, because control over any pickle file in the supply chain of the target organization is enough to attack their models.”
“Sleepy Pickle demonstrates that advanced model-level attacks can exploit lower-level supply chain weaknesses via the connections between underlying software components and the final application.”
From Sleepy Pickle to Sticky Pickle
Sleepy Pickle is not the only attack to be demonstrated by Trail of Bits, for the cybersecurity firm said it could be improved to achieve persistence in a compromised model and ultimately evade detection – a technique referred to as Sticky Pickle.
This variant “incorporates a self-replicating mechanism that propagates its malicious payload into successive versions of the compromised model,” Milanov said. “Additionally, Sticky Pickle uses obfuscation to disguise the malicious code to prevent detection by pickle file scanners.”
In doing so, the exploit remains persistent even in scenarios where a user opts to modify a compromised model and redistribute it using a new pickle file that’s beyond the attacker’s control.
To secure against Sleepy Pickle and other supply chain attacks, it’s advised to avoid using pickle files to distribute serialized models and only use models from trusted organizations and rely on safer file formats like SafeTensors.