Defending and attacking ML Malware Classifiers for Fun and Profit: 2x prize winner at MLSEC-2021

MLSEC (Machine Learning Security Evasion Competition) is an initiative sponsored by Microsoft and partners CUJO AI, NVIDIA, VMRay, and MRG Effitas with the purpose of raising awareness of the expanding attack surface which is now also affecting AI-powered systems.

In its 3rd edition the competition allowed defenders and attackers to exercise their security and machine learning skills under a plausible threat model: evading antimalware and anti-phishing filters. In the competition, defenders aimed to detect evasive submissions by using machine learning (ML), and attackers attempted to circumvent those detections.

The anti-malware track had two parts. The defensive part focused on building anti-malware models able to withstand adversarial attacks while meeting certain FPR/TPR criteria. In the offensive part, competitors had to bypass the defender models by modifying a set of 50 malware samples provided by the organizers in a way that would still run and produce the same IOCs in a sandboxed environment.
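To make the FPR/TPR constraint concrete, here is a minimal sketch of how a defender could pick a decision threshold under a false-positive budget. The scores, labels and the `max_fpr` values are made up for illustration and are not the competition's actual limits.

```python
def rates(scores, labels, threshold):
    """Return (FPR, TPR) when a score >= threshold is flagged as malware."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives


def pick_threshold(scores, labels, max_fpr):
    """Lowest threshold (hence highest TPR) whose FPR stays within max_fpr."""
    best = None
    # FPR can only grow as the threshold drops, so stop at the first violation.
    for t in sorted(set(scores), reverse=True):
        fpr, tpr = rates(scores, labels, t)
        if fpr > max_fpr:
            break
        best = (t, fpr, tpr)
    return best


# Hypothetical validation scores; label 1 = malware, 0 = clean.
scores = [0.05, 0.10, 0.20, 0.40, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

strict = pick_threshold(scores, labels, max_fpr=0.0)
relaxed = pick_threshold(scores, labels, max_fpr=0.2)
print("strict:", strict)
print("relaxed:", relaxed)
```

A tighter FP budget pushes the threshold up and costs detections, which is exactly the trade-off adversarial samples try to exploit.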

Defensive anti-malware track

PE files have a very large feature space to defend and (even with droppers disallowed this year) it was obvious to me that any malware detection model that I could train with reasonable computing power and public training resources would be relatively easy to bypass.

Taking into account that the feedback loop between submitting a modified malware sample to the MLSEC backend and receiving the execution sandbox results involved a considerable delay (around 1h), and knowing that defeating models in a black-box setting would require many queries, I decided to focus on slowing down certain attack paths, aiming for deterrence rather than model robustness. The goal was to discourage attackers from trying to evade my system, on the hypothesis that they would spend their resources on easier targets first.

In order to do that I placed several layers of defense in the submitted "A1" solution:

First, traditional approaches were deployed, such as ensembles trained on SOREL and EMBER datasets using gradient boosted trees. Additional feature engineering was added targeting packers and concealed binaries.

Second, PE anomaly detection rules would look for suspicious data in sections and headers that are usually manipulated during adversarial tests.
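As an illustration of what such rules can look like, the sketch below parses the PE section table with the standard library and flags three common oddities. The specific rules (writable and executable sections, raw data past the end of the file, overlay bytes) are my own examples for this post and not necessarily the ones used in the submitted solution. `build_toy_pe` is a hypothetical helper that assembles a PE-shaped, non-runnable buffer for the demo.

```python
import struct

SECTION_FMT = "<8sIIIIIIHHI"      # IMAGE_SECTION_HEADER layout, 40 bytes
SCN_MEM_EXECUTE = 0x20000000
SCN_MEM_READ = 0x40000000
SCN_MEM_WRITE = 0x80000000


def parse_sections(data):
    """Return a list of (name, raw_size, raw_ptr, characteristics)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)          # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (n_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)
    table = pe_off + 24 + opt_size    # signature + COFF header + optional header
    sections = []
    for i in range(n_sections):
        f = struct.unpack_from(SECTION_FMT, data, table + 40 * i)
        sections.append((f[0].rstrip(b"\0"), f[3], f[4], f[9]))
    return sections


def anomalies(data):
    """Apply a few illustrative rules and return human-readable findings."""
    findings = []
    sections = parse_sections(data)
    end_of_sections = 0
    for name, raw_size, raw_ptr, flags in sections:
        if flags & SCN_MEM_EXECUTE and flags & SCN_MEM_WRITE:
            findings.append(f"{name!r}: writable and executable")
        if raw_ptr + raw_size > len(data):
            findings.append(f"{name!r}: raw data past end of file")
        end_of_sections = max(end_of_sections, raw_ptr + raw_size)
    if sections and len(data) > end_of_sections:
        findings.append(f"overlay: {len(data) - end_of_sections} bytes")
    return findings


def build_toy_pe(flags, overlay=b""):
    """Hypothetical helper: minimal PE-shaped buffer with one section."""
    dos = b"MZ" + b"\0" * 58 + struct.pack("<I", 64)          # e_lfanew = 64
    coff = struct.pack("<HHIIIHH", 0x14C, 1, 0, 0, 0, 0, 0x102)
    raw_ptr = 64 + 4 + 20 + 40        # DOS stub + signature + COFF + 1 section
    sect = struct.pack(SECTION_FMT, b".text", 16, 0x1000, 16, raw_ptr,
                       0, 0, 0, 0, flags)
    return dos + b"PE\0\0" + coff + sect + b"\x90" * 16 + overlay


plain = build_toy_pe(SCN_MEM_EXECUTE | SCN_MEM_READ)
odd = build_toy_pe(SCN_MEM_EXECUTE | SCN_MEM_WRITE, overlay=b"A" * 32)
print(anomalies(plain))
print(anomalies(odd))
```

Rules like these are cheap to evaluate and target exactly the places where appended or injected bytes tend to end up.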

Third, a stateful defense would compare each submitted file with previously convicted files submitted to the system. This not only detects adversarial variants but also forces attackers to produce substantially different files that still need to run successfully in the sandbox. The comparison method purposely includes only areas which are a priori harder to modify without a packer/dropper approach, such as data and code from the first PE section.
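The idea can be sketched in a few lines. The similarity measure here (Jaccard overlap of byte 4-grams), the 0.5 threshold and the `StatefulDetector` name are assumptions chosen for the example; the text above does not prescribe a particular comparison method. The caller is assumed to have already extracted the first-section bytes.

```python
class StatefulDetector:
    """Wraps a base model and remembers the files it has convicted."""

    def __init__(self, base_model, threshold=0.5, ngram=4):
        self.base_model = base_model      # callable: bytes -> bool (True = malware)
        self.threshold = threshold        # assumed similarity cut-off
        self.ngram = ngram
        self.history = []                 # fingerprints of convicted submissions

    def _fingerprint(self, region):
        n = self.ngram
        return frozenset(region[i:i + n] for i in range(len(region) - n + 1))

    @staticmethod
    def _jaccard(a, b):
        union = len(a | b)
        return len(a & b) / union if union else 0.0

    def predict(self, file_bytes, first_section):
        fp = self._fingerprint(first_section)
        near_known = any(self._jaccard(fp, old) >= self.threshold
                         for old in self.history)
        verdict = near_known or self.base_model(file_bytes)
        if verdict:
            self.history.append(fp)       # convicted files feed the memory
        return verdict


# Toy base model: flags files containing a marker string.
detector = StatefulDetector(lambda data: b"EVIL" in data)

code = bytes(range(64))                   # stands in for the first PE section
other_code = bytes(range(128, 192))

first = detector.predict(code + b"EVIL", code)            # caught by the base model
variant = detector.predict(code + b"padding", code)       # marker gone, same code
unrelated = detector.predict(other_code + b"data", other_code)
print(first, variant, unrelated)
```

The variant evades the base model but is still convicted because its first section matches a previously convicted file.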

Finally, a moving target defense composed of an ensemble of non-deterministic classifiers that would slightly modify their decision function on each run was placed in order to further confuse attackers. This layer would concede evasions in a non-consistent way, providing contradictory clues about the evasion success (or failure).
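A minimal sketch of this idea, assuming the randomness comes from sampling a subset of ensemble members and jittering the threshold on every query. Both mechanisms, the `MovingTargetEnsemble` name and the toy models are illustrative assumptions, not a description of the submitted system.

```python
import random


class MovingTargetEnsemble:
    """Ensemble whose membership and cut-off change slightly on each query."""

    def __init__(self, models, threshold=0.5, jitter=0.1, rng=None):
        self.models = models              # callables: features -> score in [0, 1]
        self.threshold = threshold
        self.jitter = jitter
        self.rng = rng or random.Random()

    def predict(self, features):
        n = len(self.models)
        k = self.rng.randint(max(1, n - 1), n)        # drop at most one member
        members = self.rng.sample(self.models, k)
        score = sum(m(features) for m in members) / k
        cutoff = self.threshold + self.rng.uniform(-self.jitter, self.jitter)
        return score >= cutoff


# Toy models that slightly disagree with each other on a 1-D "feature".
models = [
    lambda x: x,
    lambda x: min(1.0, x + 0.05),
    lambda x: max(0.0, x - 0.05),
]
ensemble = MovingTargetEnsemble(models, rng=random.Random(0))

clearly_bad = {ensemble.predict(0.95) for _ in range(200)}
clearly_good = {ensemble.predict(0.05) for _ in range(200)}
borderline = {ensemble.predict(0.50) for _ in range(200)}
print(clearly_bad, clearly_good, borderline)
```

Clear-cut inputs get a stable verdict, while samples near the decision boundary receive inconsistent answers, which makes query-based probing less informative.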

These together did relatively well and placed me in a second "in-the-money" position, very close to the winning approach.

There were some other "fun" restrictions, like having to pack the defender as a Docker image within a certain file size and keeping latency low when analyzing files and providing the verdict. In order to do that, the Docker base image had to be stripped to the max by deleting certain unused libraries and packages by hand, leaving the OS unstable in some cases...

Offensive anti-malware track

There were 6 defender systems unknown to the participants. However, considering that there are not many publicly available large-scale malware datasets besides EMBER and SOREL, it wasn't unreasonable to believe that attacks that successfully evade baseline models trained on these datasets should in principle have better chances of evasion, or at the very least offer a better starting point than targeting the defenders directly with black-box attacks.

Therefore the chosen strategy was to perform a mix of white-box and black-box attacks against EMBER and SOREL baselines while also extracting knowledge from the evaded defender systems in order to manually fine-tune further attacks in an iterative way.

Black box:

Counterfit
Greedy byte manipulations
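For readers unfamiliar with the greedy approach, the loop below shows its general shape against a deliberately simplistic stand-in scorer. `toy_score`, the candidate chunks and the query budget are all hypothetical; a real black-box target only returns a verdict or score per query, and nothing here checks that a modified file would still execute.

```python
def toy_score(data):
    """Stand-in for a black-box model: share of bytes outside printable ASCII."""
    return sum(1 for b in data if b < 0x20 or b > 0x7E) / len(data)


def greedy_append(sample, candidates, score, threshold, max_queries):
    """Repeatedly append whichever candidate chunk lowers the score the most."""
    current, best = sample, score(sample)
    queries = 1
    while best >= threshold and queries + len(candidates) <= max_queries:
        trials = [(score(current + c), c) for c in candidates]
        queries += len(candidates)
        trial_score, chunk = min(trials, key=lambda t: t[0])
        if trial_score >= best:
            break                         # no candidate helps any more
        current, best = current + chunk, trial_score
    return current, best, queries


sample = bytes([0x90]) * 100              # toy "malicious" buffer, score 1.0
candidates = [b"\x00" * 50, b"benign-looking text " * 5, b"\xff" * 50]

modified, final_score, used = greedy_append(
    sample, candidates, toy_score, threshold=0.5, max_queries=20)
print(len(modified), round(final_score, 3), used)
```

Every round costs one query per candidate, which is why a slow feedback loop and a tight query budget make this kind of search expensive in practice.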

White box:

Feature importance of baseline SOREL and EMBER models
LIME and SHAP analysis of baseline SOREL and EMBER models
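To show what a SHAP-style attribution provides, here is a from-scratch computation of exact Shapley values for a tiny toy model. It does not use the shap library, and the model, its three features and the all-zero baseline are invented for the example. Exact enumeration is only feasible for a handful of features, which is why practical tools rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) per feature."""
    n = len(x)

    def value(subset):
        # Features outside the coalition are replaced by their baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical features: [section_entropy, import_count, overlay_ratio]
def toy_model(f):
    return 0.5 * f[0] - 0.2 * f[1] + 0.3 * f[0] * f[2]


x = [1.0, 0.5, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print([round(p, 3) for p in phi])
```

Features with the largest positive attribution push the score towards "malware" the most, so they are natural first candidates when deciding what to modify in a sample.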

Other approaches such as PE header fuzzing yielded few but interesting results, such as highlighting the fact that the EMBER feature set makes use of PE header fields ignored by the Windows loader, which opens the door to more interesting adversarial modifications by purposely generating apparently broken PEs which would still run fine in the sandbox. A few more were identified but are left as an exercise to the reader ;).

While droppers were not allowed in this edition, crypters/packers were. I tried a few and had moderate success with some malware binaries. The main caveat is that when the evasion fails, the packer stub risks being fingerprinted by stateful defenses. Likewise, several malware samples were either loaders themselves or had some sort of integrity checks. These were probably cherry-picked by the organizers to limit packer effectiveness after their success in previous editions.

Overall, generating non-detected variants was relatively easy; the main challenge was generating variants that would both evade detection and execute successfully in the organizers' sandbox. I chose not to test the modified malware binaries locally before submission, which forced me to use extra API calls (almost 6 times more than the winning team).

I was ranked second on this track as well, with just 29 fewer evasions than the winning team.

Overall, the competition was fun and the experimental results quite interesting, but there are some things the organizers may want to address in future editions. For example, the defenders' FP rate was enforced considering only clean binaries from a Windows system. This caused some controversy, as it was possible to submit a defender that would classify only that particular set as clean and everything else as malware.
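The loophole is easy to picture with a toy example. The `AllowlistDefender` below is a hypothetical degenerate submission, written only to illustrate the problem: it memorizes the hashes of the clean evaluation files and convicts everything else.

```python
import hashlib


class AllowlistDefender:
    """Degenerate 'model': anything not in the known clean set is malware."""

    def __init__(self, known_clean):
        self.allowed = {hashlib.sha256(f).hexdigest() for f in known_clean}

    def predict(self, data):
        return hashlib.sha256(data).hexdigest() not in self.allowed


# Stand-ins for the clean files used to measure the FP rate.
eval_clean = [b"notepad-bytes", b"calc-bytes", b"explorer-bytes"]
defender = AllowlistDefender(eval_clean)

fp_on_eval = sum(defender.predict(f) for f in eval_clean) / len(eval_clean)
flags_unseen_clean = defender.predict(b"some other legitimate installer")
flags_malware = defender.predict(b"any adversarial variant")
print(fp_on_eval, flags_unseen_clean, flags_malware)
```

Such a defender shows a perfect FP rate on the evaluation set and cannot be evaded, yet it would be useless on any clean file it has not seen before.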

References

amsqr at MLSEC-2021: Thwarting Adversarial Malware Evasion with a Defense-in-Depth

https://secret.inf.ufpr.br/2021/09/29/adversarial-machine-learning-malware-detection-and-the-2021s-mlsec-competition/

https://www.slideshare.net/MarcusBotacin/all-you-need-to-know-to-win-a-cybersecurity-adversarial-machine-learning-competition-and-make-malware-attacks-practical

Kipple: Towards accessible, robust malware classification

CC10 - Building & Defending a Machine Learning Malware Classifier: Taking 3rd at MLSEC 2021

Alejandro Mosquera | Blog

Thursday, March 17, 2022