Is Robust Machine Learning Possible?

Machine learning has shown remarkable success in solving complex classification problems, but the models produced by current techniques are vulnerable to adversaries who deliberately try to confuse them, especially when those models are used in security applications such as malware classification.

The key assumption of machine learning is that a trained model will perform well in deployment because its training data is representative of the data the classifier will see once deployed.

When machine learning classifiers are used in security applications, however, adversaries may be able to generate samples that exploit the invalidity of this assumption.

Our project is focused on understanding, evaluating, and improving the effectiveness of machine learning methods in the presence of motivated and sophisticated adversaries.

Projects

Genetic Search
Evolutionary framework to automatically find variants that preserve malicious behavior but evade a target classifier.
Feature Squeezing
Reducing the search space for adversaries by coalescing inputs.
(The top row shows L0 adversarial examples, squeezed by median smoothing.)
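
As a rough illustration of the feature squeezing idea, the sketch below compares a model's prediction on an input with its predictions on "squeezed" versions of that input (reduced bit depth and median smoothing) and flags the input when the predictions disagree by more than a threshold. This is a minimal sketch, not the EvadeML-Zoo implementation: the names toy_model, reduce_bit_depth, median_smooth, and is_adversarial, the 3x3 window, the 4-bit depth, and the threshold of 1.0 are all assumptions chosen for the example, and toy_model is a hypothetical stand-in for a real classifier.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def reduce_bit_depth(x, bits=4):
    """Quantize pixel values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels


def median_smooth(x, k=3):
    """Replace each pixel of a 2-D image with the median of its k x k window."""
    pad = k // 2
    padded = np.pad(x, pad, mode="reflect")
    windows = sliding_window_view(padded, (k, k))  # shape (H, W, k, k)
    return np.median(windows, axis=(-2, -1))


def toy_model(x):
    """Hypothetical stand-in classifier (assumption for this example only).

    Returns a two-class probability vector driven by the brightest pixel,
    so a single modified pixel is enough to flip its prediction.
    """
    z = 10.0 * (x.max() - 0.5)
    p1 = 1.0 / (1.0 + np.exp(-z))
    return np.array([1.0 - p1, p1])


def is_adversarial(model, x, threshold=1.0):
    """Flag x if any squeezed copy changes the prediction by more than threshold.

    The score is the largest L1 distance between the prediction on x and
    the prediction on each squeezed version of x.
    """
    squeezers = [reduce_bit_depth, median_smooth]
    original = model(x)
    score = max(np.abs(original - model(s(x))).sum() for s in squeezers)
    return bool(score > threshold)


# Illustrative inputs: a flat image, and the same image with one pixel changed
# (a crude stand-in for an L0 perturbation).
clean = np.full((8, 8), 0.2)
perturbed = clean.copy()
perturbed[3, 3] = 1.0

print("clean flagged:", is_adversarial(toy_model, clean))
print("perturbed flagged:", is_adversarial(toy_model, perturbed))
```

Median smoothing removes the isolated modified pixel, so the prediction on the squeezed input differs sharply from the prediction on the perturbed input, while the clean input is nearly unchanged by either squeezer. In practice the threshold would be chosen on held-out legitimate inputs rather than fixed by hand.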

Papers

Weilin Xu, David Evans, Yanjun Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. 2018 Network and Distributed System Security Symposium. 18-21 February, San Diego, California. Full paper (15 pages): [PDF]

Weilin Xu, Yanjun Qi, and David Evans. Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers. Network and Distributed System Security Symposium 2016, 21-24 February 2016, San Diego, California. Full paper (15 pages): [PDF]

More Papers…

Talks



David Evans’ keynote talk at the 1st Deep Learning and Security Workshop (co-located with the 39th IEEE Symposium on Security and Privacy). San Francisco, California. 24 May 2018. [Speaker Deck]


Weilin Xu’s talk at Network and Distributed System Security Symposium 2018. San Diego, CA. 21 February 2018.


David Evans’ talk at Berkeley ICSI, 8 June 2017.


David Evans’ talk at USENIX Enigma 2017, Oakland, CA, 1 February 2017. [Speaker Deck]

More Talks…

Code

EvadeML-Zoo: https://github.com/mzweilin/EvadeML-Zoo

Genetic Evasion: https://github.com/uvasrg/EvadeML (Weilin Xu)

Feature Squeezing: https://github.com/uvasrg/FeatureSqueezing (Weilin Xu) (superseded by the EvadeML-Zoo toolkit)

Adversarial Learning Playground: https://github.com/QData/AdversarialDNN-Playground (Andrew Norton) (mostly superseded by the EvadeML-Zoo toolkit)

Team

Weilin Xu (Lead PhD Student, leading work on Feature Squeezing and Genetic Evasion)
Mainuddin Ahmad Jonas (PhD student, working on adversarial examples)
Fnu Suya (PhD student, working on batch attacks)
Xiao Zhang (PhD student, working on cost-sensitive adversarial robustness)

Yuancheng Lin (Undergraduate researcher working on adversarial examples, since summer 2018)
Helen Simecek (Undergraduate researcher working on Genetic Evasion, since 2017)
Matthew Wallace (Undergraduate researcher working on natural language deception, since summer 2018)

David Evans (Faculty Co-Advisor)
Yanjun Qi (Faculty Co-Advisor for Weilin Xu)
Yuan Tian (Faculty Co-Advisor for Fnu Suya)

Alumni

Johannes Johnson (Undergraduate researcher, worked on malware classification and evasion, summer 2018)
Anant Kharkar (Undergraduate researcher, worked on Genetic Evasion, 2016-2018)
Noah Kim (Undergraduate researcher, worked on EvadeML-Zoo, 2017)
Felix Park (Undergraduate researcher, worked on color-aware preprocessors, 2017-2018)