Dissertation Defense

Designing and Evaluating Physical Adversarial Attacks and Defenses for Machine Learning Algorithms

Kevin Eykholt

Studies show that state-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples: inputs to which small-magnitude perturbations have been added in a calculated fashion to induce mistakes in the network's output. However, the impact of adversarial attacks in the physical world has not been studied. In this dissertation, we first explore the technical requirements of generating physical adversarial inputs through the manipulation of physical objects and, based on our analysis, design a new adversarial attack algorithm, Robust Physical Perturbations (RPP). We then develop a defensive technique, Robust Feature Augmentation, to mitigate the effect of adversarial inputs. We hypothesize that the input to a machine learning algorithm contains predictive feature information that a bounded adversary is unable to manipulate in order to cause classification errors. By extracting this adversarially robust feature information, we can obtain evidence of the possible set of output labels and correct the classification decision accordingly. Because autonomous driving is safety-critical, we use traffic sign classification and localization tasks to demonstrate the success of our attack and defense.

Sponsored by

Prof. Atul Prakash