Title: Novel Examination of Interpretable Surrogates and Adversarial Robustness in Machine Learning
Contributors: Urner, Ruth; Chowdhury, Sadia
Date issued: 2021-02
Date available: 2021-07-06
URI: http://hdl.handle.net/10315/38437
Type: Electronic Thesis or Dissertation
Subject: Computer science
Keywords: Machine Learning; Interpretability; Adversarial Examples; Robustness; Decision Trees; Neural Networks; Binary Loss; Robust Loss
Rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.

Abstract: The lack of transparent output behavior is a significant source of mistrust in many of the currently most successful machine learning tools. Concern arises particularly in situations where the data generation changes, for example under marginal shift or under adversarial manipulations. We analyze the use of decision trees (a human-interpretable model) for indicating marginal shift. We then investigate the role of the data generation for the validity of the interpretable surrogate and its implementation as both local and global interpretation methods. We frequently observed that the decision boundaries of the black-box model lie close to the original data manifold, which makes those regions vulnerable to imperceptible perturbations. Hence, we argue that adversarial robustness should be defined as a locally adaptive measure that complies with the underlying distribution. We then propose a definition of an adaptive robust loss, an empirical version of it, and a resulting data-augmentation framework.
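
Note: To illustrate the global-surrogate idea referred to in the abstract, the following is a minimal sketch, not the thesis's implementation. It fits a shallow decision tree to mimic a black-box classifier's predictions; the black-box model (a random forest), the synthetic dataset, and the fidelity measure are assumptions made for the example only.

# Minimal sketch of a global interpretable surrogate: a decision tree
# trained to mimic a black-box classifier's predictions.
# Assumptions: scikit-learn is available; the black box is a random forest
# and the data are synthetic -- neither is taken from the thesis itself.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

# Synthetic binary-classification data standing in for the real distribution.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Black-box" model whose behavior we want to explain.
black_box = RandomForestClassifier(n_estimators=200, random_state=0)
black_box.fit(X_train, y_train)

# Global surrogate: a shallow decision tree fit to the black box's
# *predictions* rather than the true labels, so it approximates the
# model's behavior instead of the data-generating process.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: how often the surrogate agrees with the black box on held-out data.
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"surrogate fidelity to black box: {fidelity:.3f}")

# The tree itself is human-readable.
print(export_text(surrogate, max_depth=2))

The fidelity score (agreement between surrogate and black box on held-out data) is one common way to judge whether such an interpretable surrogate is a valid stand-in for the model it explains; how that validity depends on the data generation is the question the abstract raises.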