Urner, Ruth
Wyke, Chester Samuel
2024-11-07
2024-11-07
2024-08-16
2024-11-07
https://hdl.handle.net/10315/42476
A major challenge for both theoretical treatment and practical application of unsupervised learning tasks, such as clustering, anomaly detection or generative modeling, is the inherent lack of quantifiable objectives. Choosing methods and evaluating outcomes is then often a matter of ad-hoc heuristics or personal taste. Anomaly detection is often employed as a preprocessing step to other learning tasks, and unsound decisions for this task may thus have far-reaching consequences. In this work, we propose an axiomatic framework for analyzing behaviours of anomaly detection methods. We propose a basic set of desirable properties (or axioms) for distance-based anomaly detection methods and identify dependencies and (in-)consistencies between subsets of these. In addition, we include empirical results, which demonstrate the benefits of this axiomatic perspective on behaviours of anomaly detection methods. Our experiments illustrate how some commonly employed algorithms violate, perhaps unexpectedly, a basic desirable property. Namely, we highlight a material problem with a commonly used method called Isolation Forest, related to infinite bands of space likely to be labelled as inliers that extend infinitely far away from the training data. Additionally, we experimentally demonstrate that another common method, Local Outlier Factor, is vulnerable to adversarial data poisoning. To conduct these experimental evaluations, a tool for dataset generation, experimentation and visualization was built, which is an additional contribution of this work.
Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
Computer science
An Axiomatic Perspective on Anomaly Detection
Electronic Thesis or Dissertation
2024-11-07
Anomaly detection; Outlier detection; Axioms; Theory; Unsupervised learning; Desirable properties