Refining the sample complexity of comparative learning
Abstract
The PAC (Probably Approximately Correct) framework is a well-established theoretical framework for analyzing the statistical (and sometimes computational) complexity of machine learning tasks. Comparative learning is a recently introduced variation of the PAC framework that interpolates between the two standard extreme settings of realizable and agnostic PAC learning. In comparative learning, the labels are assumed to come from one hypothesis class (the source), while the learner's performance is measured against another hypothesis class (the benchmark). This setup allows more specific prior knowledge to be incorporated into PAC-type learning bounds, which are otherwise known to be overly pessimistic. In this work we study the sample complexity of a variation of this setting that we call proper comparative learning, in which the learning algorithm is required to output a hypothesis from the benchmark class. This setting captures model distillation tasks, where a predictor satisfying specific requirements (e.g., interpretability) is trained on the labels produced by another model.