A Characterization of Multiclass Learnability

Cited by: 10
Authors
Brukhim, Nataly [1]
Carmon, Daniel [2]
Dinur, Irit [3]
Moran, Shay [2, 4]
Yehudayoff, Amir [2]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[2] Technion, Dept Math, Haifa, Israel
[3] Weizmann Inst Sci, Dept Comp Sci, Rehovot, Israel
[4] Technion, Dept Comp Sci, Haifa, Israel
DOI
10.1109/FOCS54457.2022.00093
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
A seminal result in learning theory characterizes the PAC learnability of binary classes through the Vapnik-Chervonenkis dimension. Extending this characterization to the general multiclass setting has been open since the pioneering works on multiclass PAC learning in the late 1980s. This work resolves this problem: we characterize multiclass PAC learnability through the DS dimension, a combinatorial dimension defined by Daniely and Shalev-Shwartz (2014). The classical characterization of the binary case boils down to empirical risk minimization. In contrast, our characterization of the multiclass case involves a variety of algorithmic ideas; these include a natural setting we call list PAC learning. In the list learning setting, instead of predicting a single outcome for a given unseen input, the goal is to provide a short menu of predictions. Our second main result concerns the Natarajan dimension, which has been a central candidate for characterizing multiclass learnability. This dimension was introduced by Natarajan (1988) as a barrier for PAC learning. He further showed that it is the only barrier, provided that the number of labels is bounded. Whether the Natarajan dimension characterizes PAC learnability in general has been posed as an open question in several papers since. This work provides a negative answer: we construct a non-learnable class with Natarajan dimension 1. For the construction, we identify a fundamental connection between concept classes and topology (i.e., colorful simplicial complexes). We crucially rely on a deep and involved construction of hyperbolic pseudo-manifolds by Januszkiewicz and Świątkowski. It is interesting that hyperbolicity is directly related to learning problems that are difficult to solve although no obvious barriers exist. This is another demonstration of the fruitful links machine learning has with different areas in mathematics.
Pages: 943-955
Number of pages: 13