On semi-supervised learning

Cited by: 2
Authors
Cholaquidis, A. [1 ]
Fraiman, R. [1 ]
Sued, M. [2 ]
Affiliations
[1] Univ Republica, Fac Ciencias, Montevideo, Uruguay
[2] Inst Calculo, Fac Ciencias Exactas & Nat, Buenos Aires, DF, Argentina
Keywords
Semi-supervised learning; Small training sample; Consistency; PATTERN-RECOGNITION; ERROR;
DOI
10.1007/s11749-019-00690-2
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Major efforts have been made, mostly in the machine learning literature, to construct good predictors combining unlabelled and labelled data. These methods are known as semi-supervised. They deal with the problem of how to take advantage, if possible, of a huge amount of unlabelled data to perform classification in situations where there are few labelled data. This is not always feasible: it depends on whether the labels can be inferred from the distribution of the unlabelled data. Nevertheless, several algorithms have been proposed recently. In this work, we present a new method that, under almost necessary conditions, asymptotically attains the performance of the best theoretical rule when the size of the unlabelled sample goes to infinity, even if the size of the labelled sample remains fixed. Its performance and computational time are assessed through simulations and on the well-known "Isolet" real data set of phonemes, where a strong dependence on the choice of the initial training sample is shown. The main focus of this work is to elucidate when and why semi-supervised learning works in the asymptotic regime described above. The set of necessary assumptions, although reasonable, shows that semi-parametric methods only attain consistency for very well-conditioned problems.
Pages: 914 - 937
Page count: 24
Related papers
50 in total
  • [31] Semi-supervised Learning with Multimodal Perturbation
    Su, Lei
    Liao, Hongzhi
    Yu, Zhengtao
    Tang, Jiahua
    ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 1, PROCEEDINGS, 2009, 5551 : 651 - +
  • [32] Semi-Supervised Learning for ECG Classification
    Rodrigues, Rui
    Couto, Paula
    2021 COMPUTING IN CARDIOLOGY (CINC), 2021,
  • [33] Negative sampling in semi-supervised learning
    Chen, John
    Shah, Vatsal
    Kyrillidis, Anastasios
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [34] Semi-Supervised Learning for Video Captioning
    Lin, Ke
    Gan, Zhuoxin
    Wang, Liwei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1096 - 1106
  • [35] Semi-Supervised Learning by Gaussian Mixtures
    Choi, Byoung-Jeong
    Chae, Youn-Seok
    Choi, Woo-Young
    Park, Changyi
    Koo, Ja-Yong
    KOREAN JOURNAL OF APPLIED STATISTICS, 2008, 21 (05) : 825 - 833
  • [36] Semi-supervised learning by sparse representation
    Yan, Shuicheng
    Wang, Huan
    Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics, 2009, 2 : 788 - 797
  • [37] Semi-supervised Preference Learning Algorithm
    Zhao M.
    Liu J.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2019, 32 (10) : 909 - 916
  • [38] Quantum semi-supervised kernel learning
    Seyran Saeedi
    Aliakbar Panahi
    Tom Arodz
    Quantum Machine Intelligence, 2021, 3
  • [39] Information mining with semi-supervised learning
    Klose, A
    Kruse, R
    SOFT METHODOLOGY AND RANDOM INFORMATION SYSTEMS, 2004, : 67 - 74
  • [40] A Theoretical Analysis of Semi-supervised Learning
    Fujii, Takashi
    Ito, Hidetaka
    Miyoshi, Seiji
    NEURAL INFORMATION PROCESSING, ICONIP 2016, PT II, 2016, 9948 : 28 - 36