Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

Cited by: 2
Authors
Zhu, Zhangchi [1 ,2 ]
Wang, Lu [2 ]
Zhao, Pu [2 ]
Du, Chao [2 ]
Zhang, Wei [1 ]
Dong, Hang [2 ]
Qiao, Bo [2 ]
Lin, Qingwei [2 ]
Rajmohan, Saravan [3 ]
Zhang, Dongmei [2 ]
Affiliations
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
[3] Microsoft 365, Seattle, WA USA
Funding
National Natural Science Foundation of China;
Keywords
positive-unlabeled learning; curriculum learning;
DOI
10.1145/3580305.3599491
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature and has attracted much attention in recent years. One common approach in PU learning is to sample a set of pseudo-negatives from the unlabeled data using ad-hoc thresholds so that conventional supervised methods can be applied with both positive and negative samples. Owing to the label uncertainty among the unlabeled data, errors from misclassifying unlabeled positive samples as negatives inevitably appear and may even accumulate during training. These errors often lead to performance degradation and model instability. To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first. Similar intuition has been used in curriculum learning, which restricts training to easier cases in the early stages before introducing more complex ones. Specifically, we utilize a novel "hardness" measure to distinguish unlabeled samples with a high chance of being negative from unlabeled samples with large label noise. An iterative training strategy is then implemented to fine-tune the selection of negative samples during training, including more "easy" samples in the early stages. Extensive experimental validation over a wide range of learning tasks shows that this approach can effectively improve the accuracy and stability of learning with positive and unlabeled data. Our code is available at https://github.com/woriazzc/Robust-PU.
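The abstract outlines the general recipe: score unlabeled samples by a "hardness" measure and admit only the easiest pseudo-negatives in the early training rounds, gradually adding harder ones. Below is a minimal, self-contained sketch of that curriculum idea on toy data. The logistic-regression model, the use of predicted positive probability as the hardness proxy, and the linear admission schedule are illustrative assumptions for this sketch, not the authors' actual measure or algorithm (see the linked repository for the real method).

```python
# Illustrative sketch only: curriculum-style selection of pseudo-negatives
# for PU learning. Hardness proxy, thresholds, and schedule are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy PU data: two 2-D Gaussians; only some positives carry a label.
n_pos, n_neg = 200, 200
X_pos = rng.normal(loc=+1.0, scale=1.0, size=(n_pos, 2))
X_neg = rng.normal(loc=-1.0, scale=1.0, size=(n_neg, 2))
labeled = rng.random(n_pos) < 0.3                  # 30% of positives are labeled
X_labeled_pos = X_pos[labeled]
X_unlabeled = np.vstack([X_pos[~labeled], X_neg])  # hidden positives + negatives

clf = LogisticRegression()
n_rounds = 5
for t in range(n_rounds):
    if t == 0:
        # Round 0: naive baseline, treat every unlabeled sample as negative.
        pseudo_neg = X_unlabeled
    else:
        # "Hardness" proxy: predicted probability of being positive.
        hardness = clf.predict_proba(X_unlabeled)[:, 1]
        # Curriculum: admit a growing fraction of the easiest samples per round.
        frac = 0.4 + 0.6 * t / (n_rounds - 1)
        k = int(frac * len(X_unlabeled))
        easiest = np.argsort(hardness)[:k]
        pseudo_neg = X_unlabeled[easiest]
    X_train = np.vstack([X_labeled_pos, pseudo_neg])
    y_train = np.concatenate([np.ones(len(X_labeled_pos)), np.zeros(len(pseudo_neg))])
    clf.fit(X_train, y_train)

# Evaluation on fully labeled toy data (possible only because it is synthetic).
X_test = np.vstack([X_pos, X_neg])
y_test = np.concatenate([np.ones(n_pos), np.zeros(n_neg)])
print("toy accuracy:", clf.score(X_test, y_test))
```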
Pages: 3663-3673
Number of pages: 11
Related papers (50 in total; items [21]-[30] shown)
  • [21] Positive-Unlabeled Learning from Imbalanced Data. Su, Guangxin; Chen, Weitong; Xu, Miao. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021), 2021: 2995-3001.
  • [22] Ensembles of density estimators for positive-unlabeled learning. Basile, T. M. A.; Di Mauro, N.; Esposito, F.; Ferilli, S.; Vergari, A. Journal of Intelligent Information Systems, 2019, 53(2): 199-217.
  • [23] Improving Non-Negative Positive-Unlabeled Learning for News Headline Classification. Ji, Zhanlin; Du, Chengyuan; Jiang, Jiawen; Zhao, Li; Zhang, Haiyang; Ganchev, Ivan. IEEE Access, 2023, 11: 40192-40203.
  • [25] Positive-unlabeled learning for disease gene identification. Yang, Peng; Li, Xiao-Li; Mei, Jian-Ping; Kwoh, Chee-Keong; Ng, See-Kiong. Bioinformatics, 2012, 28(20): 2640-2647.
  • [26] Positive-Unlabeled Learning in the Face of Labeling Bias. Youngs, Noah; Shasha, Dennis; Bonneau, Richard. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015: 639-645.
  • [27] Positive-Unlabeled Learning for Pupylation Sites Prediction. Jiang, Ming; Cao, Jun-Zhe. BioMed Research International, 2016, 2016.
  • [28] Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation. Akujuobi, Uchenna; Chen, Jun; Elhoseiny, Mohamed; Spranger, Michael; Zhang, Xiangliang. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33.
  • [29] Construction of Fatigue Criteria Through Positive-Unlabeled Learning. Coudray, Olivier; Bristiel, Philippe; Dinis, Miguel; Keribin, Christine; Pamphile, Patrick. Fatigue & Fracture of Engineering Materials & Structures, 2025, 48(1): 101-117.
  • [30] Positive-unlabeled learning for open set domain adaptation. Loghmani, Mohammad Reza; Vincze, Markus; Tommasi, Tatiana. Pattern Recognition Letters, 2020, 136: 198-204.