Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

Cited by: 2
Authors
Zhu, Zhangchi [1 ,2 ]
Wang, Lu [2 ]
Zhao, Pu [2 ]
Du, Chao [2 ]
Zhang, Wei [1 ]
Dong, Hang [2 ]
Qiao, Bo [2 ]
Lin, Qingwei [2 ]
Rajmohan, Saravan [3 ]
Zhang, Dongmei [2 ]
Affiliations
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
[3] Microsoft 365, Seattle, WA USA
Funding
National Natural Science Foundation of China;
Keywords
positive-unlabeled learning; curriculum learning;
DOI
10.1145/3580305.3599491
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature and has attracted much attention in recent years. One common approach in PU learning is to sample a set of pseudo-negatives from the unlabeled data using ad-hoc thresholds so that conventional supervised methods can be applied with both positive and negative samples. Owing to the label uncertainty among the unlabeled data, errors from misclassifying unlabeled positive samples as negatives inevitably appear and may even accumulate during training. These errors often lead to performance degradation and model instability. To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first. A similar intuition underlies curriculum learning, which uses only easier cases in the early stage of training before introducing more complex ones. Specifically, we utilize a novel "hardness" measure to distinguish unlabeled samples with a high chance of being negative from unlabeled samples with large label noise. An iterative training strategy then refines the selection of negative samples as training proceeds, including more "easy" samples in the early stages. Extensive experimental validation over a wide range of learning tasks shows that this approach effectively improves the accuracy and stability of learning with positive and unlabeled data. Our code is available at https://github.com/woriazzc/Robust-PU.
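The curriculum-style idea in the abstract — treat low-scoring unlabeled samples as "easy" pseudo-negatives and admit harder ones only as training progresses — can be sketched as follows. This is a minimal illustration of the general strategy, not the authors' implementation; the function name, the linear schedule, and the use of predicted positive probability as the hardness proxy are all assumptions for the sake of the example.

```python
import numpy as np


def select_pseudo_negatives(scores, epoch, num_epochs, base_frac=0.2):
    """Pick the 'easiest' unlabeled samples as pseudo-negatives.

    scores: model-predicted probability of being positive for each
            unlabeled sample (lower = more confidently negative = easier).
    The selected fraction grows linearly with training progress, so early
    epochs use only the lowest-score (easiest) samples, and later epochs
    gradually admit harder ones.
    """
    frac = base_frac + (1.0 - base_frac) * epoch / max(1, num_epochs - 1)
    k = max(1, int(frac * len(scores)))
    order = np.argsort(scores)  # ascending: easiest (most negative) first
    return order[:k]            # indices of the chosen pseudo-negatives


# Toy usage: 10 unlabeled samples, early vs. late epoch.
scores = np.array([0.05, 0.9, 0.1, 0.6, 0.2, 0.8, 0.15, 0.4, 0.7, 0.3])
early = select_pseudo_negatives(scores, epoch=0, num_epochs=10)
late = select_pseudo_negatives(scores, epoch=9, num_epochs=10)
# early contains only the two lowest-score samples; late contains all ten.
```

In an actual training loop the scores would be re-computed from the current model each epoch, so the pseudo-negative set is iteratively corrected rather than fixed once by an ad-hoc threshold.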
Pages: 3663-3673 (11 pages)
Related Papers (50 total)
  • [1] PULNS: Positive-Unlabeled Learning with Effective Negative Sample Selector
    Luo, Chuan
    Zhao, Pu
    Chen, Chen
    Qiao, Bo
    Du, Chao
    Zhang, Hongyu
    Wu, Wei
    Cai, Shaowei
    He, Bing
    Rajmohan, Saravanakumar
    Lin, Qingwei
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8784 - 8792
  • [2] ROPU: A robust online positive-unlabeled learning algorithm
    Liang, Xijun
    Zhu, Kaili
    Xiao, An
    Wen, Ya
    Zhang, Kaili
    Wang, Suhang
    Jian, Ling
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [3] Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning
    Niu, Gang
    du Plessis, Marthinus C.
    Sakai, Tomoya
    Ma, Yao
    Sugiyama, Masashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [4] AdaSampling for Positive-Unlabeled and Label Noise Learning With Bioinformatics Applications
    Yang, Pengyi
    Ormerod, John T.
    Liu, Wei
    Ma, Chendong
    Zomaya, Albert Y.
    Yang, Jean Y. H.
    IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (05) : 1932 - 1943
  • [5] Positive-Unlabeled Learning with Non-Negative Risk Estimator
    Kiryo, Ryuichi
    Niu, Gang
    du Plessis, Marthinus C.
    Sugiyama, Masashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [6] GradPU: Positive-Unlabeled Learning via Gradient Penalty and Positive Upweighting
    Dai, Songmin
    Li, Xiaoqiang
    Zhou, Yue
    Ye, Xichen
    Liu, Tong
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023 : 7296+
  • [7] Density Estimators for Positive-Unlabeled Learning
    Basile, Teresa M. A.
    Di Mauro, Nicola
    Esposito, Floriana
    Ferilli, Stefano
    Vergari, Antonio
    NEW FRONTIERS IN MINING COMPLEX PATTERNS, NFMCP 2017, 2018, 10785 : 49 - 64
  • [8] Spotting Fake Reviews via Collective Positive-Unlabeled Learning
    Li, Huayi
    Chen, Zhiyuan
    Liu, Bing
    Wei, Xiaokai
    Shao, Jidong
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 899 - 904
  • [9] Generative Adversarial Positive-Unlabeled Learning
    Hou, Ming
    Chaib-draa, Brahim
    Li, Chao
    Zhao, Qibin
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 2255 - 2261
  • [10] A multi-objective evolutionary algorithm for robust positive-unlabeled learning
    Qiu, Jianfeng
    Tang, Qi
    Tan, Ming
    Li, Kaixuan
    Xie, Juan
    Cai, Xiaoqiang
    Cheng, Fan
    INFORMATION SCIENCES, 2024, 678