Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

Cited by: 2
Authors
Zhu, Zhangchi [1 ,2 ]
Wang, Lu [2 ]
Zhao, Pu [2 ]
Du, Chao [2 ]
Zhang, Wei [1 ]
Dong, Hang [2 ]
Qiao, Bo [2 ]
Lin, Qingwei [2 ]
Rajmohan, Saravan [3 ]
Zhang, Dongmei [2 ]
Affiliations
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
[3] Microsoft 365, Seattle, WA USA
Funding
National Natural Science Foundation of China
Keywords
positive-unlabeled learning; curriculum learning;
DOI
10.1145/3580305.3599491
CLC number
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature and has attracted much attention in recent years. One common approach in PU learning is to sample a set of pseudo-negatives from the unlabeled data using ad hoc thresholds so that conventional supervised methods can be applied with both positive and negative samples. Owing to the label uncertainty among the unlabeled data, errors from misclassifying unlabeled positive samples as negatives inevitably appear and may even accumulate during training. These errors often lead to performance degradation and model instability. To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first. A similar intuition is exploited in curriculum learning, which uses only easier cases in the early stages of training before introducing more complex ones. Specifically, we utilize a novel "hardness" measure to distinguish unlabeled samples with a high chance of being negative from unlabeled samples with large label noise. An iterative training strategy is then implemented to fine-tune the selection of negative samples during training, including more "easy" samples in the early stages. Extensive experimental validation over a wide range of learning tasks shows that this approach can effectively improve the accuracy and stability of learning with positive and unlabeled data. Our code is available at https://github.com/woriazzc/Robust-PU.
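The curriculum idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact method: it assumes the "hardness" of an unlabeled sample is proxied by the model's predicted probability of being positive, and that the fraction of unlabeled samples admitted as pseudo-negatives grows linearly over training (the function name, the linear schedule, and the `init_frac`/`final_frac` parameters are all illustrative assumptions).

```python
import numpy as np

def select_pseudo_negatives(scores, epoch, num_epochs,
                            init_frac=0.3, final_frac=0.8):
    """Curriculum-style pseudo-negative selection (illustrative sketch).

    scores: predicted probability of being positive for each unlabeled
        sample; a low score means the sample is an "easy" negative.
    The admitted fraction grows linearly from init_frac to final_frac
    as training progresses, so early epochs use only easy samples.
    Returns the indices of the selected pseudo-negatives.
    """
    t = epoch / max(num_epochs - 1, 1)          # training progress in [0, 1]
    frac = init_frac + t * (final_frac - init_frac)
    k = max(1, int(frac * len(scores)))
    # Take the k lowest-scoring (easiest) unlabeled samples.
    return np.argsort(scores)[:k]

# Example: 10 unlabeled samples, early vs. late in training.
scores = np.array([0.9, 0.1, 0.8, 0.2, 0.05, 0.7, 0.3, 0.6, 0.15, 0.4])
early = select_pseudo_negatives(scores, epoch=0, num_epochs=10)  # 3 easiest
late = select_pseudo_negatives(scores, epoch=9, num_epochs=10)   # 8 easiest
```

In a full training loop, the selected indices would be relabeled as negatives for that epoch's supervised update and the selection re-run as the model's scores improve.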
Pages: 3663-3673 (11 pages)
Related papers (50 items)
  • [41] Automatic noise reduction of domain-specific bibliographic datasets using positive-unlabeled learning
    Guo Chen
    Jing Chen
    Yu Shao
    Lu Xiao
    Scientometrics, 2023, 128 : 1187 - 1204
  • [42] Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training
    Chen, Xuxi
    Chen, Wuyang
    Chen, Tianlong
    Yuan, Ye
    Gong, Chen
    Chen, Kewei
    Wang, Zhangyang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [44] Principled analytic classifier for positive-unlabeled learning via weighted integral probability metric
    Kwon, Yongchan
    Kim, Wonyoung
    Sugiyama, Masashi
    Paik, Myunghee Cho
    MACHINE LEARNING, 2020, 109 (03) : 513 - 532
  • [45] PUe: Biased Positive-Unlabeled Learning Enhancement by Causal Inference
    Wang, Xutao
    Chen, Hanting
    Guo, Tianyu
    Wang, Yunhe
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [46] Screening drug-target interactions with positive-unlabeled learning
    Lihong Peng
    Wen Zhu
    Bo Liao
    Yu Duan
    Min Chen
    Yi Chen
    Jialiang Yang
Scientific Reports, 2017, 7
  • [47] A Positive-Unlabeled Learning Algorithm for Urban Flood Susceptibility Modeling
    Li, Wenkai
    Liu, Yuanchi
    Liu, Ziyue
    Gao, Zhen
    Huang, Huabing
    Huang, Weijun
    LAND, 2022, 11 (11)
  • [48] Positive-unlabeled learning in bioinformatics and computational biology: a brief review
    Li, Fuyi
    Dong, Shuangyu
    Leier, Andre
    Han, Meiya
    Guo, Xudong
    Xu, Jing
    Wang, Xiaoyu
    Pan, Shirui
    Jia, Cangzhi
    Zhang, Yang
Webb, Geoffrey I.
    Coin, Lachlan J. M.
    Li, Chen
    Song, Jiangning
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
  • [49] Deep Generative Positive-Unlabeled Learning under Selection Bias
    Na, Byeonghu
    Kim, Hyemi
    Song, Kyungwoo
    Joo, Weonyoung
    Kim, Yoon-Yeong
    Moon, Il-Chul
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 1155 - 1164