Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

Cited by: 2
Authors
Zhu, Zhangchi [1 ,2 ]
Wang, Lu [2 ]
Zhao, Pu [2 ]
Du, Chao [2 ]
Zhang, Wei [1 ]
Dong, Hang [2 ]
Qiao, Bo [2 ]
Lin, Qingwei [2 ]
Rajmohan, Saravan [3 ]
Zhang, Dongmei [2 ]
Affiliations
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
[3] Microsoft 365, Seattle, WA USA
Funding
National Natural Science Foundation of China;
Keywords
positive-unlabeled learning; curriculum learning;
DOI
10.1145/3580305.3599491
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature and has attracted much attention in recent years. One common approach in PU learning is to sample a set of pseudo-negatives from the unlabeled data using ad-hoc thresholds so that conventional supervised methods can be applied with both positive and negative samples. Owing to the label uncertainty among the unlabeled data, errors from misclassifying unlabeled positive samples as negatives inevitably appear and may even accumulate during training. These errors often lead to performance degradation and model instability. To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first. A similar intuition underlies curriculum learning, which uses only easier cases in the early stage of training before introducing more complex ones. Specifically, we utilize a novel "hardness" measure to distinguish unlabeled samples with a high chance of being negative from unlabeled samples with large label noise. An iterative training strategy then fine-tunes the selection of negative samples during training, including more "easy" samples in the early stages. Extensive experimental validation over a wide range of learning tasks shows that this approach effectively improves the accuracy and stability of learning with positive and unlabeled data. Our code is available at https://github.com/woriazzc/Robust-PU.
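The paper's specific "hardness" measure is not reproduced in this record. As a minimal illustrative sketch of the curriculum idea the abstract describes (assuming, hypothetically, the model's predicted positive-class probability as a hardness proxy, and a selection fraction that grows linearly over training; the function name is invented for illustration):

```python
import numpy as np

def select_pseudo_negatives(scores, epoch, max_epochs, base_frac=0.2):
    """Curriculum-style selection of pseudo-negatives from unlabeled data.

    scores: predicted positive-class probabilities for the unlabeled
    samples; a lower score is treated as an "easier" negative (this proxy
    is an assumption, not the paper's actual hardness measure).
    The selected fraction grows from base_frac to 1.0 as training
    progresses, so early epochs use only the easiest candidates.
    Returns indices of the selected unlabeled samples.
    """
    frac = base_frac + (1.0 - base_frac) * epoch / max_epochs
    k = max(1, int(frac * len(scores)))
    order = np.argsort(scores)  # easiest (lowest score) first
    return order[:k]
```

In each training round, the selected indices would be relabeled as negatives and the classifier retrained, with the selection recomputed from the updated scores so early mistakes can be corrected.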
Pages: 3663 - 3673
Page count: 11
Related Papers
50 records total
  • [11] Positive-Unlabeled Learning in Streaming Networks
    Chang, Shiyu
    Zhang, Yang
    Tang, Jiliang
    Yin, Dawei
    Chang, Yi
    Hasegawa-Johnson, Mark A.
    Huang, Thomas S.
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 755 - 764
  • [13] Intrusion Detection based on Non-negative Positive-unlabeled Learning
    Lv, Sicai
    Liu, Yang
    Liu, Zhiyao
    Chao, Wang
    Wu, Chenrui
    Wang, Bailing
    PROCEEDINGS OF 2020 IEEE 9TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS'20), 2020, : 1015 - 1020
  • [14] Positive-Unlabeled Learning for Knowledge Distillation
    Jiang, Ning
    Tang, Jialiang
    Yu, Wenxin
    NEURAL PROCESSING LETTERS, 2023, 55 (03) : 2613 - 2631
  • [15] A boosting framework for positive-unlabeled learning
    Zhao, Yawen
    Zhang, Mingzhe
    Zhang, Chenhao
    Chen, Weitong
    Ye, Nan
    Xu, Miao
    STATISTICS AND COMPUTING, 2025, 35 (01)
  • [16] Estimating classification accuracy in positive-unlabeled learning: characterization and correction strategies
    Ramola, Rashika
    Jain, Shantanu
    Radivojac, Predrag
    PACIFIC SYMPOSIUM ON BIOCOMPUTING 2019, 2019, : 124 - 135
  • [17] Entropy Weight Allocation: Positive-unlabeled Learning via Optimal Transport
    Gu, Wen
    Zhang, Teng
    Jin, Hai
    PROCEEDINGS OF THE 2022 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2022, : 37 - 45
  • [18] Positive-Unlabeled Learning With Label Distribution Alignment
    Jiang, Yangbangyan
    Xu, Qianqian
    Zhao, Yunrui
    Yang, Zhiyong
    Wen, Peisong
    Cao, Xiaochun
    Huang, Qingming
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15345 - 15363
  • [19] Positive-Unlabeled Learning for Network Link Prediction
    Gan, Shengfeng
    Alshahrani, Mohammed
    Liu, Shichao
    MATHEMATICS, 2022, 10 (18)
  • [20] Correction to: Semi-supervised AUC optimization based on positive-unlabeled learning
    Sakai, Tomoya
    Niu, Gang
    Sugiyama, Masashi
    MACHINE LEARNING, 2018, 107 : 795 - 795