Recovering True Classifier Performance in Positive-Unlabeled Learning

Authors
Jain, Shantanu [1]
White, Martha [1]
Radivojac, Predrag [1]
Affiliations
[1] Indiana Univ, Dept Comp Sci, Bloomington, IN 47405 USA
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased empirical estimates of the classifier performance. In this work, we show that typically used performance measures, such as the receiver operating characteristic curve or the precision-recall curve obtained on such data, can be corrected given knowledge of the class priors, i.e., the proportions of the positive and negative examples in the unlabeled data. We extend the results to a noisy setting where some of the examples labeled positive are in fact negative, and show that the correction then also requires knowledge of the proportion of noisy examples in the labeled positives. Using state-of-the-art algorithms to estimate the positive class prior and the proportion of noise, we experimentally evaluate two correction approaches and demonstrate their efficacy on real-life data.
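The kind of correction the abstract describes can be illustrated with a minimal sketch. It is not taken from the paper; it only follows from a standard mixture assumption that is stated here as an assumption: the unlabeled set contains a fraction alpha of positives, and the labeled set contains a fraction beta of true positives (beta = 1 in the noise-free case). Under that assumption, the rates measured against labeled-versus-unlabeled data are linear mixtures of the true rates, so they can be inverted when alpha and beta are known. The function names and parameters below are hypothetical.

```python
def correct_pu_rates(tpr_pu, fpr_pu, alpha, beta=1.0):
    """Recover true (TPR, FPR) from rates measured on PU data.

    Assumed mixture model (an illustration, not the paper's code):
      tpr_pu = beta  * tpr + (1 - beta)  * fpr   # measured on labeled examples
      fpr_pu = alpha * tpr + (1 - alpha) * fpr   # measured on unlabeled examples
    alpha: fraction of positives in the unlabeled data (class prior).
    beta:  fraction of true positives among labeled examples (1.0 = no noise).
    """
    if not 0.0 <= alpha < beta <= 1.0:
        raise ValueError("requires 0 <= alpha < beta <= 1")
    denom = beta - alpha
    # Solve the 2x2 linear system above for the true rates.
    tpr = ((1.0 - alpha) * tpr_pu - (1.0 - beta) * fpr_pu) / denom
    fpr = (beta * fpr_pu - alpha * tpr_pu) / denom
    return tpr, fpr


def precision_from_rates(tpr, fpr, alpha):
    """Precision on a population with positive fraction alpha, given true rates."""
    predicted_positive = alpha * tpr + (1.0 - alpha) * fpr
    if predicted_positive == 0.0:
        return float("nan")  # no positive predictions, precision undefined
    return alpha * tpr / predicted_positive


# Noise-free example: the measured FPR is inflated by positives hidden
# in the unlabeled set.
tpr, fpr = correct_pu_rates(tpr_pu=0.8, fpr_pu=0.31, alpha=0.3)
precision = precision_from_rates(tpr, fpr, alpha=0.3)
```

Applying the correction at every decision threshold yields a corrected ROC or precision-recall curve. In practice alpha and beta are themselves estimates, so corrected values can fall slightly outside [0, 1] and may need clipping.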
Pages: 2066-2072
Number of pages: 7
Related papers
50 in total
  • [31] Screening drug-target interactions with positive-unlabeled learning
    Lihong Peng
    Wen Zhu
    Bo Liao
    Yu Duan
    Min Chen
    Yi Chen
    Jialiang Yang
    Scientific Reports, 7
  • [32] A Positive-Unlabeled Learning Algorithm for Urban Flood Susceptibility Modeling
    Li, Wenkai
    Liu, Yuanchi
    Liu, Ziyue
    Gao, Zhen
    Huang, Huabing
    Huang, Weijun
    LAND, 2022, 11 (11)
  • [33] Positive-unlabeled learning in bioinformatics and computational biology: a brief review
    Li, Fuyi
    Dong, Shuangyu
    Leier, Andre
    Han, Meiya
    Guo, Xudong
    Xu, Jing
    Wang, Xiaoyu
    Pan, Shirui
    Jia, Cangzhi
    Zhang, Yang
    Webb, Geoffrey, I
    Coin, Lachlan J. M.
    Li, Chen
    Song, Jiangning
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
  • [34] Deep Generative Positive-Unlabeled Learning under Selection Bias
    Na, Byeonghu
    Kim, Hyemi
    Song, Kyungwoo
    Joo, Weonyoung
    Kim, Yoon-Yeong
    Moon, Il-Chul
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 1155 - 1164
  • [35] Screening drug-target interactions with positive-unlabeled learning
    Peng, Lihong
    Zhu, Wen
    Liao, Bo
    Duan, Yu
    Chen, Min
    Chen, Yi
    Yang, Jialiang
    SCIENTIFIC REPORTS, 2017, 7
  • [36] Spotting Fake Reviews via Collective Positive-Unlabeled Learning
    Li, Huayi
    Chen, Zhiyuan
    Liu, Bing
    Wei, Xiaokai
    Shao, Jidong
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 899 - 904
  • [37] AdaSampling for Positive-Unlabeled and Label Noise Learning With Bioinformatics Applications
    Yang, Pengyi
    Ormerod, John T.
    Liu, Wei
    Ma, Chendong
    Zomaya, Albert Y.
    Yang, Jean Y. H.
    IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (05) : 1932 - 1943
  • [38] Biometric identity recognition based on contrastive positive-unlabeled learning
    Sun, Le
    Hua, Yiwen
    Muhammad, Ghulam
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2024, 83
  • [39] Positive-unlabeled learning for coronary artery segmentation in CCTA images
    Chen, Fei
    Li, Sulei
    Wei, Chen
    Zhang, Yue
    Guo, Kaitai
    Zheng, Yang
    Cao, Feng
    Liang, Jimin
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 87
  • [40] A flexible procedure for mixture proportion estimation in positive-unlabeled learning
    Lin, Zhenfeng
    Long, James P.
    STATISTICAL ANALYSIS AND DATA MINING, 2020, 13 (02) : 178 - 187