Recovering True Classifier Performance in Positive-Unlabeled Learning

Authors
Jain, Shantanu [1]
White, Martha [1]
Radivojac, Predrag [1]
Affiliations
[1] Indiana Univ, Dept Comp Sci, Bloomington, IN 47405 USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased empirical estimates of the classifier performance. In this work, we show that commonly used performance measures obtained on such data, such as the receiver operating characteristic curve or the precision-recall curve, can be corrected with knowledge of the class priors, i.e., the proportions of positive and negative examples in the unlabeled data. We extend the results to a noisy setting where some of the examples labeled positive are in fact negative, and show that the correction then also requires knowledge of the proportion of noisy examples among the labeled positives. Using state-of-the-art algorithms to estimate the positive class prior and the proportion of noise, we experimentally evaluate two correction approaches and demonstrate their efficacy on real-life data.
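To illustrate the kind of correction the abstract describes, the following is a minimal sketch for the noise-free case only. It is not taken from the paper. It relies on the standard mixture identity that the unlabeled data contain a fraction alpha of positives, so the rate at which unlabeled examples are predicted positive equals alpha * TPR + (1 - alpha) * FPR. The function name `correct_pu_rates` and its parameter names are hypothetical, and alpha is assumed to be known or estimated separately.

```python
def correct_pu_rates(tpr_labeled, ppr_unlabeled, alpha):
    """Recover the false positive rate and precision at one threshold.

    Assumptions (illustrative, noise-free setting):
      tpr_labeled   -- fraction of labeled positives predicted positive
      ppr_unlabeled -- fraction of unlabeled examples predicted positive
      alpha         -- proportion of positives in the unlabeled data
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Mixture identity: ppr_unlabeled = alpha * TPR + (1 - alpha) * FPR.
    fpr = (ppr_unlabeled - alpha * tpr_labeled) / (1.0 - alpha)
    # Sampling error can push the estimate slightly outside [0, 1].
    fpr = min(max(fpr, 0.0), 1.0)

    # Precision on the unlabeled distribution: true positives over all
    # predicted positives. Undefined when nothing is predicted positive.
    if ppr_unlabeled > 0.0:
        precision = min(alpha * tpr_labeled / ppr_unlabeled, 1.0)
    else:
        precision = float("nan")

    return fpr, precision


if __name__ == "__main__":
    # Hypothetical numbers: true TPR 0.8 and true FPR 0.2 with alpha 0.25
    # give an observed unlabeled positive rate of 0.25*0.8 + 0.75*0.2 = 0.35.
    print(correct_pu_rates(0.8, 0.35, 0.25))
```

Applying this at every decision threshold would yield a corrected ROC or precision-recall curve under these assumptions. The noisy setting mentioned in the abstract additionally needs the proportion of noise in the labeled positives and is not covered by this sketch.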
Pages: 2066-2072
Page count: 7