Cost-sensitive positive and unlabeled learning

Cited by: 17
Authors
Chen, Xiuhua [1]
Gong, Chen [1, 2]
Yang, Jian [1, 3]
Affiliations
[1] Nanjing Univ Sci & Technol, Key Lab Intelligent Percept & Syst High Dimens In, Sch Comp Sci & Engn, PCA Lab,Minist Educ, Nanjing, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[3] Jiangsu Key Lab Image & Video Understanding Socia, Minist Educ, Peoples R China
Keywords
Positive and Unlabeled learning (PU learning); Class imbalance; Cost-sensitive learning; Generalization bound; SMOTE
DOI
10.1016/j.ins.2021.01.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Positive and Unlabeled learning (PU learning) aims to train a binary classifier solely based on positively labeled and unlabeled data when negatively labeled data are absent or distributed too diversely. However, none of the existing PU learning methods takes the class imbalance problem into account, so they tend to neglect the minority class and are likely to generate a biased classifier. Therefore, this paper proposes a novel algorithm termed "Cost-Sensitive Positive and Unlabeled learning" (CSPU), which imposes different misclassification costs on different classes when conducting PU classification. Specifically, we assign distinct weights to the losses caused by false negative and false positive examples, and employ the double hinge loss to build our CSPU algorithm under the framework of empirical risk minimization. Theoretically, we analyze the computational complexity, and also derive a generalization error bound of CSPU which guarantees the good performance of our algorithm on test data. Empirically, we compare CSPU with the state-of-the-art PU learning methods on a synthetic dataset, OpenML benchmark datasets, and real-world datasets. The results clearly demonstrate the superiority of the proposed CSPU over the other comparators in dealing with class-imbalanced tasks. (C) 2021 Elsevier Inc. All rights reserved.
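The abstract's core idea (distinct weights on false-negative and false-positive losses, combined with the double hinge loss inside an empirical risk) can be illustrated with a minimal sketch. This is not the paper's exact objective: the function names, the cost parameters `c_fn` and `c_fp`, and the way the costs multiply the two terms of the standard unbiased PU risk estimator are assumptions made here for illustration. The class prior `prior` is also assumed to be known.

```python
import numpy as np


def double_hinge(z):
    """Double hinge loss: max(-z, max(0, (1 - z) / 2))."""
    z = np.asarray(z, dtype=float)
    return np.maximum(-z, np.maximum(0.0, 0.5 - 0.5 * z))


def cost_sensitive_pu_risk(scores_pos, scores_unl, prior, c_fn=1.0, c_fp=1.0):
    """Cost-weighted empirical PU risk (illustrative assumption, not the paper's formula).

    scores_pos: classifier outputs f(x) on labeled positive examples
    scores_unl: classifier outputs f(x) on unlabeled examples
    prior:      assumed known class prior P(y = +1)
    c_fn, c_fp: hypothetical costs for false negatives / false positives
    """
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_unl = np.asarray(scores_unl, dtype=float)
    # Loss of calling a true positive negative, estimated from positive data.
    fn_term = prior * np.mean(double_hinge(scores_pos))
    # Loss of calling a true negative positive, estimated from unlabeled data
    # after subtracting the contribution of the positives hidden inside it.
    fp_term = (np.mean(double_hinge(-scores_unl))
               - prior * np.mean(double_hinge(-scores_pos)))
    return c_fn * fn_term + c_fp * fp_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.normal(1.0, 1.0, size=50)
    unl = rng.normal(0.0, 1.5, size=200)
    print(cost_sensitive_pu_risk(pos, unl, prior=0.2, c_fn=5.0, c_fp=1.0))
```

With equal costs the sketch reduces to the usual double-hinge PU risk, since the double hinge satisfies ℓ(z) - ℓ(-z) = -z. Raising `c_fn` relative to `c_fp` penalizes missed positives more heavily, which is the usual motivation when positives are the minority class.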
Pages: 229 - 245
Page count: 17
Related papers
50 in total
  • [31] Speech Separation By Cost-Sensitive Deep Learning
    Zhang, Xiao-Lei
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017 : 159 - 162
  • [32] Cost-sensitive ensemble learning: a unifying framework
    Petrides, George
    Verbeke, Wouter
    DATA MINING AND KNOWLEDGE DISCOVERY, 2022, 36 (01) : 1 - 28
  • [33] Cost-sensitive reinforcement learning for credit risk
    C-Rella, Jorge
    Rego, David Martinez
    Vilar, Juan M.
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 272
  • [34] Cost-Sensitive Learning Methods for Imbalanced Data
    Nguyen Thai-Nghe
    Gantner, Zeno
    Schmidt-Thieme, Lars
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010
  • [35] Cost-Sensitive Active Visual Category Learning
    Vijayanarasimhan, Sudheendra
    Grauman, Kristen
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2011, 91 (01) : 24 - 44
  • [36] Cost-sensitive dictionary learning for face recognition
    Zhang, Guoqing
    Sun, Huaijiang
    Ji, Zexuan
    Yuan, Yun-Hao
    Sun, Quansen
    PATTERN RECOGNITION, 2016, 60 : 613 - 629
  • [37] Cost-Sensitive Trees for Interpretable Reinforcement Learning
    Nishtala, Siddharth
    Ravindran, Balaraman
    PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024, 2024 : 91 - 99
  • [38] Cost-Sensitive Active Learning for Incomplete Data
    Wang, Min
    Yang, Chunyu
    Zhao, Fei
    Min, Fan
    Wang, Xizhao
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (01) : 405 - 416
  • [39] Cost-sensitive learning based on Bregman divergences
    Raúl Santos-Rodríguez
    Alicia Guerrero-Curieses
    Rocío Alaiz-Rodríguez
    Jesús Cid-Sueiro
    Machine Learning, 2009, 76 : 271 - 285
  • [40] Partial Example Acquisition in Cost-Sensitive Learning
    Sheng, Victor S.
    Ling, Charles X.
    KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2007 : 638 - 646