Finding the Best Classification Threshold in Imbalanced Classification

Cited by: 166
Authors
Zou, Quan [1, 2]
Xie, Sifa [2]
Lin, Ziyu [2]
Wu, Meihong [2]
Ju, Ying [2]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Xiamen Univ, Dept Comp Sci, Xiamen, Peoples R China
Keywords
Receiver Operating Characteristic (ROC); Protein remote homology detection; Imbalance data; F-score; JOINT VIBROARTHROGRAPHIC SIGNALS; REMOTE HOMOLOGY DETECTION; AMINO-ACID-COMPOSITION; MICRORNA PRECURSOR; NEURAL-NETWORK; PROTEIN; IDENTIFICATION; EVOLUTIONARY; SOFTWARE;
DOI
10.1016/j.bdr.2015.12.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification with imbalanced class distributions is a major problem in machine learning. Researchers have given considerable attention to the applications in many real-world scenarios. Although several works have utilized the area under the receiver operating characteristic (ROC) curve to select potentially optimal classifiers in imbalanced classifications, limited studies have been devoted to finding the classification threshold for testing or unknown datasets. In general, the classification threshold is simply set to 0.5, which is usually unsuitable for an imbalanced classification. In this study, we analyze the drawbacks of using ROC as the sole measure of imbalance in data classification problems. In addition, a novel framework for finding the best classification threshold is proposed. Experiments with SCOP v.1.53 data reveal that, with the default threshold set to 0.5, our proposed framework demonstrated a 20.63% improvement in terms of F-score compared with that of more commonly used methods. The findings suggest that the proposed framework is both effective and efficient. A web server and software tools are available via http://datamining.xmu.edu.cn/prht/ or http://prht.sinaapp.com/. © 2016 Elsevier Inc. All rights reserved.
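The abstract's central point, that a fixed threshold of 0.5 is often unsuitable when positives are rare, can be illustrated with a small sketch. This is not the authors' framework, which the abstract does not describe in detail. It is only a generic threshold sweep that picks the cutoff maximizing F-score on held-out data. The function names `f_score` and `best_threshold` and the toy labels and scores are assumptions made for illustration.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: define the score as zero
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


def best_threshold(y_true, scores, default=0.5):
    """Sweep every observed score as a cutoff; keep the one with the highest F-score.

    Starts from the default cutoff so the result is never worse than 0.5
    on the data it was tuned on.
    """
    best_t = default
    best_f = f_score(y_true, [int(s >= default) for s in scores])
    for t in sorted(set(scores)):
        f = f_score(y_true, [int(s >= t) for s in scores])
        if f > best_f:  # strict comparison: ties keep the earlier cutoff
            best_t, best_f = t, f
    return best_t, best_f


# Hypothetical toy data: 2 positives among 10 samples, all scored below 0.5.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.25, 0.30, 0.45]

default_f = f_score(y_true, [int(s >= 0.5) for s in scores])
tuned_t, tuned_f = best_threshold(y_true, scores)
print(default_f, tuned_t, tuned_f)
```

On this toy data the default cutoff labels every sample negative, whereas the swept cutoff separates the two classes. In practice the cutoff would be chosen on a validation split and then applied unchanged to test or unknown data, since tuning on the evaluation set itself overstates the gain.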
Pages: 2-8
Number of pages: 7
Related papers
50 in total
  • [41] Deep reinforcement learning for imbalanced classification
    Enlu Lin
    Qiong Chen
    Xiaoming Qi
    Applied Intelligence, 2020, 50 : 2488 - 2502
  • [42] Graph Classification with Imbalanced Data Sets
    Xiao, Gang-Song
    Chen, Xiao-Yun
    2011 FIRST ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2011, : 57 - 61
  • [43] Cost-Sensitive Neural Network with ROC-Based Moving Threshold for Imbalanced Classification
    Krawczyk, Bartosz
    Wozniak, Michal
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2015, 2015, 9375 : 45 - 52
  • [44] Classification of Wine Quality with Imbalanced Data
    Hu, Gongzhu
    Xi, Tan
    Mohammed, Faraz
    Miao, Huaikou
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016, : 1712 - 1717
  • [45] Text Generation for Imbalanced Text Classification
    Akkaradamrongrat, Suphamongkol
    Kachamas, Pornpimon
    Sinthupinyo, Sukree
    2019 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE 2019), 2019, : 181 - 186
  • [46] Classification of imbalanced data with transparent kernels
    Lee, KK
    Gunn, SR
    Harris, CJ
    Reed, PAS
    IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2410 - 2415
  • [47] Utilizing DTRS for Imbalanced Text Classification
    Zhou, Bing
    Yao, Yiyu
    Liu, Qingzhong
    ROUGH SETS, (IJCRS 2016), 2016, 9920 : 219 - 228
  • [48] Ensembles of α-Trees for Imbalanced Classification Problems
    Park, Yubin
    Ghosh, Joydeep
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (01) : 131 - 143
  • [49] A Novel Model for Imbalanced Data Classification
    Yin, Jian
    Gan, Chunjing
    Zhao, Kaiqi
    Lin, Xuan
    Quan, Zhe
    Wang, Zhi-Jie
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6680 - 6687
  • [50] Nearest Neighbor Distributions for Imbalanced Classification
    Kriminger, Evan
    Principe, Jose C.
    Lakshminarayan, Choudur
    2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,