Evaluating the Performance of Data Level Methods Using KEEL Tool to Address Class Imbalance Problem

Authors
Kamlesh Upadhyay
Prabhjot Kaur
Deepak Kumar Verma
Affiliations
[1] Lingayas Vidyapeeth, Department of Information Technology
[2] Maharaja Surajmal Institute of Technology
[3] Lingayas Vidyapeeth
Keywords
Algorithm level approaches; Binary classification; Class imbalance problem; Data level approaches; Ensembled approach
DOI
Not available
Abstract
The class imbalance problem (CIP) has become a prominent topic in machine learning in recent years because of its growing practical importance. As the application areas of technology expand, the size and variety of data also increase. Most real-world raw data is imbalanced by nature, as in credit card fraud, fraudulent telephone calls, shuttle system failure, text classification, nuclear explosions, oil spill detection, and the detection of brain tumor images. Classification algorithms cannot classify imbalanced data accurately, and their results are biased toward the larger class. This is known as the class imbalance problem. This paper assesses various data-level methods that are used to balance the data before classification. It also discusses the characteristics of data that affect the class imbalance problem and the reasons why traditional classification algorithms cannot handle it. In addition, it discusses other data abnormalities that make the CIP more critical, such as the size of the data, overlapping classes, the presence of noise in the data, and the data distribution within each class. The paper empirically compares 20 data-level classification methods on 44 real imbalanced UCI data sets, with imbalance ratios ranging from 1.82 to 129.44, using the KEEL tool. The performance of the methods is assessed using the AUC, F-measure, and G-mean metrics, and the results are analyzed and presented graphically.
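The quantities named in the abstract can be illustrated with a minimal plain-Python sketch. It is not the KEEL implementation and does not reproduce the paper's experiments. The function names (`imbalance_ratio`, `random_oversample`, `binary_metrics`) are hypothetical, and random oversampling is used only as a generic example of a data-level method; the abstract does not say which 20 methods were compared. The sketch assumes the usual definitions: imbalance ratio as majority count over minority count, F-measure as the harmonic mean of precision and recall, and G-mean as the square root of sensitivity times specificity. AUC is omitted because it needs classifier scores rather than hard labels.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until the classes are equal in size.

    Illustrative data-level method only; assumes a binary problem.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [r for r, y in zip(rows, labels) if y == minority]
    extra = [rng.choice(pool) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit


def binary_metrics(y_true, y_pred, positive=1):
    """Return (F-measure, G-mean) for a binary problem from hard predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0       # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean


if __name__ == "__main__":
    labels = [0] * 9 + [1]                 # toy data: 9 majority, 1 minority
    rows = [[i] for i in range(10)]
    print("IR before:", imbalance_ratio(labels))
    rows_bal, labels_bal = random_oversample(rows, labels)
    print("IR after:", imbalance_ratio(labels_bal))
    print(binary_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

Unlike plain accuracy, G-mean drops to zero when a classifier ignores the minority class entirely, which is one reason such metrics are preferred for imbalanced data.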
Pages: 9741 - 9754
Number of pages: 13
Related papers
50 in total
  • [1] Evaluating the Performance of Data Level Methods Using KEEL Tool to Address Class Imbalance Problem
    Upadhyay, Kamlesh
    Kaur, Prabhjot
    Verma, Deepak Kumar
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2022, 47 (08) : 9741 - 9754
  • [2] Implications of resampling data to address the class imbalance problem (IRCIP): an evaluation of impact on performance between classification algorithms in medical data
    Welvaars, Koen
    Oosterhoff, Jacobien H. F.
    van den Bekerom, Michel P. J.
    Doornberg, Job N.
    van Haarst, Ernst P.
    JAMIA OPEN, 2023, 6 (02)
  • [3] Data Sampling Methods to Deal With the Big Data Multi-Class Imbalance Problem
    Rendon, Erendira
    Alejo, Roberto
    Castorena, Carlos
    Isidro-Ortega, Frank J.
    Granda-Gutierrez, Everardo E.
    APPLIED SCIENCES-BASEL, 2020, 10 (04):
  • [4] A weighted rough set method to address the class imbalance problem
    Liu, Jin-Fu
    Yu, Da-Ren
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 3693 - 3698
  • [5] Survey of Fuzzy based techniques to address Class Imbalance Problem
    Kaur, Prabhjot
    Gupta, Anshul
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 2602 - 2604
  • [6] Effective management of class imbalance problem in climate data analysis using a hybrid of deep learning and data level sampling
    Aarthi, R. J.
    Vinayagasundaram, B.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (04) : 4187 - 4199
  • [7] THE METHODS FOR QUANTITATIVE SOLVING THE CLASS IMBALANCE PROBLEM
    Kavrin, D. A.
    Subbotin, S. A.
    RADIO ELECTRONICS COMPUTER SCIENCE CONTROL, 2018, (01) : 83 - 90
  • [8] Solving the class imbalance problem using a counterfactual method for data augmentation
    Temraz, Mohammed
    Keane, Mark T.
    MACHINE LEARNING WITH APPLICATIONS, 2022, 9
  • [9] Alleviating Class Imbalance Problem In Data Mining
    Sarmanova, Akkenzhe
    Albayrak, Songul
    2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,
  • [10] Evolutionary data analysis for the class imbalance problem
    Khoshgoftaar, Taghi M.
    Seliya, Naeem
    Drown, Dennis J.
    INTELLIGENT DATA ANALYSIS, 2010, 14 (01) : 69 - 88