Logistic Regression and Random Forest for Effective Imbalanced Classification

Cited by: 12
Authors
Luo, Hanwu [1 ]
Pan, Xiubao [1 ]
Wang, Qingshun [2 ]
Ye, Shasha [2 ]
Qian, Ying [2 ]
Affiliations
[1] East Inner Mongolia Elect Power Co Ltd, Hohhot, Peoples R China
[2] East China Normal Univ, Dept Comp Sci & Technol, Shanghai, Peoples R China
Keywords
imbalanced classification; Random Forest; Logistic Regression; cost-sensitive classification
DOI
10.1109/COMPSAC.2019.00139
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Data mining and machine learning techniques are now widely applied in many fields. Many real-life datasets are imbalanced, with far fewer samples of the significant class than of the unimportant one, because representative positive examples are hard to collect. Under these circumstances, the conventional aim of maximizing overall classification accuracy, and with it most standard machine learning methods, may not be suitable. In this work, we compare the performance of random forest and logistic regression in predicting on an imbalanced dataset. We propose several ways to enhance the two models based on cost-sensitive learning so that they provide more accurate predictions on imbalanced datasets.
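The abstract does not say which cost-sensitive enhancements the authors use, so the following is only a generic illustration of the idea, not the paper's method. It is a minimal sketch of class-weighted logistic regression, where errors on the rare positive class are given a larger weight in the log loss. The synthetic one-feature dataset, the 19:1 weight, and all function names are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_weighted_logreg(xs, ys, class_weight, lr=0.5, epochs=300):
    """Gradient descent on a class-weighted log loss (single feature)."""
    w, b = 0.0, 0.0
    total = sum(class_weight[y] for y in ys)  # normalizes the gradient
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            # Each sample's error is scaled by the cost of its class.
            err = class_weight[y] * (sigmoid(w * x + b) - y)
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / total
        b -= lr * grad_b / total
    return w, b


def recall(xs, ys, w, b):
    """Fraction of true positives predicted positive at threshold 0.5."""
    positives = [x for x, y in zip(xs, ys) if y == 1]
    hits = sum(1 for x in positives if sigmoid(w * x + b) >= 0.5)
    return hits / len(positives)


# Hypothetical imbalanced data: 950 negatives, 50 positives (19:1).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(950)]
xs += [rng.gauss(2.0, 1.0) for _ in range(50)]
ys = [0] * 950 + [1] * 50

# Equal costs versus a cost on positives equal to the imbalance ratio.
w_plain, b_plain = train_weighted_logreg(xs, ys, {0: 1.0, 1: 1.0})
w_cost, b_cost = train_weighted_logreg(xs, ys, {0: 1.0, 1: 19.0})

# Evaluated on the training data itself, for brevity only.
recall_plain = recall(xs, ys, w_plain, b_plain)
recall_weighted = recall(xs, ys, w_cost, b_cost)
print("recall, equal costs:   ", recall_plain)
print("recall, weighted costs:", recall_weighted)
```

Raising the cost on the minority class moves the decision boundary toward the majority class, which trades some overall accuracy for better recall on the rare class. Tree ensembles such as random forest can be made cost-sensitive in a comparable way by weighting samples or classes during training.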
Pages: 916-917
Number of pages: 2