Logistic Regression and Random Forest for Effective Imbalanced Classification

Cited by: 12
Authors
Luo, Hanwu [1]
Pan, Xiubao [1]
Wang, Qingshun [2]
Ye, Shasha [2]
Qian, Ying [2]
Affiliations
[1] East Inner Mongolia Elect Power Co Ltd, Hohhot, Peoples R China
[2] East China Normal Univ, Dept Comp Sci & Technol, Shanghai, Peoples R China
Keywords
imbalanced classification; Random Forest; Logistic Regression; cost-sensitive classification;
DOI
10.1109/COMPSAC.2019.00139
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Data mining and machine learning techniques are now widely applied in many fields. Many real-life datasets are imbalanced: significant (positive) samples are far fewer than unimportant ones, because representative positive examples are hard to collect. Under these circumstances, the conventional aim of maximizing overall classification accuracy, and with it most standard machine learning methods, may not be suitable. In this work, we compare the performance of random forest and logistic regression in predicting on an imbalanced dataset. We propose several ways to enhance the two models, based on cost-sensitive learning, so that they provide more accurate predictions on imbalanced datasets.
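The abstract does not say which cost-sensitive scheme the authors use, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a class-weighted log loss for logistic regression, with weights set by the common heuristic n_samples / (n_classes * class_count). The synthetic data and the helper names (`make_imbalanced`, `balanced_weights`, `fit_logistic`, `recall_pos`) are hypothetical and chosen for illustration.

```python
import math
import random


def make_imbalanced(n_neg=950, n_pos=50, seed=0):
    """Synthetic 2-D data (an assumption for illustration): 95% negatives, 5% positives."""
    rng = random.Random(seed)
    data = [([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], 0) for _ in range(n_neg)]
    data += [([rng.gauss(1.5, 1.0), rng.gauss(1.5, 1.0)], 1) for _ in range(n_pos)]
    rng.shuffle(data)
    return [x for x, _ in data], [label for _, label in data]


def balanced_weights(y):
    """Heuristic misclassification costs: n_samples / (n_classes * class_count)."""
    classes = sorted(set(y))
    return {c: len(y) / (len(classes) * y.count(c)) for c in classes}


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, class_weight, lr=0.5, epochs=200):
    """Batch gradient descent on a class-weighted log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    total = sum(class_weight[label] for label in y)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = class_weight[yi] * (p - yi)  # the class cost scales each sample's gradient
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / total for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b


def recall_pos(X, y, w, b):
    """Fraction of positive samples predicted positive at a 0.5 threshold."""
    hits = sum(
        1
        for xi, yi in zip(X, y)
        if yi == 1 and sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5
    )
    return hits / sum(y)


X, y = make_imbalanced()
plain = fit_logistic(X, y, {0: 1.0, 1: 1.0})       # every error costs the same
costed = fit_logistic(X, y, balanced_weights(y))   # minority-class errors cost more
print("minority recall, unweighted:", recall_pos(X, y, *plain))
print("minority recall, cost-sensitive:", recall_pos(X, y, *costed))
```

Raising the cost of minority-class errors moves the decision boundary toward the majority class, which typically trades some overall accuracy for higher minority recall. The same weighting idea carries over to tree ensembles such as random forest, for example by weighting samples when computing split impurity.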
Pages: 916-917
Number of pages: 2
Related papers
50 in total
  • [21] Imbalanced data classification based on DB-SLSMOTE and random forest
    Han, Qi
    Yang, Rui
    Wan, Zitong
    Chen, Shaozhi
    Huang, Mengjie
    Wen, Huiqing
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 6271 - 6276
  • [22] MARGIN-BASED RANDOM FOREST FOR IMBALANCED LAND COVER CLASSIFICATION
    Feng, W.
    Boukir, S.
    Huang, W.
    2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 3085 - 3088
  • [23] A Comparison of Logistic Regression, Random Forest Models in Predicting the Risk of Diabetes
    Zhang, Baoxin
    Lu, Li
    Hou, Jiaqi
    THIRD INTERNATIONAL SYMPOSIUM ON IMAGE COMPUTING AND DIGITAL MEDICINE (ISICDM 2019), 2019, : 231 - 234
  • [24] Prediction of unsuccessful endometrial ablation: random forest vs logistic regression
    Stevens, Kelly Yvonne Roger
    Lagaert, Liesbet
    Bakkes, Tom
    Gelderblom, Malou Evi
    Houterman, Saskia
    Gijsen, Tanja
    Schoot, Benedictus C.
    GYNECOLOGICAL SURGERY, 2021, 18 (01)
  • [25] Logistic regression and random forest unveil key molecular descriptors of druglikeness
    Billones, Liza T.
    Morales, Nadia B.
    Billones, Junie B.
    CHEM-BIO INFORMATICS JOURNAL, 2021, 21 : 39 - 58
  • [26] Determinants of Stock Option Listing: Logistic Regression and Random Forest Approach
    Joshi, Himanshu
    Chauhan, Raineesh
    PACIFIC BUSINESS REVIEW INTERNATIONAL, 2020, 13 (01): : 1 - 12
  • [27] Logistic regression for imbalanced learning based on clustering
    Guo, Huaping
    Wei, Tao
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2019, 18 (01) : 54 - 64
  • [28] Random forest: A classification and regression tool for compound classification and QSAR modeling
    Svetnik, V
    Liaw, A
    Tong, C
    Culberson, JC
    Sheridan, RP
    Feuston, BP
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2003, 43 (06): : 1947 - 1958
  • [29] A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility
    Chen, Wei
    Xie, Xiaoshen
    Wang, Jiale
    Pradhan, Biswajeet
    Hong, Haoyuan
    Bui, Dieu Tien
    Duan, Zhao
    Ma, Jianquan
    CATENA, 2017, 151 : 147 - 160
  • [30] Classification and Prediction of Heart Disease using Novel Random Forest Algorithm by Comparing Logistic Regression for Obtaining Better Accuracy
    Poojitha, T.
    Mahaveerakannan, R.
    CARDIOMETRY, 2022, (25): : 1538 - 1545