F-measure maximizing logistic regression

Authors
Okabe, Masaaki [1]
Tsuchida, Jun [1]
Yadohisa, Hiroshi [1]
Affiliations
[1] Doshisha Univ, Grad Sch Culture & Informat Sci, Kyoto, Japan
Keywords
Density ratio; Discriminant analysis; Imbalanced data; Weighted importance; ROC
DOI
10.1080/03610918.2022.2081706
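Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract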
Logistic regression is widely used in many fields. When it is applied to imbalanced data, in which the majority class dominates the minority class, all class labels tend to be estimated as the majority class. In this study, we use an F-measure optimization method to improve the performance of logistic regression on imbalanced data. Many F-measure optimization methods approximate the F-measure by a ratio of estimators, but a ratio of estimators tends to exhibit more bias than a direct approximation of the ratio. Therefore, we approximate the F-measure by estimating the relative density ratio. In addition, we define and approximate a relative F-measure. We present an algorithm for logistic regression weighted by the approximated relative F-measure. The results of an experiment using real-world data demonstrate that the proposed algorithm can efficiently improve the performance of logistic regression applied to imbalanced data.
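The motivation in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only computes the standard F-measure and accuracy on a hypothetical sample of 95 negatives and 5 positives, to show why a classifier that labels everything as the majority class looks good on accuracy but scores zero on the F-measure. The function names, the data, and the two prediction vectors are assumptions made for illustration.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels; 1 is the positive (minority) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # F_beta = (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP)
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced sample: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Classifier A (illustrative): predicts the majority class for everything.
all_majority = [0] * 100

# Classifier B (illustrative): 6 false alarms, finds 4 of the 5 positives.
some_positive = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

for name, pred in [("all-majority", all_majority), ("some-positive", some_positive)]:
    print(name, "accuracy:", accuracy(y_true, pred), "F1:", round(f_measure(y_true, pred), 3))
```

Classifier A has the higher accuracy of the two, yet it never identifies a minority-class case, which is the failure mode that optimizing the F-measure is meant to penalize.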
Pages: 2554-2564
Number of pages: 11