Cost-sensitive boosting algorithms: Do we really need them?

Cited by: 41
Authors
Nikolaou, Nikolaos [1]
Edakunni, Narayanan [1]
Kull, Meelis [2]
Flach, Peter [2]
Brown, Gavin [1]
Affiliations
[1] Univ Manchester, Sch Comp Sci, Kilburn Bldg,Oxford Rd, Manchester M13 9PL, Lancs, England
[2] Univ Bristol, Dept Comp Sci, Merchant Venturers Bldg,Woodland Rd, Bristol BS8 1UB, Avon, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Boosting; Cost-sensitive; Class imbalance; Classifier calibration
DOI
10.1007/s10994-016-5572-x
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We provide a unifying perspective for two decades of work on cost-sensitive Boosting algorithms. When analyzing the literature from 1997 to 2016, we find 15 distinct cost-sensitive variants of the original algorithm; each of these has its own motivation and claims to superiority, so who should we believe? In this work we critique the Boosting literature using four theoretical frameworks: Bayesian decision theory, the functional gradient descent view, margin theory, and probabilistic modelling. Our finding is that only three algorithms are fully supported, and the probabilistic model view suggests that all require their outputs to be calibrated for best performance. Experiments on 18 datasets across 21 degrees of imbalance support the hypothesis, showing that once calibrated, they perform equivalently, and outperform all others. Our final recommendation, based on simplicity, flexibility and performance, is to use the original AdaBoost algorithm with a shifted decision threshold and calibrated probability estimates.
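The "shifted decision threshold" in the recommendation can be illustrated with the standard Bayesian decision rule for two classes. The sketch below is not taken from the paper: the function names and the cost parameters `cost_fp` and `cost_fn` are hypothetical, and it assumes the classifier already outputs a calibrated estimate of p(y = 1 | x). Under that assumption, predicting positive costs (1 - p) * cost_fp in expectation and predicting negative costs p * cost_fn, so the positive class is cheaper whenever p exceeds cost_fp / (cost_fp + cost_fn).

```python
def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Return the probability threshold that minimises expected cost.

    cost_fp and cost_fn are hypothetical names for the cost of a false
    positive and of a false negative; both must be positive.
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    # Predict positive when (1 - p) * cost_fp < p * cost_fn,
    # which rearranges to p > cost_fp / (cost_fp + cost_fn).
    return cost_fp / (cost_fp + cost_fn)


def decide(prob_positive: float, cost_fp: float, cost_fn: float) -> int:
    """Return 1 (positive) or 0 (negative) for a calibrated probability."""
    threshold = cost_sensitive_threshold(cost_fp, cost_fn)
    return 1 if prob_positive > threshold else 0
```

With equal costs the threshold stays at 0.5. If a false negative is four times as costly as a false positive, the threshold drops to 0.2, so an example with a calibrated score of 0.3 would be labelled positive. The rule is only as good as the probability estimates it receives, which is why calibration matters here.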
Pages: 359-384
Page count: 26
Related papers
50 in total
  • [31] Nanocarrier systems for oral drug delivery: Do we really need them?
    Bernkop-Schnuerch, Andreas
    EUROPEAN JOURNAL OF PHARMACEUTICAL SCIENCES, 2013, 49 (02) : 272 - 277
  • [32] Nitrous oxide cylinders on anaesthetic machines: do we really need them?
    Ahmed, I.
    Majeed, A.
    Javariah, N.
    Dichmont, E.
    ANAESTHESIA, 2009, 64 (06) : 689 - 690
  • [33] Mobile applications for colorectal surgery journals: Do we really need them?
    Emile, S. H.
    TECHNIQUES IN COLOPROCTOLOGY, 2018, 22 : 137 - 138
  • [34] CogBoost: Boosting for Fast Cost-Sensitive Graph Classification
    Pan, Shirui
    Wu, Jia
    Zhu, Xingquan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (11) : 2933 - 2946
  • [35] AdaCC: cumulative cost-sensitive boosting for imbalanced classification
    Iosifidis, Vasileios
    Papadopoulos, Symeon
    Rosenhahn, Bodo
    Ntoutsi, Eirini
    KNOWLEDGE AND INFORMATION SYSTEMS, 2023, 65 (02) : 789 - 826
  • [36] Example-dependent cost-sensitive adaptive boosting
    Zelenkov, Yuri
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 135 : 71 - 82
  • [37] AdaCC: cumulative cost-sensitive boosting for imbalanced classification
    Vasileios Iosifidis
    Symeon Papadopoulos
    Bodo Rosenhahn
    Eirini Ntoutsi
    Knowledge and Information Systems, 2023, 65 : 789 - 826
  • [38] Boosting the Generalized Margin in Cost-Sensitive Multiclass Classification
    Wang, Junhui
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2013, 22 (01) : 178 - 192
  • [39] Reprint of: Nanocarrier systems for oral drug delivery: Do we really need them?
    Bernkop-Schnuerch, Andreas
    EUROPEAN JOURNAL OF PHARMACEUTICAL SCIENCES, 2013, 50 (01) : 2 - 7
  • [40] Soluble and controlled-release preparations of levodopa: do we really need them?
    Fabbrini, Giovanni
    Di Stasio, Flavio
    Bloise, Maria
    Berardelli, Alfredo
    JOURNAL OF NEUROLOGY, 2010, 257 : S292 - S297