A comparative analysis of gradient boosting algorithms

Cited by: 0
Authors
Candice Bentéjac
Anna Csörgő
Gonzalo Martínez-Muñoz
Affiliations
[1] University of Bordeaux, College of Science and Technology
[2] Pázmány Péter Catholic University, Faculty of Information Technology and Bionics
[3] Universidad Autónoma de Madrid, Escuela Politécnica Superior
Source
Artificial Intelligence Review
Keywords
XGBoost; LightGBM; CatBoost; Gradient boosting; Random forest; Ensembles of classifiers
DOI
Not available
Abstract
The family of gradient boosting algorithms has recently been extended with several interesting proposals (namely XGBoost, LightGBM and CatBoost) that focus on both speed and accuracy. XGBoost is a scalable ensemble technique that has proven to be a reliable and efficient solver of machine learning challenges. LightGBM is an accurate model focused on providing extremely fast training by selectively sampling instances with high gradients. CatBoost modifies the computation of gradients to avoid prediction shift and thereby improve the accuracy of the model. This work presents a practical analysis of how these novel variants of gradient boosting compare in terms of training speed, generalization performance and hyper-parameter setup. In addition, a comprehensive comparison between XGBoost, LightGBM, CatBoost, random forests and gradient boosting has been performed using both carefully tuned models and their default settings. The results of this comparison indicate that CatBoost obtains the best generalization accuracy and AUC on the studied datasets, although the differences are small. LightGBM is the fastest of all the methods, but not the most accurate. XGBoost places second both in accuracy and in training speed. Finally, an extensive analysis of the effect of hyper-parameter tuning in XGBoost, LightGBM and CatBoost is carried out using two novel proposed tools.
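The kind of default-settings comparison the abstract describes can be illustrated with a minimal Python sketch, assuming the xgboost, lightgbm, catboost and scikit-learn packages are installed; the synthetic dataset, 5-fold cross-validation and metric choices below are illustrative stand-ins, not the paper's actual experimental setup.

```python
# Minimal sketch: compare the five ensemble methods from the paper
# under their default settings, on accuracy and AUC (the two
# generalization metrics the abstract mentions).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

# Illustrative synthetic binary classification task (not from the paper).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

models = {
    "XGBoost": XGBClassifier(),
    "LightGBM": LGBMClassifier(),
    "CatBoost": CatBoostClassifier(verbose=0),  # verbose=0 only silences training logs
    "Random forest": RandomForestClassifier(),
    "Gradient boosting": GradientBoostingClassifier(),
}

for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

The paper's full study additionally repeats such a comparison with carefully tuned hyper-parameters; a tuned variant of this sketch would wrap each model in a search such as scikit-learn's GridSearchCV before scoring.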
Pages: 1937–1967
Number of pages: 30