SMOTEBoost for Regression: Improving the Prediction of Extreme Values

Cited by: 25
Authors
Moniz, Nuno [1]
Ribeiro, Rita P. [1]
Cerqueira, Vitor [1]
Chawla, Nitesh [2]
Affiliations
[1] Univ Porto, INESC TEC, Porto, Portugal
[2] Univ Notre Dame, Notre Dame, IN, USA
Keywords
Imbalanced Domain Learning; Ensemble Learning; Boosting; Regression
DOI
10.1109/DSAA.2018.00025
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Supervised learning with imbalanced domains is one of the biggest challenges in machine learning. Such tasks differ from standard learning tasks by assuming a skewed distribution of target variables, and user domain preference towards under-represented cases. Most research has focused on imbalanced classification tasks, where a wide range of solutions has been tested. Still, little work has been done concerning imbalanced regression tasks. In this paper, we propose an adaptation of the SMOTEBoost approach for the problem of imbalanced regression. Originally designed for classification tasks, it combines boosting methods and the SMOTE resampling strategy. We present four variants of SMOTEBoost and provide an experimental evaluation using 30 datasets with an extensive analysis of results in order to assess the ability of SMOTEBoost methods in predicting extreme target values, and their predictive trade-off concerning baseline boosting methods. SMOTEBoost is publicly available in a software package.
Pages: 150 - 159
Number of pages: 10
Related papers
50 in total
  • [11] Improving precipitation forecasts using extreme quantile regression
    Jasper Velthoen
    Juan-Juan Cai
    Geurt Jongbloed
    Maurice Schmeits
    Extremes, 2019, 22 : 599 - 622
  • [12] Inference with Extremes: Accounting for Extreme Values in Count Regression Models
    Randahl, David
    Vegelius, Johan
    INTERNATIONAL STUDIES QUARTERLY, 2024, 68 (04)
  • [13] Prediction of extreme PM2.5 concentrations via extreme quantile regression
    Lee, SangHyuk
    Park, Seoncheol
    Lim, Yaeji
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2022, 29 (03) : 319 - 331
  • [14] Rule-based prediction of rare extreme values
    Ribeiro, Rita
    Torgo, Luis
    DISCOVERY SCIENCE, PROCEEDINGS, 2006, 4265 : 219 - 230
  • [15] A Hybrid Regression Model for Improving Prediction Accuracy
    Poojari, Satyanarayana
    Ismail, B.
    ELECTRONIC JOURNAL OF APPLIED STATISTICAL ANALYSIS, 2023, 16 (03) : 784 - 801
  • [16] Improving Software Fault Prediction with Threshold Values
    Shatnawi, Raed
    2018 26TH INTERNATIONAL CONFERENCE ON SOFTWARE, TELECOMMUNICATIONS AND COMPUTER NETWORKS (SOFTCOM), 2018 : 193 - 198
  • [17] ESTIMATION AND PREDICTION OF THE DISTRIBUTION OF EXTREME VALUES ATTRACTED BY A FRECHET DISTRIBUTION
    DIAZ, JP
    COMPTES RENDUS DE L ACADEMIE DES SCIENCES SERIE I-MATHEMATIQUE, 1985, 301 (10) : 541 - 544
  • [18] Prediction of most probable extreme values for jackup dynamic analysis
    Lu, YJ
    Chen, YN
    Tan, PL
    Bai, Y
    MARINE STRUCTURES, 2002, 15 (01) : 15 - 34
  • [19] Extreme Values Theory and Return Level Analysis for Catastrophe Prediction
    Kapoor, Amitesh
    Shrivastava, Utkarsh
    JOURNAL OF INVESTING, 2014, 23 (02) : 124 - 135
  • [20] Gumbel Distribution Adjustment Improvement for Accurate Extreme Values Prediction
    Khabou, Nesrine
    Rodriguez, Ismael Bouassida
    Jameleddine, Oumayma
    Mateur, Amal
    2021 IEEE/ACS 18TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2021