On weak base hypotheses and their implications for boosting regression and classification

Cited: 24
Author
Jiang, WX [1]
Affiliation
[1] Northwestern Univ, Dept Stat, Evanston, IL 60208 USA
Source
ANNALS OF STATISTICS | 2002, Vol. 30, No. 1
Keywords
angular span; boosting; classification; error bounds; least squares regression; matching pursuit; nearest neighbor rule; overfit; prediction error; regularization; training error; weak hypotheses;
DOI
10.1214/aos/1015362184
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Classification Codes
020208; 070103; 0714
Abstract
When studying the training error and the prediction error of boosting, it is often assumed that the hypotheses returned by the base learner are weakly accurate, that is, able to beat a random guesser by some positive margin. It has been an open question how large this margin can remain: whether it will eventually disappear in the boosting process or stay bounded below by a positive amount. This question is crucial for the behavior of both the training error and the prediction error. In this paper we study this problem and show that the improvement over the random guesser indeed stays bounded below by a positive amount, for almost all sample realizations and for most commonly used classes of base hypotheses. This has a number of implications for the prediction error; for example, boosting forever may not be good, and regularization may be necessary. We approach the problem by first considering an analog of AdaBoost in regression, where similar properties hold and where we find that, for good performance, one cannot hope to avoid regularization simply by adapting the boosting device to regression.
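The quantity the abstract refers to is the "edge" gamma = 1/2 - epsilon of each base hypothesis over random guessing. As an illustrative sketch only (not the paper's construction: the synthetic data, the exhaustive decision-stump base learner, and the round budget T are all assumptions made here), the following Python code runs plain AdaBoost and prints the per-round edge. For a base class like stumps the edge typically stays bounded away from zero, so the training error keeps dropping round after round; the paper's point is that this very persistence means boosting forever can overfit, so the number of rounds T effectively acts as a regularization (early-stopping) parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with labels in {-1, +1} (illustrative only).
n = 200
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n) > 0, 1, -1)

def best_stump(X, y, w):
    """Exhaustively fit a decision stump minimizing weighted 0-1 error."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):            # feature to split on
        for thr in np.unique(X[:, j]):     # candidate threshold
            for s in (1, -1):              # orientation of the split
                pred = s * np.where(X[:, j] > thr, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thr, s)
    return best_err, best

T = 50                                     # round budget; acts as a regularizer
w = np.full(n, 1.0 / n)                    # AdaBoost sample weights
F = np.zeros(n)                            # aggregated real-valued score
for t in range(T):
    err, (j, thr, s) = best_stump(X, y, w)
    gamma = 0.5 - err                      # edge over random guessing
    if gamma <= 0:                         # no weakly accurate hypothesis left
        break
    err = np.clip(err, 1e-12, 0.5 - 1e-12) # guard the log below
    alpha = 0.5 * np.log((1.0 - err) / err)
    h = s * np.where(X[:, j] > thr, 1, -1)
    F += alpha * h
    w *= np.exp(-alpha * y * h)            # up-weight misclassified points
    w /= w.sum()
    train_err = (np.sign(F) != y).mean()
    print(f"round {t:2d}: edge = {gamma:.3f}, train error = {train_err:.3f}")
```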
Pages: 51-73
Number of pages: 23
Related Papers
50 in total (items [41]-[50] shown)
  • [41] Boosting distributional copula regression
    Hans, Nicolai
    Klein, Nadja
    Faschingbauer, Florian
    Schneider, Michael
    Mayr, Andreas
    BIOMETRICS, 2023, 79 (03) : 2298 - 2310
  • [42] Boosting and instability for regression trees
    Gey, S
    Poggi, JM
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2006, 50 (02) : 533 - 550
  • [43] Boosting methodology for regression problems
    Ridgeway, G
    Madigan, D
    Richardson, T
    ARTIFICIAL INTELLIGENCE AND STATISTICS 99, PROCEEDINGS, 1999 : 152 - 161
  • [44] Pinball boosting of regression quantiles
    Bauer, Ida
    Haupt, Harry
    Linner, Stefan
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2024, 200
  • [45] Boosting kernel models for regression
    Sun, Ping
    Yao, Xin
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006 : 583+
  • [46] Boosting diversity in regression ensembles
    Bourel, Mathias
    Cugliari, Jairo
    Goude, Yannig
    Poggi, Jean-Michel
    STATISTICAL ANALYSIS AND DATA MINING, 2024, 17 (01)
  • [47] Structured Regression Gradient Boosting
    Diego, Ferran
    Hamprecht, Fred A.
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016 : 1459 - 1467
  • [48] Symbolic-regression boosting
    Sipper, Moshe
    Moore, Jason H.
    Genetic Programming and Evolvable Machines, 2021, 22 : 357 - 381
  • [49] Robust boosting for regression problems
    Ju, Xiaomeng
    Salibian-Barrera, Matias
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 153
  • [50] Robust regression by boosting the median
    Kégl, B
    LEARNING THEORY AND KERNEL MACHINES, 2003, 2777 : 258 - 272