SigOpt Mulch: An intelligent system for AutoML of gradient boosted trees

Cited by: 1
Authors
Sorokin, Aleksei [1]
Zhu, Xinran [2]
Lee, Eric Hans [3]
Cheng, Bolong [3]
Affiliations
[1] IIT, Chicago, IL USA
[2] Cornell Univ, Ithaca, NY USA
[3] SigOpt Intel Co, San Francisco, CA 94104 USA
Keywords
Automated machine learning (AutoML); Hyperparameter optimization (HPO); Gradient boosted trees; EFFICIENT
DOI
10.1016/j.knosys.2023.110604
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Gradient boosted trees (GBTs) are ubiquitous models used by researchers, machine learning (ML) practitioners, and data scientists because of their robust performance, interpretable behavior, and ease of use. One critical challenge in training GBTs is the tuning of their hyperparameters. In practice, selecting these hyperparameters is often done manually. Recently, the ML community has advocated for tuning hyperparameters through black-box optimization and developed state-of-the-art systems to do so. However, applying such systems to tune GBTs suffers from two drawbacks. First, these systems are not model-aware; rather, they are designed to apply to a generic model, which leaves significant optimization performance on the table. Second, using these systems requires domain knowledge, such as the choice of hyperparameter search space, which is antithetical to the automatic experimentation that black-box optimization aims to provide. In this paper, we present SigOpt Mulch, a model-aware hyperparameter tuning system specifically designed for automated tuning of GBTs that provides two improvements over existing systems. First, Mulch leverages powerful techniques in metalearning and multifidelity optimization to perform model-aware hyperparameter optimization. Second, it automates the process of learning performant hyperparameters by making intelligent decisions about the optimization search space, thus reducing the need for user domain knowledge. These innovations allow Mulch to identify good GBT hyperparameters far more efficiently, and in a more seamless and user-friendly way, than existing black-box hyperparameter tuning systems. © 2023 Published by Elsevier B.V.
Pages: 12