Efficient Hyper-parameter Optimization with Cubic Regularization

Cited by: 0
Authors
Shen, Zhenqian [1 ]
Yang, Hansi [2 ]
Li, Yong [1 ]
Kwok, James [2 ]
Yao, Quanming [1 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
Keywords
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As hyper-parameters are ubiquitous and can significantly affect model performance, hyper-parameter optimization is extremely important in machine learning. In this paper, we consider a sub-class of hyper-parameter optimization problems where hyper-gradients are not available. Such problems frequently arise when the performance metric is non-differentiable or the hyper-parameters are not continuous. However, existing algorithms, such as Bayesian optimization and reinforcement learning, often get trapped in local optima with poor performance. To address these limitations, we propose to use cubic regularization to accelerate convergence and avoid saddle points. First, we adopt stochastic relaxation, which allows obtaining gradient and Hessian information without hyper-gradients. Then, we exploit the rich curvature information via cubic regularization. Theoretically, we prove that the proposed method converges to approximate second-order stationary points, and that convergence is still guaranteed when the lower-level problem is solved only inexactly. Experiments on synthetic and real-world data demonstrate the effectiveness of the proposed method.
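The cubic-regularization idea the abstract builds on can be illustrated with a minimal sketch: at each outer step, one approximately minimizes the cubic-regularized second-order model g^T s + (1/2) s^T H s + (M/6)||s||^3 and takes the resulting step. This is a toy illustration on a smooth quadratic objective, not the authors' stochastic-relaxation algorithm; the function names, the gradient-descent subproblem solver, and all step sizes here are assumptions.

```python
import numpy as np

def cubic_reg_step(grad, hess, M, iters=200, lr=0.01):
    """Approximately solve the cubic-regularized subproblem
    min_s g^T s + 0.5 s^T H s + (M/6)||s||^3 by gradient descent."""
    s = np.zeros_like(grad)
    for _ in range(iters):
        # Gradient of the cubic model: g + H s + (M/2)||s|| s
        sub_grad = grad + hess @ s + 0.5 * M * np.linalg.norm(s) * s
        s -= lr * sub_grad
    return s

def cubic_reg_minimize(f_grad, f_hess, x0, M=1.0, steps=50):
    """Run cubic-regularized Newton steps from x0."""
    x = x0.copy()
    for _ in range(steps):
        x = x + cubic_reg_step(f_grad(x), f_hess(x), M)
    return x
```

For a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x, the sketch converges to the minimizer A^{-1} b; near a solution the cubic term vanishes and the step reduces to a Newton step, which is the source of the fast local convergence that second-order methods enjoy over the zeroth-order baselines the abstract mentions.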
Pages: 12