ZARTS: On Zero-order Optimization for Neural Architecture Search

Cited by: 0
Authors
Wang, Xiaoxing [1]
Guo, Wenxuan [1]
Su, Jianlin [2]
Yang, Xiaokang [1]
Yan, Junchi [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Shenzhen Zhuiyi Technol Co Ltd, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
N/A
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency. It introduces trainable architecture parameters to represent the importance of candidate operations and proposes first-/second-order approximations to estimate their gradients, making it possible to solve NAS by gradient descent. However, our in-depth empirical results show that this approximation often distorts the loss landscape, leading to a biased optimization objective and, in turn, inaccurate gradient estimates for the architecture parameters. This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, that searches without enforcing the above approximation. Specifically, three representative zero-order optimization methods are introduced: RS, MGS, and GLD, among which MGS performs best by balancing accuracy and speed. Moreover, we explore the connections between RS/MGS and gradient descent, and show that ZARTS can be seen as a robust, gradient-free counterpart to DARTS. Extensive experiments on multiple datasets and search spaces show the remarkable performance of our method. In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS in settings where DARTS collapses due to its known instability issue. We also search on the DARTS search space to compare with peer methods; our discovered architecture achieves 97.54% accuracy on CIFAR-10 and 75.7% top-1 accuracy on ImageNet. Finally, we combine ZARTS with three orthogonal variants of DARTS for faster search and better performance. Source code will be made publicly available at: https://github.com/vicFigure/ZARTS.
Pages: 13
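As a rough illustration of the sampling-based updates the abstract describes, the sketch below implements a generic Gaussian-smoothing zero-order step in NumPy: it probes the loss at random perturbations of the architecture parameters and averages the resulting finite-difference directions, so no first-/second-order approximation of the bilevel objective is needed. This is a toy under stated assumptions, not the paper's RS/MGS/GLD implementation; the names `zero_order_step`, `val_loss`, `alpha`, `sigma`, `n_samples`, and `lr` are hypothetical and do not come from the ZARTS repository.

```python
# Hedged sketch of a Gaussian-smoothing zero-order update for
# architecture parameters. Illustrative only; NOT the exact
# RS/MGS/GLD code from the ZARTS paper or repo.
import numpy as np

def zero_order_step(val_loss, alpha, rng, sigma=0.1, n_samples=8, lr=0.01):
    """One zero-order update: probe val_loss at random perturbations
    of alpha and average the induced finite-difference directions."""
    base = val_loss(alpha)
    grad_est = np.zeros_like(alpha)
    for _ in range(n_samples):
        u = rng.standard_normal(alpha.shape)        # random direction
        delta = val_loss(alpha + sigma * u) - base  # forward difference
        grad_est += (delta / sigma) * u             # smoothed-gradient term
    grad_est /= n_samples
    return alpha - lr * grad_est                    # descend the estimate

# Toy usage: a quadratic stand-in for the validation loss over the
# importance weights of four candidate operations.
target = np.array([0.7, 0.1, 0.1, 0.1])
loss = lambda a: float(np.sum((a - target) ** 2))
alpha = np.full(4, 0.25)
rng = np.random.default_rng(0)
for _ in range(300):
    alpha = zero_order_step(loss, alpha, rng)
print(alpha.round(2))  # should move toward `target`
```

Per the abstract, the paper's preferred MGS variant goes further than this uniform averaging by weighting sampled directions according to how much they improve the loss, which is how it trades off accuracy against speed relative to plain random sampling.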