GPT-NAS: Neural Architecture Search Meets Generative Pre-Trained Transformer Model

Cited: 0
Authors
Yu, Caiyang [1]
Liu, Xianggen [1]
Wang, Yifan [1]
Liu, Yun [1]
Feng, Wentao [1]
Deng, Xiong [2]
Tang, Chenwei [1]
Lv, Jiancheng [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci & Engn, Res Ctr Machine Learning & Ind Intelligence, Minist Educ, Chengdu 610065, Peoples R China
[2] Stevens Inst Technol, Dept Mech Engn, Hoboken, NJ 07030 USA
Source
BIG DATA MINING AND ANALYTICS | 2025, Vol. 8, No. 1
Keywords
Search problems; Computer architecture; Encoding; Training; Optimization; Data models; Neural networks; Neural Architecture Search (NAS); Generative Pre-trained Transformer (GPT) model; evolutionary algorithm; image classification
DOI
10.26599/BDMA.2024.9020036
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The pursuit of optimal neural network architectures is foundational to the progress of Neural Architecture Search (NAS). However, existing NAS methods that rely on traditional search strategies struggle to discover effective architectures within a reasonable time when the search space is large and complex, which leads to inferior search results. This research introduces Generative Pre-trained Transformer NAS (GPT-NAS), an approach designed to overcome the limitations inherent in traditional NAS strategies. GPT-NAS improves search efficiency and yields better architectures by integrating a GPT model into the search process. Specifically, we design a reconstruction strategy that uses the trained GPT model to reorganize the architectures found by the search. In addition, to equip the GPT model with the ability to design neural architectures, we train it on a dataset of neural architectures: for each architecture, the structural information of the preceding layers is used to predict the structure of the next layer, iterating over the entire architecture. In this way, the GPT model efficiently learns the key features of well-formed neural architectures. Extensive experiments show that GPT-NAS outperforms both manually designed neural architectures and architectures generated by existing NAS methods. Moreover, we validate the benefit of introducing the GPT model in several ways and find that it improves the accuracy of the searched architectures on image classification datasets by up to about 9%.
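The abstract describes two mechanisms: training a GPT-style model autoregressively on architecture sequences (each layer predicted from the layers before it), and using the trained model to reconstruct architectures proposed by the search. Below is a minimal sketch of both ideas under stated assumptions; the layer vocabulary, model sizes, and the names `TinyGPT`, `train_step`, and `reconstruct` are illustrative inventions, not the authors' implementation, and the evolutionary search loop named in the keywords is omitted.

```python
# Minimal sketch (assumed, not the authors' code) of the two ideas in the
# abstract: (1) train a GPT-style decoder to predict each layer of an
# architecture from the layers before it, and (2) use the trained model to
# "reconstruct" an architecture found by the search.
import torch
import torch.nn as nn

# Toy vocabulary: each token is one layer type; real encodings carry more detail.
LAYER_VOCAB = ["<bos>", "conv3x3", "conv5x5", "maxpool", "avgpool", "skip", "<eos>"]
VOCAB_SIZE = len(LAYER_VOCAB)


class TinyGPT(nn.Module):
    """A small decoder-only transformer over layer tokens."""

    def __init__(self, vocab, d_model=64, n_head=4, n_layer=2, max_len=64):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        block = nn.TransformerEncoderLayer(d_model, n_head, 4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(block, n_layer)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, x):                      # x: (B, T) layer-token ids
        t = x.size(1)
        h = self.tok(x) + self.pos(torch.arange(t, device=x.device))
        causal = nn.Transformer.generate_square_subsequent_mask(t).to(x.device)
        h = self.blocks(h, mask=causal)        # each position sees only its past
        return self.head(h)                    # (B, T, vocab) next-layer logits


def train_step(model, batch, opt, loss_fn):
    """One step of next-layer prediction: token t+1 predicted from tokens <= t."""
    logits = model(batch[:, :-1])
    loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), batch[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


@torch.no_grad()
def reconstruct(model, arch_ids, temperature=1.0):
    """Re-sample each layer from its prefix; one plausible reading of the
    reconstruction strategy (the abstract does not pin down the exact rule)."""
    out = [arch_ids[0]]                        # keep the <bos> token
    for _ in range(1, len(arch_ids)):
        logits = model(torch.tensor([out]))[0, -1] / temperature
        out.append(int(torch.multinomial(logits.softmax(-1), 1)))
    return [LAYER_VOCAB[i] for i in out]


if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyGPT(VOCAB_SIZE)
    opt = torch.optim.Adam(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss()
    # Stand-in "architecture dataset": random layer sequences framed by <bos>/<eos>.
    layers = torch.randint(1, VOCAB_SIZE - 1, (8, 10))
    bos = torch.zeros(8, 1, dtype=torch.long)
    eos = torch.full((8, 1), VOCAB_SIZE - 1, dtype=torch.long)
    batch = torch.cat([bos, layers, eos], dim=1)
    for _ in range(5):
        print("loss:", train_step(model, batch, opt, loss_fn))
    print("reconstructed:", reconstruct(model, batch[0].tolist()))
```

In this reading, reconstruction acts as a learned prior over layer sequences: candidates produced by the search are re-expressed as sequences the model considers likely, which is consistent with, but not guaranteed to match, the paper's actual strategy.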
Pages: 45-64 (20 pages)