MODEL SPIDER: Learning to Rank Pre-Trained Models Efficiently

Cited by: 0
Authors
Zhang, Yi-Kai [1 ]
Huang, Ting-Ji [1 ]
Ding, Yao-Xiang [2 ]
Zhan, De-Chuan [1 ]
Ye, Han-Jia [1 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou, Peoples R China
Funding
National Key R&D Program of China;
Keywords
DOI
N/A
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Figuring out which Pre-Trained Model (PTM) from a model zoo fits the target task is essential for taking advantage of plentiful model resources. With numerous heterogeneous PTMs available from diverse fields, efficiently selecting the most suitable one is challenging because carrying out forward or backward passes over all PTMs is prohibitively time-consuming. In this paper, we propose MODEL SPIDER, which tokenizes both PTMs and tasks by summarizing their characteristics into vectors to enable efficient PTM selection. By leveraging the approximated performance of PTMs on a separate set of training tasks, MODEL SPIDER learns to construct these representations and to measure the fitness score of a model-task pair from them. The ability to rank relevant PTMs higher than others generalizes to new tasks. With the top-ranked PTM candidates, we further learn to enrich task representations with their PTM-specific semantics to re-rank the PTMs for better selection. MODEL SPIDER balances efficiency and selection ability, making PTM selection like a spider preying on a web. MODEL SPIDER exhibits promising performance across diverse model zoos, including visual models and Large Language Models (LLMs). Code is available at https://github.com/zhangyikaii/Model-Spider.
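The abstract outlines a two-stage procedure: tokenize each PTM and the target task into vectors, score model-task fitness, rank all PTMs by that score, and then re-rank the top candidates using PTM-specific task semantics. Below is a minimal sketch of the first-stage ranking idea only; it is not the authors' implementation (see the linked repository for that), and every name, dimension, and the bilinear scorer here are illustrative assumptions.

```python
# Hypothetical sketch of ranking PTMs by a learned model-task fitness score.
# All names and choices (FitnessScorer, dim=128, bilinear form) are assumptions
# for illustration, not the MODEL SPIDER implementation.
import torch
import torch.nn as nn

class FitnessScorer(nn.Module):
    """Scores how well each PTM token vector fits a task token vector."""
    def __init__(self, dim: int = 128):
        super().__init__()
        # A bilinear form is one simple way to measure pairwise fitness.
        self.bilinear = nn.Bilinear(dim, dim, 1)

    def forward(self, model_tokens: torch.Tensor, task_token: torch.Tensor) -> torch.Tensor:
        # model_tokens: (num_ptms, dim); task_token: (dim,)
        task = task_token.expand_as(model_tokens)            # broadcast task to each PTM
        return self.bilinear(model_tokens, task).squeeze(-1)  # fitness per PTM: (num_ptms,)

# Rank a zoo of 10 hypothetical PTMs for one task.
scorer = FitnessScorer(dim=128)
model_tokens = torch.randn(10, 128)  # one learned token summarizing each PTM
task_token = torch.randn(128)        # learned token summarizing the target task
scores = scorer(model_tokens, task_token)
ranking = scores.argsort(descending=True)  # higher fitness is ranked first
print(ranking.tolist())
```

In training, such a scorer would be fit with a ranking objective against the approximated performance of PTMs on held-out training tasks, so that at test time new tasks need only their token, not forward passes through every PTM.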
Pages: 28
Related Papers
50 records
  • [31] Classification of Regional Food Using Pre-Trained Transfer Learning Models
    Gadhiya, Jeet
    Khatik, Anjali
    Kodinariya, Shruti
    Ramoliya, Dipak
    7th International Conference on Electronics, Communication and Aerospace Technology, ICECA 2023 - Proceedings, 2023, : 1237 - 1241
  • [32] Simple and Effective Multimodal Learning Based on Pre-Trained Transformer Models
    Miyazawa, Kazuki
    Kyuragi, Yuta
    Nagai, Takayuki
    IEEE ACCESS, 2022, 10 : 29821 - 29833
  • [33] Classification and Analysis of Pistachio Species with Pre-Trained Deep Learning Models
    Singh, Dilbag
    Taspinar, Yavuz Selim
    Kursun, Ramazan
    Cinar, Ilkay
    Koklu, Murat
    Ozkan, Ilker Ali
    Lee, Heung-No
    ELECTRONICS, 2022, 11 (07)
  • [34] Reinforced Curriculum Learning on Pre-Trained Neural Machine Translation Models
    Zhao, Mingjun
    Wu, Haijiang
    Niu, Di
    Wang, Xiaoli
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9652 - 9659
  • [35] Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models
    Zhang, Zijian
    Zhao, Zhou
    Lin, Zhijie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [36] Pre-trained deep learning models for brain MRI image classification
    Krishnapriya, Srigiri
    Karuna, Yepuganti
    FRONTIERS IN HUMAN NEUROSCIENCE, 2023, 17
  • [37] Federated Learning of Models Pre-Trained on Different Features with Consensus Graphs
    Ma, Tengfei
    Hoang, Trong Nghia
    Chen, Jie
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 1336 - 1346
  • [38] Learning Sample Difficulty from Pre-trained Models for Reliable Prediction
    Cui, Peng
    Zhang, Dan
    Deng, Zhijie
    Dong, Yinpeng
    Zhu, Jun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [39] Adaptive Textual Label Noise Learning based on Pre-trained Models
    Cheng, Shaohuan
    Chen, Wenyu
    Fu, Mingsheng
    Xie, Xuanting
    Qu, Hong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 3174 - 3188
  • [40] ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning
    Liu, Shangqing
    Wu, Bozhi
    Xie, Xiaofei
    Meng, Guozhu
    Liu, Yang
    2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 2476 - 2487