Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

Cited by: 29
Authors
Kolides, Adam [1 ]
Nawaz, Alyna [1 ]
Rathor, Anshu [1 ]
Beeman, Denzel [1 ]
Hashmi, Muzammil [1 ]
Fatima, Sana [1 ]
Berdik, David [1 ]
Al-Ayyoub, Mahmoud [2 ]
Jararweh, Yaser [1 ]
Affiliations
[1] Duquesne Univ, Pittsburgh, PA USA
[2] Jordan Univ Sci & Technol, Irbid, Jordan
Keywords
Pre-trained models; Self-supervised learning; Natural Language Processing; Computer vision; Image processing; Transformers; Machine learning models; Foundation models in robotics; Transfer learning; In-context learning; Self-attention; Fine-tuning;
DOI
10.1016/j.simpat.2023.102754
CLC number
TP39 [Computer applications];
Discipline codes
081203; 0835;
Abstract
With the emergence of foundation models (FMs), which are trained on large amounts of data at scale and are adaptable to a wide range of downstream applications, AI is undergoing a paradigm shift. BERT, T5, ChatGPT, GPT-3, Codex, DALL-E, Whisper, and CLIP now serve as the foundation for new applications ranging from computer vision to protein sequence analysis and from speech recognition to coding. Earlier models, by contrast, typically had to be trained from scratch for each new task. The capacity to experiment with, examine, and understand the capabilities and potential of next-generation FMs is critical to undertaking this research and guiding its path. Nevertheless, these models remain largely inaccessible: the resources required to train them are highly concentrated in industry, and even the assets (data, code) needed to replicate their training are frequently withheld because of their commercial value. At the moment, only large technology companies such as OpenAI, Google, Facebook, and Baidu can afford to build FMs. In this paper, we analyze and examine the main capabilities, key applications, technological fundamentals, and possible social consequences of these models. Despite the widely publicized adoption of FMs, we still lack a thorough understanding of how they operate, when they underperform, and what they are actually capable of, owing to their emergent properties. Addressing these problems, we believe, will require extensive multidisciplinary collaboration, given the essentially sociotechnical structure of FMs. Throughout this investigation, we must also contend with the problem of misrepresentation created by these systems. If FMs live up to their promise, AI could see far wider commercial use. As researchers studying the ramifications for society, we believe FMs will drive massive changes. They are closely managed for the time being, so we should have time to understand their implications before they become a major concern.
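The abstract's central point is that a single pre-trained foundation model can be adapted to many downstream tasks instead of being trained from scratch each time. The sketch below illustrates that pre-train/fine-tune paradigm using the Hugging Face Transformers library; it is not taken from the paper, and the checkpoint name, toy texts, labels, and hyperparameters are illustrative assumptions only.

```python
# Minimal sketch (assumed, not from the paper): adapting a pre-trained model
# to a new downstream task instead of training from scratch.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a publicly released pre-trained checkpoint; the classification head is
# newly initialized for the downstream task, the body keeps pre-trained weights.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Toy downstream data (binary labels) purely for illustration.
texts = ["the model adapts quickly to the new task",
         "training every model from scratch is wasteful"]
labels = torch.tensor([1, 0])

# Tokenize and run a single fine-tuning (transfer learning) step.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # returns cross-entropy loss on the new head
outputs.loss.backward()
optimizer.step()
print(f"fine-tuning loss after one step: {outputs.loss.item():.4f}")
```

In-context learning, also listed among the keywords, adapts the model without any gradient step at all: the frozen model is simply conditioned on task examples placed in the prompt.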
Pages: 18