AI Computing Systems for Large Language Models Training

Cited by: 0
Authors
Zhang, Zhen-Xing [1 ,2 ]
Wen, Yuan-Bo [2 ]
Lyu, Han-Qi [1 ,2 ,3 ]
Liu, Chang [3 ]
Zhang, Rui [2 ]
Li, Xia-Qing [2 ]
Wang, Chao [1 ]
Du, Zi-Dong [2 ,4 ]
Guo, Qi [2 ]
Li, Ling [5 ]
Zhou, Xue-Hai [1 ]
Chen, Yun-Ji [2 ,6 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230026, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100190, Peoples R China
[3] Cambricon Technol, Beijing 100191, Peoples R China
[4] Shanghai Innovat Ctr Processor Technol, Shanghai 201210, Peoples R China
[5] Chinese Acad Sci, Inst Software, Intelligent Software Res Ctr, Beijing 100190, Peoples R China
[6] Univ Chinese Acad Sci, Beijing 101408, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
artificial intelligence (AI) chip; large language model (LLM); AI computing system; accelerator; EFFICIENT;
DOI
10.1007/s11390-024-4178-1
CLC Number
TP3 (computing technology, computer technology)
Subject Classification Code
0812
Abstract
In this paper, we present a comprehensive overview of artificial intelligence (AI) computing systems for large language model (LLM) training. The rapid advancement of LLMs in recent years, coupled with the widespread adoption of models and applications such as BERT, ChatGPT, and DeepSeek, has sparked significant interest in this field. We classify LLMs into encoder-only, encoder-decoder, and decoder-only models, and briefly analyze their training and inference processes to emphasize their substantial demand for computational resources. These workloads depend heavily on AI-specific accelerators such as GPUs (graphics processing units), TPUs (tensor processing units), and MLUs (machine learning units). However, as the gap widens between the growing complexity of LLMs and the capabilities of current accelerators, it becomes essential to adopt heterogeneous computing systems optimized for distributed environments to meet the increasing computational and memory requirements of LLMs. We delve into the execution and scheduling of LLM algorithms, underlining the critical role of distributed computing strategies, memory management enhancements, and improvements in computational efficiency. This paper clarifies the complex relationship among algorithm design, hardware infrastructure, and software optimization, provides an in-depth understanding of both the software and hardware infrastructure supporting LLM training, and offers insights into the challenges and potential avenues for future development and deployment.
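One of the distributed training strategies the abstract alludes to, synchronous data parallelism, can be illustrated with a minimal sketch: each worker computes a gradient on its own shard of the data, an all-reduce step averages the gradients, and every replica applies the identical update. The worker count, toy least-squares loss, and learning rate below are illustrative assumptions, not details from the paper.

```python
# Minimal simulation of synchronous data-parallel training:
# each worker holds a full parameter copy, computes a gradient
# on its own data shard, and an all-reduce averages gradients
# so every replica performs the same update.

def local_gradient(params, shard):
    # Toy loss 0.5*(w*x - y)^2; gradient w.r.t. w averaged over the shard.
    g = [0.0] * len(params)
    for x, y in shard:
        for i, w in enumerate(params):
            g[i] += (w * x - y) * x / len(shard)
    return g

def all_reduce_mean(grads):
    # Average corresponding entries across workers (the role played
    # by NCCL/MPI all-reduce collectives in real training systems).
    n = len(grads)
    return [sum(g[i] for g in grads) / n for i in range(len(grads[0]))]

def train_step(params, shards, lr=0.1):
    grads = [local_gradient(params, s) for s in shards]  # parallel in practice
    avg = all_reduce_mean(grads)
    return [w - lr * gi for w, gi in zip(params, avg)]

if __name__ == "__main__":
    # Two workers, each with its own shard of (x, y) pairs from y = 2x.
    shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
    params = [0.0]
    for _ in range(200):
        params = train_step(params, shards)
    print(round(params[0], 3))  # converges toward 2.0
```

Tensor and pipeline parallelism, which the survey also covers, instead split the model itself across devices; only the gradient-averaging pattern of data parallelism is shown here.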
Pages: 6-41 (36 pages)
Related Papers (50 in total)
  • [41] Accelerating Contextualization in AI Large Language Models Using Vector Databases
    Bin Tareaf, Raad; AbuJarour, Mohammed; Engelman, Tom; Liermann, Philipp; Klotz, Jesse
    38TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN 2024, 2024: 316-321
  • [42] The Future of AI in Ovarian Cancer Research: The Large Language Models Perspective
    Laios, Alexandros; Theophilou, Georgios; De Jong, Diederick; Kalampokis, Evangelos
    CANCER CONTROL, 2023, 30
  • [43] Large Language Models Empowered Autonomous Edge AI for Connected Intelligence
    Shen, Yifei; Shao, Jiawei; Zhang, Xinjie; Lin, Zehong; Pan, Hao; Li, Dongsheng; Zhang, Jun; Letaief, Khaled B.
    IEEE COMMUNICATIONS MAGAZINE, 2024, 62 (10): 140-146
  • [44] Promptology: Enhancing Human-AI Interaction in Large Language Models
    Olla, Phillip; Elliott, Lauren; Abumeeiz, Mustafa; Mihelich, Karen; Olson, Joshua
    INFORMATION, 2024, 15 (10)
  • [45] The imperative for regulatory oversight of large language models (or generative AI) in healthcare
    Meskó, Bertalan; Topol, Eric J.
    NPJ DIGITAL MEDICINE, 6
  • [46] Generative AI and large language models in health care: pathways to implementation
    Raza, Marium M.; Venkatesh, Kaushik P.; Kvedar, Joseph C.
    NPJ DIGITAL MEDICINE, 2024, 7 (01)
  • [47] Embracing the AI revolution with open large language models in anatomy education
    Ray, Partha Pratim
    SURGICAL AND RADIOLOGIC ANATOMY, 2024, 46 (07): 949-950
  • [49] Getting pwn'd by AI: Penetration Testing with Large Language Models
    Happe, Andreas; Cito, Juergen
    PROCEEDINGS OF THE 31ST ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2023, 2023: 2082-2086
  • [50] AI am a rheumatologist: a practical primer to large language models for rheumatologists
    Venerito, Vincenzo; Bilgin, Emre; Iannone, Florenzo; Kiraz, Sedat
    RHEUMATOLOGY, 2023, 62 (10): 3256-3260