Forward Learning of Large Language Models by Consumer Devices

Cited by: 3
Authors
Pau, Danilo Pietro [1 ]
Aymone, Fabrizio Maria [1 ]
Affiliations
[1] STMicroelectronics, Syst Res & Applicat, Via C Olivetti 2, I-20864 Agrate Brianza, Italy
Keywords
on-device learning; backpropagation; forward learning; PEPITA; MEMPEPITA; Large Language Models; Natural Language Processing;
DOI
10.3390/electronics13020402
Chinese Library Classification (CLC) number
TP [automation technology, computer technology];
Discipline classification code
0812;
Abstract
Large Language Models achieve state-of-the-art performance on a broad variety of Natural Language Processing tasks. In the pervasive IoT era, their deployment on edge devices is more compelling than ever. However, their gigantic model footprint has hindered on-device learning applications, which enable AI models to continuously learn and adapt to changes over time. Backpropagation, used by the majority of deep learning frameworks, is computationally intensive and requires storing intermediate activations in memory to compute the weight updates. Recently, "forward-only" algorithms have been proposed as biologically plausible alternatives. By performing additional forward passes, this class of algorithms can remove the need to store intermediate activations and thus reduce memory usage with respect to more naive forward-only approaches, at the expense of increased computational complexity. This paper considered three Large Language Models: DistilBERT, GPT-3 Small, and AlexaTM. It quantitatively investigated the improvements in memory usage and computational complexity brought by the known approaches PEPITA and MEMPEPITA with respect to backpropagation. For a low number of tokens in context, and depending on the model, PEPITA marginally increases or substantially reduces the number of arithmetic operations; for a large number of tokens in context, PEPITA reduces computational complexity by 30% to 50%. MEMPEPITA increases PEPITA's complexity by one third. Regarding memory, PEPITA and backpropagation require a comparable amount of memory to store activations, while MEMPEPITA reduces it by 50% to 94%, with the benefits being more evident for architectures with a long sequence of blocks. In various real-case scenarios, MEMPEPITA's memory reduction was essential to meet the tight memory requirements of edge consumer devices equipped with 128 MB of memory, which are commonly available as smartphone and industrial-application multiprocessors.
Pages: 13
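The abstract summarizes PEPITA, a forward-only training rule that replaces the backward pass with a second, error-modulated forward pass, and MEMPEPITA, a variant that recomputes activations instead of storing them. As a minimal illustration, and not code from the paper, the sketch below implements a PEPITA-style update for a tiny two-layer MLP in NumPy; the layer sizes, learning rate, and scale of the fixed random error-projection matrix F are illustrative assumptions.

```python
# Minimal sketch of a PEPITA-style update (Dellaferrera & Kreiman, 2022) for a
# two-layer MLP. Hyperparameters below are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 784, 256, 10, 0.01

W1 = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_hid, n_in))
W2 = rng.normal(0.0, np.sqrt(2.0 / n_hid), size=(n_out, n_hid))
F = 0.05 * rng.uniform(-1.0, 1.0, size=(n_in, n_out))  # fixed error-projection matrix

def relu(z):
    return np.maximum(z, 0.0)

def pepita_step(x, target):
    """One PEPITA training step: two forward passes, no backward pass."""
    global W1, W2
    # 1) Standard forward pass on the clean input.
    h1 = relu(W1 @ x)
    y = W2 @ h1
    e = y - target                              # output error

    # 2) Second forward pass on the error-modulated input.
    x_err = x + F @ e
    h1_err = relu(W1 @ x_err)

    # 3) Local, forward-only weight updates.
    W1 -= lr * np.outer(h1 - h1_err, x_err)     # hidden layer: activation difference
    W2 -= lr * np.outer(e, h1_err)              # output layer: delta-rule-like update
    return e

# Toy usage: one random sample with a one-hot target.
x = rng.normal(size=n_in)
t = np.zeros(n_out); t[3] = 1.0
err = pepita_step(x, t)
```

Note that the standard-pass activations (h1 here) must still be kept until the update, which is consistent with the abstract's statement that PEPITA and backpropagation need comparable activation memory; MEMPEPITA instead recomputes those activations with a further forward pass, matching the reported one-third increase in compute in exchange for the 50% to 94% reduction in activation storage.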