Stochastic Markov gradient descent and training low-bit neural networks

Cited by: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method for training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical performance guarantees for the algorithm as well as encouraging numerical results.
Pages: 23
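The abstract names SMGD but this record carries no further detail of the update rule. As a rough, non-authoritative illustration of the kind of method the abstract describes, the sketch below keeps every weight on a fixed quantization grid and moves it one grid step against its gradient with probability proportional to the gradient magnitude, so the expected step matches an ordinary SGD step while the iterate itself stays low-bit. The specific transition rule, function names, and parameters here are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def smgd_step(w, grad, lr, delta, w_min, w_max, rng):
    """One illustrative SMGD-style update (assumed form, not the paper's exact rule).

    Weights live on the grid {w_min, w_min + delta, ..., w_max}. Each weight
    jumps one grid step against its gradient with probability
    min(1, lr * |grad| / delta), so the *expected* move equals the usual
    SGD step -lr * grad while the stored iterate stays on the low-bit grid.
    """
    p = np.minimum(1.0, lr * np.abs(grad) / delta)    # transition probability
    move = (rng.random(w.shape) < p).astype(w.dtype)  # Markov transition: jump or stay
    w_new = w - np.sign(grad) * delta * move          # one grid step downhill
    return np.clip(w_new, w_min, w_max)               # stay on the bounded grid

# Toy usage: minimize f(w) = ||w - t||^2 / 2 over a 3-bit grid on [-1, 1].
rng = np.random.default_rng(0)
delta = 2.0 / 7  # 8 grid levels on [-1, 1] -> 3 bits per weight
w = -1.0 + delta * np.round((rng.uniform(-1, 1, size=5) + 1.0) / delta)  # snap init to grid
t = np.array([0.9, -0.3, 0.1, -0.7, 0.5])
for _ in range(200):
    w = smgd_step(w, grad=w - t, lr=0.1, delta=delta,
                  w_min=-1.0, w_max=1.0, rng=rng)
print(w)  # each entry settles near the grid point closest to t
```

Note the memory angle suggested by the keywords: because the iterate is never stored at full precision, only the low-bit weights and the stochastic transitions need to be kept during training, whereas standard quantization-aware training keeps a float copy of every weight.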