Stochastic Markov gradient descent and training low-bit neural networks

Cited by: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method for training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical performance guarantees for the algorithm as well as encouraging numerical results.
Pages: 23
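The abstract names SMGD but this record carries no further detail of the update rule. As a rough, non-authoritative illustration of the kind of method the abstract describes, the sketch below keeps every weight on a fixed quantization grid and moves it one grid step against its gradient with probability proportional to the gradient magnitude, so the expected step matches an ordinary SGD step while the iterate itself stays low-bit. The specific transition rule, function names, and parameters here are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def smgd_step(w, grad, lr, delta, w_min, w_max, rng):
    """One illustrative SMGD-style update (assumed form, not the paper's exact rule).

    Weights live on the grid {w_min, w_min + delta, ..., w_max}. Each weight
    jumps one grid step against its gradient with probability
    min(1, lr * |grad| / delta), so the *expected* move equals the usual
    SGD step -lr * grad while the stored iterate stays on the low-bit grid.
    """
    p = np.minimum(1.0, lr * np.abs(grad) / delta)    # transition probability
    move = (rng.random(w.shape) < p).astype(w.dtype)  # Markov transition: jump or stay
    w_new = w - np.sign(grad) * delta * move          # one grid step downhill
    return np.clip(w_new, w_min, w_max)               # stay on the bounded grid

# Toy usage: minimize f(w) = ||w - t||^2 / 2 over a 3-bit grid on [-1, 1].
rng = np.random.default_rng(0)
delta = 2.0 / 7  # 8 grid levels on [-1, 1] -> 3 bits per weight
w = -1.0 + delta * np.round((rng.uniform(-1, 1, size=5) + 1.0) / delta)  # snap init to grid
t = np.array([0.9, -0.3, 0.1, -0.7, 0.5])
for _ in range(200):
    w = smgd_step(w, grad=w - t, lr=0.1, delta=delta,
                  w_min=-1.0, w_max=1.0, rng=rng)
print(w)  # each entry settles near the grid point closest to t
```

Note the memory angle suggested by the keywords: because the iterate is never stored at full precision, only the low-bit weights and the stochastic transitions need to be kept during training, whereas standard quantization-aware training keeps a float copy of every weight.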