Stochastic Markov gradient descent and training low-bit neural networks

Times Cited: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
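The abstract names the SMGD update but does not spell it out here. As a rough illustration only, the following NumPy sketch shows one stochastic, grid-respecting step consistent with the description above: each weight lives on a quantization grid and moves by a single grid step against its gradient with a probability scaled by the gradient magnitude, so that the step agrees with ordinary SGD in expectation. The function name smgd_step, the grid spacing delta, and the probability rule min(1, lr*|grad|/delta) are assumptions made for this sketch, not necessarily the authors' exact algorithm.

import numpy as np

def smgd_step(w, grad, lr, delta, rng):
    """Illustrative SMGD-style update (assumed rule, not taken from the paper).

    Weights stay on the grid delta * Z. Each coordinate moves one grid
    step against its gradient with probability min(1, lr * |grad| / delta),
    so the expected move equals the SGD step -lr * grad whenever
    lr * |grad| <= delta.
    """
    p = np.minimum(1.0, lr * np.abs(grad) / delta)  # per-weight move probability
    move = rng.random(w.shape) < p                  # Bernoulli draw for each weight
    return w - delta * np.sign(grad) * move         # result stays on the grid

# Example: project random weights onto a coarse grid and take one step.
rng = np.random.default_rng(0)
delta = 2.0 ** -3
w = delta * np.round(rng.standard_normal(5) / delta)
grad = rng.standard_normal(5)
w = smgd_step(w, grad, lr=0.05, delta=delta, rng=rng)

Because every iterate stays on the grid, weights can be stored in low-bit form throughout training, which matches the memory-constrained setting the abstract emphasizes.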
Pages: 23
Related Papers
50 records in total
  • [31] An Optimized Design Technique of Low-bit Neural Network Training for Personalization on IoT Devices
    Choi, Seungkyu
    Shin, Jaekang
    Choi, Yeongjae
    Kim, Lee-Sup
PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019
  • [32] Design Space Exploration of Low-Bit Quantized Neural Networks for Visual Place Recognition
    Grainge, Oliver
    Milford, Michael
    Bodala, Indu
    Ramchurn, Sarvapali D.
    Ehsan, Shoaib
IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (06): 5070 - 5077
  • [33] Training Neural Networks Using Predictor-Corrector Gradient Descent
    Nesky, Amy
    Stout, Quentin F.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT III, 2018, 11141 : 62 - 72
  • [34] Training Morphological Neural Networks with Gradient Descent: Some Theoretical Insights
    Blusseau, Samy
    DISCRETE GEOMETRY AND MATHEMATICAL MORPHOLOGY, DGMM 2024, 2024, 14605 : 229 - 241
  • [35] A Solver + Gradient Descent Training Algorithm for Deep Neural Networks
    Ashok, Dhananjay
    Nagisetty, Vineel
    Srinivasa, Christopher
    Ganesh, Vijay
PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022: 1766 - 1773
  • [36] HSB-GDM: a Hybrid Stochastic-Binary Circuit for Gradient Descent with Momentum in the Training of Neural Networks
    Li, Han
    Shi, Heng
    Jiang, Honglan
    Liu, Siting
PROCEEDINGS OF THE 17TH ACM INTERNATIONAL SYMPOSIUM ON NANOSCALE ARCHITECTURES, NANOARCH 2022, 2022
  • [37] A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
    Jentzen, Arnulf
    Riekert, Adrian
    ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND PHYSIK, 2022, 73 (05)
  • [39] Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
    Gong, Ruihao
    Liu, Xianglong
    Jiang, Shenghu
    Li, Tianxiang
    Hu, Peng
    Lin, Jiazhen
    Yu, Fengwei
    Yan, Junjie
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 4851 - 4860
  • [40] On the Decentralized Stochastic Gradient Descent With Markov Chain Sampling
    Sun, Tao
    Li, Dongsheng
    Wang, Bao
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, 71 : 2895 - 2909