Stochastic Markov gradient descent and training low-bit neural networks

Cited: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
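The abstract outlines the idea but not the update rule, so the following is a minimal NumPy sketch of what one stochastic Markov step on a quantization grid could look like: weights are stored only as integer multiples of a grid spacing, and each coordinate hops by one grid point with probability proportional to its gradient, so the move matches plain SGD in expectation while every iterate stays low-precision. The function name `smgd_step`, the spacing `delta`, and the probability rule `min(lr*|g|/delta, 1)` are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def smgd_step(w, grad, lr=0.05, delta=0.05, rng=None):
    """One hypothetical SMGD-style update (sketch, not the paper's rule).

    Weights stay on the grid delta * Z; each coordinate moves at most one
    grid point per step, opposite the gradient sign, with probability
    proportional to the gradient magnitude. The next state depends only on
    the current one, giving a Markov chain over low-precision weights.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Probability of taking a single grid step, capped at 1 (assumed rule).
    p = np.minimum(lr * np.abs(grad) / delta, 1.0)
    move = rng.random(w.shape) < p
    return w - delta * np.sign(grad) * move

# Toy usage: minimize f(w) = ||w - target||^2 / 2 over grid-valued weights.
rng = np.random.default_rng(0)
target = np.array([0.30, -0.15, 0.45])
w = np.zeros(3)  # starts on the grid; every iterate remains a multiple of delta
for _ in range(500):
    grad = w - target  # gradient of the toy quadratic loss
    w = smgd_step(w, grad, lr=0.05, delta=0.05, rng=rng)
print(w)  # close to target, rounded to the grid
```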
Pages: 23