Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training

Cited: 3
Authors
Hascoet, Tristan [1 ]
Febvre, Quentin [2 ]
Zhuang, Weihao [1 ]
Ariki, Yasuo [1 ]
Takiguchi, Tetsuya [1]
Affiliations
[1] Kobe Univ, Kobe, Hyogo, Japan
[2] Sicara, Paris, France
DOI
10.1109/ICCVW.2019.00258
Chinese Library Classification
TP18 [artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNN) have demonstrated state-of-the-art results on various computer vision problems. However, training CNNs requires specialized GPUs with large memory. GPU memory has been a major bottleneck of the CNN training procedure, limiting the size of both inputs and model architectures. Given the ubiquity of CNNs in computer vision, optimizing the memory consumption of CNN training would have widespread practical benefits. Recently, reversible neural networks have been proposed to alleviate this memory bottleneck by recomputing hidden activations through inverse operations during the backward pass of the backpropagation algorithm. In this paper, we push this idea to the extreme and design a reversible neural network with minimal training memory consumption. Our results demonstrate that we can train a model on the CIFAR-10 dataset using an Nvidia GTX750 GPU with only 1GB of memory, reaching 93% accuracy within 67 minutes.
Pages: 2049-2052
Page count: 4
Related Papers
50 items in total
  • [1] Reversible designs for extreme memory cost reduction of CNN training
    Hascoet, Tristan
    Febvre, Quentin
    Zhuang, Weihao
    Ariki, Yasuo
    Takiguchi, Tetsuya
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2023, 2023 (01)
  • [2] High-dimensional neural feature design for layer-wise reduction of training cost
    Javid, Alireza M.
    Venkitaraman, Arun
    Skoglund, Mikael
    Chatterjee, Saikat
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2020, 2020 (01)
  • [3] Greedy Layer-Wise Training of Long Short Term Memory Networks
    Xu, Kaisheng
    Shen, Xu
    Yao, Ting
    Tian, Xinmei
    Mei, Tao
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018
  • [4] Layer-Wise Data-Free CNN Compression
    Horton, Maxwell
    Jin, Yanzi
    Farhadi, Ali
    Rastegari, Mohammad
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022 : 2019 - 2026
  • [5] LCRM: Layer-Wise Complexity Reduction Method for CNN Model Optimization on End Devices
    Hussain, Hanan
    Tamizharasan, P. S.
    Yadav, Praveen Kumar
    IEEE ACCESS, 2023, 11 : 66838 - 66857
  • [6] SPSA for Layer-Wise Training of Deep Networks
    Wulff, Benjamin
    Schuecker, Jannis
    Bauckhage, Christian
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT III, 2018, 11141 : 564 - 573
  • [7] A Layer-Wise Extreme Network Compression for Super Resolution
    Hwang, Jiwon
    Uddin, A. F. M. Shahab
    Bae, Sung-Ho
    IEEE ACCESS, 2021, 9 : 93998 - 94009
  • [8] eXtreme Federated Learning (XFL): a layer-wise approach
    El Mokadem, Rachid
    Ben Maissa, Yann
    El Akkaoui, Zineb
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (05) : 5741 - 5754