Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training

Cited by: 3
Authors
Hascoet, Tristan [1 ]
Febvre, Quentin [2 ]
Zhuang, Weihao [1 ]
Ariki, Yasuo [1 ]
Takiguchi, Tetsuya [1]
Affiliations
[1] Kobe Univ, Kobe, Hyogo, Japan
[2] Sicara, Paris, France
Keywords
DOI
10.1109/ICCVW.2019.00258
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art results on various computer vision problems. However, training CNNs requires specialized GPUs with large memory. GPU memory has been a major bottleneck of the CNN training procedure, limiting the size of both inputs and model architectures. Given the ubiquity of CNNs in computer vision, optimizing the memory consumption of CNN training would have widespread practical benefits. Recently, reversible neural networks have been proposed to alleviate this memory bottleneck by recomputing hidden activations through inverse operations during the backward pass of the backpropagation algorithm. In this paper, we push this idea to the extreme and design a reversible neural network with minimal training memory consumption. The results demonstrate that we can train a model on the CIFAR-10 dataset using an Nvidia GTX750 GPU with only 1GB of memory, reaching 93% accuracy within 67 minutes.
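To illustrate the layer-wise invertibility idea the abstract describes, the sketch below shows an additive-coupling (RevNet-style) block in PyTorch whose inputs can be exactly recomputed from its outputs, so hidden activations need not be stored for the backward pass. This is a minimal sketch under assumed names (ReversibleBlock, f, g), not the authors' implementation, and it omits the custom backward pass needed to actually realize the memory savings during training.

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """Additive coupling block: (x1, x2) -> (y1, y2) with an exact inverse,
    so inputs can be recomputed from outputs instead of being stored.
    Illustrative sketch only, not the paper's architecture."""

    def __init__(self, channels: int):
        super().__init__()
        # F and G are arbitrary sub-networks; they need not be invertible themselves.
        self.f = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.g = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    @torch.no_grad()
    def inverse(self, y1, y2):
        # Recompute the block inputs from its outputs during the backward pass.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

if __name__ == "__main__":
    block = ReversibleBlock(channels=8)
    x1, x2 = torch.randn(2, 8, 32, 32), torch.randn(2, 8, 32, 32)
    y1, y2 = block(x1, x2)
    rx1, rx2 = block.inverse(y1, y2)
    # The inputs are recovered up to floating-point error.
    print(torch.allclose(rx1, x1, atol=1e-5), torch.allclose(rx2, x2, atol=1e-5))
```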
Pages: 2049-2052
Page count: 4
Related Papers
50 records in total
  • [31] Filtering-based Layer-wise Parameter Update Method for Training a Neural Network
    Ji, Siyu
    Zhai, Kaikai
    Wen, Chenglin
    2018 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2018, : 389 - 394
  • [32] Accelerating On-Device Learning with Layer-Wise Processor Selection Method on Unified Memory
    Ha, Donghee
    Kim, Mooseop
    Moon, KyeongDeok
    Jeong, Chi Yoon
    SENSORS, 2021, 21 (07)
  • [33] SLO-Aware Function Placement for Serverless Workflows With Layer-Wise Memory Sharing
    Cheng, Dazhao
    Yan, Kai
    Cai, Xinquan
    Gong, Yili
    Hu, Chuang
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (06) : 919 - 936
  • [34] Potential Layer-Wise Supervised Learning for Training Multi-Layered Neural Networks
    Kamimura, Ryotaro
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 2568 - 2575
  • [35] Push-pull separability objective for supervised layer-wise training of neural networks
    Szymanski, Lech
    McCane, Brendan
    2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
  • [36] LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness
    Xu, Xiaojun
    Li, Linyi
    Li, Bo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [37] Post-training deep neural network pruning via layer-wise calibration
    Lazarevich, Ivan
    Kozlov, Alexander
    Malinin, Nikita
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 798 - 805
  • [38] Layer-Wise De-Training and Re-Training for ConvS2S Machine Translation
    Yu, Hongfei
    Zhou, Xiaoqing
    Duan, Xiangyu
    Zhang, Min
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (02)
  • [39] GLNAS: Greedy Layer-wise Network Architecture Search for low cost and fast network generation
    Ho, Jiacang
    Park, Kyongseok
    Kang, Dae-Ki
    PATTERN RECOGNITION, 2024, 155
  • [40] LSMQ: A Layer-Wise Sensitivity-Based Mixed-Precision Quantization Method for Bit-Flexible CNN Accelerator
    Huang, Yimin
    Chen, Kai
    Shao, Zhuang
    Bai, Yichuan
    Huang, Yafeng
    Du, Yuan
    Du, Li
    Wang, Zhongfeng
    18TH INTERNATIONAL SOC DESIGN CONFERENCE 2021 (ISOCC 2021), 2021, : 256 - 257