Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training

Cited: 3
Authors
Hascoet, Tristan [1 ]
Febvre, Quentin [2 ]
Zhuang, Weihao [1 ]
Ariki, Yasuo [1 ]
Takiguchi, Tetsuya [1]
Affiliations
[1] Kobe Univ, Kobe, Hyogo, Japan
[2] Sicara, Paris, France
DOI
10.1109/ICCVW.2019.00258
Chinese Library Classification
TP18 [artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNN) have demonstrated state-of-the-art results on various computer vision problems. However, training CNNs requires specialized GPUs with large memory. GPU memory has been a major bottleneck of the CNN training procedure, limiting the size of both inputs and model architectures. Given the ubiquity of CNNs in computer vision, optimizing the memory consumption of CNN training would have widespread practical benefits. Recently, reversible neural networks have been proposed to alleviate this memory bottleneck by recomputing hidden activations through inverse operations during the backward pass of the backpropagation algorithm. In this paper, we push this idea to the extreme and design a reversible neural network with minimal training memory consumption. Our results demonstrate that we can train a model on the CIFAR-10 dataset using an Nvidia GTX750 GPU with only 1GB of memory, reaching 93% accuracy within 67 minutes.
Pages: 2049-2052
Page count: 4
Related Papers
50 items in total
  • [1] Reversible designs for extreme memory cost reduction of CNN training
    Hascoet, Tristan
    Febvre, Quentin
    Zhuang, Weihao
    Ariki, Yasuo
    Takiguchi, Tetsuya
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2023, 2023 (01)
  • [2] High-dimensional neural feature design for layer-wise reduction of training cost
    Javid, Alireza M.
    Venkitaraman, Arun
    Skoglund, Mikael
    Chatterjee, Saikat
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2020, 2020 (01)
  • [3] Greedy Layer-Wise Training of Long Short Term Memory Networks
    Xu, Kaisheng
    Shen, Xu
    Yao, Ting
    Tian, Xinmei
    Mei, Tao
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018
  • [4] Layer-Wise Data-Free CNN Compression
    Horton, Maxwell
    Jin, Yanzi
    Farhadi, Ali
    Rastegari, Mohammad
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022 : 2019 - 2026
  • [5] LCRM: Layer-Wise Complexity Reduction Method for CNN Model Optimization on End Devices
    Hussain, Hanan
    Tamizharasan, P. S.
    Yadav, Praveen Kumar
    IEEE ACCESS, 2023, 11 : 66838 - 66857
  • [6] SPSA for Layer-Wise Training of Deep Networks
    Wulff, Benjamin
    Schuecker, Jannis
    Bauckhage, Christian
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT III, 2018, 11141 : 564 - 573
  • [7] A Layer-Wise Extreme Network Compression for Super Resolution
    Hwang, Jiwon
    Uddin, A. F. M. Shahab
    Bae, Sung-Ho
    IEEE ACCESS, 2021, 9 : 93998 - 94009
  • [8] eXtreme Federated Learning (XFL): a layer-wise approach
    El Mokadem, Rachid
    Ben Maissa, Yann
    El Akkaoui, Zineb
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (05) : 5741 - 5754