Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training

Cited by: 3
Authors
Hascoet, Tristan [1 ]
Febvre, Quentin [2 ]
Zhuang, Weihao [1 ]
Ariki, Yasuo [1 ]
Takiguchi, Tetsuya [1 ]
Affiliations
[1] Kobe Univ, Kobe, Hyogo, Japan
[2] Sicara, Paris, France
DOI
10.1109/ICCVW.2019.00258
CLC Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art results on various computer vision problems. However, training CNNs requires specialized GPUs with large memory. GPU memory has been a major bottleneck of the CNN training procedure, limiting the size of both inputs and model architectures. Given the ubiquity of CNNs in computer vision, optimizing the memory consumption of CNN training would have widespread practical benefits. Recently, reversible neural networks have been proposed to alleviate this memory bottleneck by recomputing hidden activations through inverse operations during the backward pass of the backpropagation algorithm. In this paper, we push this idea to the extreme and design a reversible neural network with minimal training memory consumption. The results demonstrate that we can train on the CIFAR10 dataset using an Nvidia GTX750 GPU with only 1GB of memory and achieve 93% accuracy within 67 minutes.
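The memory-saving mechanism described in the abstract can be illustrated with a minimal sketch of a reversible (RevNet-style) coupling block: because the inputs can be reconstructed exactly from the outputs, hidden activations need not be stored during the forward pass. The functions `f` and `g` below are hypothetical stand-ins for arbitrary sub-networks; this is an illustration of the general technique, not the paper's specific architecture.

```python
import numpy as np

def f(x):
    # stand-in for an arbitrary sub-network (e.g. conv + ReLU)
    return np.maximum(0.0, 1.5 * x - 0.3)

def g(x):
    # a second arbitrary sub-network
    return np.tanh(x)

def forward(x1, x2):
    # additive coupling: each half is updated using the other half,
    # so the update is trivially invertible
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def inverse(y1, y2):
    # recompute the inputs from the outputs during the backward pass,
    # instead of keeping them in GPU memory
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(4), rng.standard_normal(4)
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(np.allclose(r1, x1) and np.allclose(r2, x2))  # prints True
```

Note that `f` and `g` themselves need not be invertible; only the coupling structure is, which is what lets such blocks trade a second forward computation for activation storage.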
Pages: 2049-2052
Page count: 4
Related Papers (50 total)
  • [21] Live Demonstration: Layer-wise Configurable CNN Accelerator with High PE Utilization
    Park, Chunmyung
    Hyun, Eunjae
    Kim, Jicheon
    Nguyen, Xuan Truong
    Lee, Hyuk-Jae
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [22] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
    Zhou, Yefan
    Pang, Tianyu
    Liu, Keqin
    Martin, Charles H.
    Mahoney, Michael W.
    Yang, Yaoqing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [23] Using Layer-Wise Training for Road Semantic Segmentation in Autonomous Cars
    Shashaani, Shahrzad
    Teshnehlab, Mohammad
    Khodadadian, Amirreza
    Parvizi, Maryam
    Wick, Thomas
    Noii, Nima
    IEEE ACCESS, 2023, 11 : 46320 - 46329
  • [24] Layer-Wise Sparse Training of Transformer via Convolutional Flood Filling
    Yoon, Bokyeong
    Han, Yoonsang
    Moon, Gordon Euhyun
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II, PAKDD 2024, 2024, 14646 : 158 - 170
  • [25] Dataflow optimization with layer-wise design variables estimation method for enflame CNN accelerators
    Chen, Tian
    Tan, Yu-an
    Zhang, Zheng
    Luo, Nan
    Li, Bin
    Li, Yuanzhang
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 189
  • [26] Superpixel-Guided Layer-Wise Embedding CNN for Remote Sensing Image Classification
    Liu, Han
    Li, Jun
    He, Lin
    Wang, Yu
    REMOTE SENSING, 2019, 11 (02)
  • [27] Layer-wise regularized adversarial training using layers sustainability analysis framework
    Khalooei, Mohammad
    Homayounpour, Mohammad Mehdi
    Amirmazlaghani, Maryam
    NEUROCOMPUTING, 2023, 540
  • [28] Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training
    Chen, Ling-Hui
    Ling, Zhen-Hua
    Liu, Li-Juan
    Dai, Li-Rong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 1859 - 1872
  • [29] Supervised Greedy Layer-Wise Training for Deep Convolutional Networks with Small Datasets
    Rueda-Plata, Diego
    Ramos-Pollan, Raul
    Gonzalez, Fabio A.
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2015), PT I, 2015, 9329 : 275 - 284
  • [30] Layer-wise Pre-training Mechanism Based on Neural Network for Epilepsy Detection
    Lin, Zichao
    Gu, Zhenghui
    Li, Yinghao
    Yu, Zhuliang
    Li, Yuanqing
    2020 12TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2020, : 224 - 227