Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training

Cited by: 3
Authors
Hascoet, Tristan [1 ]
Febvre, Quentin [2 ]
Zhuang, Weihao [1 ]
Ariki, Yasuo [1 ]
Takiguchi, Tetsuya [1]
Affiliations
[1] Kobe Univ, Kobe, Hyogo, Japan
[2] Sicara, Paris, France
Keywords
DOI
10.1109/ICCVW.2019.00258
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art results on various computer vision problems. However, training CNNs requires specialized GPUs with large memory. GPU memory has been a major bottleneck of the CNN training procedure, limiting the size of both inputs and model architectures. Given the ubiquity of CNNs in computer vision, optimizing the memory consumption of CNN training would have widespread practical benefits. Recently, reversible neural networks have been proposed to alleviate this memory bottleneck by recomputing hidden activations through inverse operations during the backward pass of the backpropagation algorithm. In this paper, we push this idea to the extreme and design a reversible neural network with minimal training memory consumption. The results demonstrate that we can train a model on the CIFAR-10 dataset using an Nvidia GTX750 GPU with only 1GB of memory, reaching 93% accuracy within 67 minutes.
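To illustrate the layer-wise invertibility idea the abstract describes, the sketch below shows an additive-coupling (RevNet-style) block in PyTorch whose inputs can be exactly recomputed from its outputs, so hidden activations need not be stored for the backward pass. This is a minimal sketch under assumed names (ReversibleBlock, f, g), not the authors' implementation, and it omits the custom backward pass needed to actually realize the memory savings during training.

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """Additive coupling block: (x1, x2) -> (y1, y2) with an exact inverse,
    so inputs can be recomputed from outputs instead of being stored.
    Illustrative sketch only, not the paper's architecture."""

    def __init__(self, channels: int):
        super().__init__()
        # F and G are arbitrary sub-networks; they need not be invertible themselves.
        self.f = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.g = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    @torch.no_grad()
    def inverse(self, y1, y2):
        # Recompute the block inputs from its outputs during the backward pass.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

if __name__ == "__main__":
    block = ReversibleBlock(channels=8)
    x1, x2 = torch.randn(2, 8, 32, 32), torch.randn(2, 8, 32, 32)
    y1, y2 = block(x1, x2)
    rx1, rx2 = block.inverse(y1, y2)
    # The inputs are recovered up to floating-point error.
    print(torch.allclose(rx1, x1, atol=1e-5), torch.allclose(rx2, x2, atol=1e-5))
```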
Pages: 2049-2052
Page count: 4
Related Papers
50 records in total
  • [31] Filtering-based Layer-wise Parameter Update Method for Training a Neural Network
    Ji, Siyu
    Zhai, Kaikai
    Wen, Chenglin
    2018 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2018, : 389 - 394
  • [32] Accelerating On-Device Learning with Layer-Wise Processor Selection Method on Unified Memory
    Ha, Donghee
    Kim, Mooseop
    Moon, KyeongDeok
    Jeong, Chi Yoon
    SENSORS, 2021, 21 (07)
  • [33] SLO-Aware Function Placement for Serverless Workflows With Layer-Wise Memory Sharing
    Cheng, Dazhao
    Yan, Kai
    Cai, Xinquan
    Gong, Yili
    Hu, Chuang
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (06) : 919 - 936
  • [34] Potential Layer-Wise Supervised Learning for Training Multi-Layered Neural Networks
    Kamimura, Ryotaro
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 2568 - 2575
  • [35] Push-pull separability objective for supervised layer-wise training of neural networks
    Szymanski, Lech
    McCane, Brendan
    2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
  • [36] LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness
    Xu, Xiaojun
    Li, Linyi
    Li, Bo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [37] Post-training deep neural network pruning via layer-wise calibration
    Lazarevich, Ivan
    Kozlov, Alexander
    Malinin, Nikita
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 798 - 805
  • [38] Layer-Wise De-Training and Re-Training for ConvS2S Machine Translation
    Yu, Hongfei
    Zhou, Xiaoqing
    Duan, Xiangyu
    Zhang, Min
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (02)
  • [39] GLNAS: Greedy Layer-wise Network Architecture Search for low cost and fast network generation
    Ho, Jiacang
    Park, Kyongseok
    Kang, Dae-Ki
    PATTERN RECOGNITION, 2024, 155
  • [40] LSMQ: A Layer-Wise Sensitivity-Based Mixed-Precision Quantization Method for Bit-Flexible CNN Accelerator
    Huang, Yimin
    Chen, Kai
    Shao, Zhuang
    Bai, Yichuan
    Huang, Yafeng
    Du, Yuan
    Du, Li
    Wang, Zhongfeng
    18TH INTERNATIONAL SOC DESIGN CONFERENCE 2021 (ISOCC 2021), 2021, : 256 - 257