Exploring Strategies for Training Deep Neural Networks

Cited by: 0
Authors
Larochelle, Hugo [1 ]
Bengio, Yoshua [1 ]
Louradour, Jerome [1 ]
Lamblin, Pascal [1 ]
Affiliations
[1] Univ Montreal, Dept Informat & Rech Operat, Montreal, PQ H3T 1J8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
artificial neural networks; deep belief networks; restricted Boltzmann machines; autoassociators; unsupervised learning; COMPONENT ANALYSIS; BLIND SEPARATION; DIMENSIONALITY; ALGORITHM;
DOI
Not available
CLC Number (Chinese Library Classification)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization often appears to get stuck in poor solutions. Hinton et al. recently proposed a greedy layer-wise unsupervised learning procedure relying on the training algorithm of restricted Boltzmann machines (RBM) to initialize the parameters of a deep belief network (DBN), a generative model with many layers of hidden causal variables. This was followed by the proposal of another greedy layer-wise procedure, relying on the usage of autoassociator networks. In the context of the above optimization problem, we study these algorithms empirically to better understand their success. Our experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy helps the optimization by initializing weights in a region near a good local minimum, but also implicitly acts as a sort of regularization that brings better generalization and encourages internal distributed representations that are high-level abstractions of the input. We also present a series of experiments aimed at evaluating the link between the performance of deep neural networks and practical aspects of their topology, for example, demonstrating cases where the addition of more depth helps. Finally, we empirically explore simple variants of these training algorithms, such as the use of different RBM input unit distributions, a simple way of combining gradient estimators to improve performance, as well as on-line versions of those algorithms.
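The abstract describes greedy layer-wise unsupervised pre-training, in which each restricted Boltzmann machine (RBM) is trained on the representation produced by the layer below it, and the resulting weights initialize a deep network. The following is a minimal NumPy sketch of that idea, assuming binary units and one-step contrastive divergence (CD-1); the layer sizes, learning rate, and toy data are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal sketch of greedy layer-wise pre-training with RBMs.
# Illustrative reconstruction only: hyperparameters and the CD-1
# update schedule are assumptions, not the paper's configuration.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann machine with binary units, trained by CD-1."""
    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0):
        # Positive phase: hidden activations given the data.
        h0_prob = self.hidden_probs(v0)
        h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
        # Negative phase: one step of Gibbs sampling.
        v1_prob = self.visible_probs(h0)
        h1_prob = self.hidden_probs(v1_prob)
        # Approximate log-likelihood gradient (CD-1 statistics).
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
        self.b += self.lr * (v0 - v1_prob).mean(axis=0)
        self.c += self.lr * (h0_prob - h1_prob).mean(axis=0)

def pretrain_stack(X, layer_sizes, epochs=5, batch=32):
    """Greedily train each RBM on the representation of the layer below."""
    rbms, rep = [], X
    for n_hidden in layer_sizes:
        rbm = RBM(rep.shape[1], n_hidden)
        for _ in range(epochs):
            for i in range(0, len(rep), batch):
                rbm.cd1_step(rep[i:i + batch])
        rep = rbm.hidden_probs(rep)  # feed the representation upward
        rbms.append(rbm)
    return rbms

# Toy usage: pre-train a two-layer stack on random binary "data".
X = (rng.random((256, 64)) < 0.3).astype(float)
stack = pretrain_stack(X, layer_sizes=[32, 16])
```

In the paper's framing, the weights learned this way would then initialize a feed-forward network that is fine-tuned with supervised gradient descent; the sketch stops at the unsupervised stage.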
Pages: 1-40
Page count: 40
Related Papers
50 records in total
  • [1] Exploring strategies for training deep neural networks
    Larochelle, Hugo
    Bengio, Yoshua
    Louradour, Jérôme
    Lamblin, Pascal
    Journal of Machine Learning Research, 2009, 10: 1-40
  • [2] Exploring Learning Strategies for Training Deep Neural Networks Using Multiple Graphics Processing Units
    Hu, Nien-Tsu
    Huang, Ching-Chien
    Mo, Chih-Chieh
    Huang, Chien-Lin
    SENSORS AND MATERIALS, 2024, 36 (09): 3743-3755
  • [3] SEMI-SUPERVISED TRAINING STRATEGIES FOR DEEP NEURAL NETWORKS
    Gibson, Matthew
    Cook, Gary
    Zhan, Puming
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017: 77-83
  • [4] Strategies for training optical neural networks
    Qipeng Yang
    Bowen Bai
    Weiwei Hu
    Xingjun Wang
    National Science Open, 2022, 1 (03): 7-11
  • [5] MULTILINGUAL TRAINING OF DEEP NEURAL NETWORKS
    Ghoshal, Arnab
    Swietojanski, Pawel
    Renals, Steve
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7319-7323
  • [6] Training deep quantum neural networks
    Beer, Kerstin
    Bondarenko, Dmytro
    Farrelly, Terry
    Osborne, Tobias J.
    Salzmann, Robert
    Scheiermann, Daniel
    Wolf, Ramona
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [7] NOISY TRAINING FOR DEEP NEURAL NETWORKS
    Meng, Xiangtao
    Liu, Chao
    Zhang, Zhiyong
    Wang, Dong
    2014 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (CHINASIP), 2014: 16-20
  • [8] Exploring the Fundamentals of Mutations in Deep Neural Networks
    Ahmed, Zaheed
    Makedonski, Philip
    ACM/IEEE 27TH INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS: COMPANION PROCEEDINGS, MODELS 2024, 2024: 227-233
  • [9] Exploring deep neural networks for rumor detection
    Muhammad Zubair Asghar
    Ammara Habib
    Anam Habib
    Adil Khan
    Rehman Ali
    Asad Khattak
    Journal of Ambient Intelligence and Humanized Computing, 2021, 12: 4315-4333