A Representer Theorem for Deep Neural Networks

Cited by: 0
Authors
Unser, Michael [1 ]
Affiliation
[1] Ecole Polytech Fed Lausanne, Biomed Imaging Grp, CH-1015 Lausanne, Switzerland
Funding
Swiss National Science Foundation;
Keywords
splines; regularization; sparsity; learning; deep neural networks; activation functions; LINEAR INVERSE PROBLEMS; SPLINES; KERNELS;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
We propose to optimize the activation functions of a deep neural network by adding a corresponding functional regularization to the cost function. We justify the use of a second-order total-variation criterion. This allows us to derive a general representer theorem for deep neural networks that makes a direct connection with splines and sparsity. Specifically, we show that the optimal network configuration can be achieved with activation functions that are nonuniform linear splines with adaptive knots. The bottom line is that the action of each neuron is encoded by a spline whose parameters (including the number of knots) are optimized during the training procedure. The scheme results in a computational structure that is compatible with existing deep-ReLU, parametric ReLU, APL (adaptive piecewise-linear) and MaxOut architectures. It also suggests novel optimization challenges and makes an explicit link with ℓ1 minimization and sparsity-promoting techniques.
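For intuition, the sketch below parameterizes a single learnable linear-spline activation as an affine term plus a sum of shifted ReLUs on a fixed candidate-knot grid; for such a spline, the second-order total variation equals the ℓ1 norm of the ReLU coefficients, so an ℓ1 penalty serves as a surrogate for the regularizer described in the abstract. This is a minimal illustration, not the paper's implementation: the class name LinearSplineActivation, the fixed grid, and the parameters num_knots and x_range are assumptions introduced here.

```python
import torch
import torch.nn as nn


class LinearSplineActivation(nn.Module):
    """Pointwise activation modeled as a linear spline:
    f(x) = b1 * x + b0 + sum_k a_k * relu(x - tau_k).
    For this parameterization, the second-order total variation of f
    is sum_k |a_k|, so an l1 penalty on the coefficients promotes
    activations with few active knots (illustrative sketch only)."""

    def __init__(self, num_knots=21, x_range=3.0):
        super().__init__()
        # Fixed grid of candidate knots; sparsity selects the active ones.
        self.register_buffer("knots", torch.linspace(-x_range, x_range, num_knots))
        self.coeffs = nn.Parameter(torch.zeros(num_knots))    # a_k
        self.linear = nn.Parameter(torch.tensor([1.0, 0.0]))  # b1, b0

    def forward(self, x):
        # Broadcast the input against the knot grid: (..., K) ReLU terms.
        relu_terms = torch.relu(x.unsqueeze(-1) - self.knots)
        return self.linear[0] * x + self.linear[1] + relu_terms @ self.coeffs

    def tv2(self):
        # l1 surrogate for the second-order total-variation regularizer.
        return self.coeffs.abs().sum()


# Hypothetical training step: data-fidelity loss plus the TV(2) surrogate.
act = LinearSplineActivation()
x, y = torch.randn(8), torch.randn(8)
loss = torch.mean((act(x) - y) ** 2) + 1e-3 * act.tv2()
loss.backward()
```

In a full network, one such module would sit at each neuron (or each channel), and the regularized objective would sum the tv2() terms over all activation modules alongside the usual data term.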
Pages: 30