A Representer Theorem for Deep Neural Networks

Cited by: 0
Authors
Unser, Michael [1 ]
Affiliation
[1] Ecole Polytech Fed Lausanne, Biomed Imaging Grp, CH-1015 Lausanne, Switzerland
Funding
Swiss National Science Foundation;
Keywords
splines; regularization; sparsity; learning; deep neural networks; activation functions; LINEAR INVERSE PROBLEMS; SPLINES; KERNELS;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
We propose to optimize the activation functions of a deep neural network by adding a corresponding functional regularization to the cost function. We justify the use of a second-order total-variation criterion. This allows us to derive a general representer theorem for deep neural networks that makes a direct connection with splines and sparsity. Specifically, we show that the optimal network configuration can be achieved with activation functions that are nonuniform linear splines with adaptive knots. The bottom line is that the action of each neuron is encoded by a spline whose parameters (including the number of knots) are optimized during the training procedure. The scheme results in a computational structure that is compatible with existing deep-ReLU, parametric ReLU, APL (adaptive piecewise-linear) and MaxOut architectures. It also suggests novel optimization challenges and makes an explicit link with ℓ1 minimization and sparsity-promoting techniques.
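For intuition, the sketch below parameterizes a single learnable linear-spline activation as an affine term plus a sum of shifted ReLUs on a fixed candidate-knot grid; for such a spline, the second-order total variation equals the ℓ1 norm of the ReLU coefficients, so an ℓ1 penalty serves as a surrogate for the regularizer described in the abstract. This is a minimal illustration, not the paper's implementation: the class name LinearSplineActivation, the fixed grid, and the parameters num_knots and x_range are assumptions introduced here.

```python
import torch
import torch.nn as nn


class LinearSplineActivation(nn.Module):
    """Pointwise activation modeled as a linear spline:
    f(x) = b1 * x + b0 + sum_k a_k * relu(x - tau_k).
    For this parameterization, the second-order total variation of f
    is sum_k |a_k|, so an l1 penalty on the coefficients promotes
    activations with few active knots (illustrative sketch only)."""

    def __init__(self, num_knots=21, x_range=3.0):
        super().__init__()
        # Fixed grid of candidate knots; sparsity selects the active ones.
        self.register_buffer("knots", torch.linspace(-x_range, x_range, num_knots))
        self.coeffs = nn.Parameter(torch.zeros(num_knots))    # a_k
        self.linear = nn.Parameter(torch.tensor([1.0, 0.0]))  # b1, b0

    def forward(self, x):
        # Broadcast the input against the knot grid: (..., K) ReLU terms.
        relu_terms = torch.relu(x.unsqueeze(-1) - self.knots)
        return self.linear[0] * x + self.linear[1] + relu_terms @ self.coeffs

    def tv2(self):
        # l1 surrogate for the second-order total-variation regularizer.
        return self.coeffs.abs().sum()


# Hypothetical training step: data-fidelity loss plus the TV(2) surrogate.
act = LinearSplineActivation()
x, y = torch.randn(8), torch.randn(8)
loss = torch.mean((act(x) - y) ** 2) + 1e-3 * act.tv2()
loss.backward()
```

In a full network, one such module would sit at each neuron (or each channel), and the regularized objective would sum the tv2() terms over all activation modules alongside the usual data term.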
Pages: 30