Sparsity-aware generalization theory for deep neural networks

Cited by: 0
Authors
Muthukumar, Ramchandran [1 ,2 ]
Sulam, Jeremias [2 ,3 ]
Affiliations
[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Math Inst Data Sci, Baltimore, MD 21218 USA
[3] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD 21218 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that takes advantage of the degree of sparsity achieved in the hidden-layer activations. By developing a framework that accounts for this reduced effective model size for each input sample, we are able to show fundamental trade-offs between sparsity and generalization. Importantly, our results make no strong assumptions about the degree of sparsity achieved by the model, and they improve over recent norm-based approaches. We illustrate our results numerically, demonstrating non-vacuous bounds when coupled with data-dependent priors in specific settings, even in over-parametrized models.
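As an informal illustration (not the paper's actual analysis or bound), the short Python sketch below estimates the quantity the abstract appeals to: the per-sample fraction of active ReLU units in each hidden layer, i.e. the effective sub-network that each individual input actually uses. The layer widths, weights, and inputs are hypothetical placeholders.

```python
# Minimal sketch: measure per-sample hidden-layer activation sparsity
# in a toy feed-forward ReLU network. Widths, weights, and inputs are
# random placeholders; this only illustrates the notion of "effective
# model size per input sample", not the paper's construction.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer widths: input -> two hidden layers -> output.
widths = [64, 256, 256, 10]
weights = [rng.standard_normal((m, n)) / np.sqrt(n)
           for n, m in zip(widths[:-1], widths[1:])]

def active_fraction_per_layer(x):
    """Fraction of ReLU units that fire in each hidden layer for input x."""
    fractions = []
    h = x
    for W in weights[:-1]:            # hidden layers only
        pre = W @ h
        h = np.maximum(pre, 0.0)      # ReLU
        fractions.append(float(np.mean(pre > 0.0)))
    return fractions

# Average over a batch of random inputs: each sample effectively passes
# through only its active sub-network, which is what a sparsity-aware
# generalization bound could exploit.
X = rng.standard_normal((32, widths[0]))
avg = np.mean([active_fraction_per_layer(x) for x in X], axis=0)
print("average active fraction per hidden layer:", avg)
```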
Pages: 32
Related papers
50 records in total
  • [1] Sparsity-Aware Caches to Accelerate Deep Neural Networks
    Ganesan, Vinod
    Sen, Sanchari
    Kumar, Pratyush
    Gala, Neel
    Veezhinathan, Kamakoti
    Raghunathan, Anand
    PROCEEDINGS OF THE 2020 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2020), 2020, : 85 - 90
  • [2] Sparsity-Aware Orthogonal Initialization of Deep Neural Networks
    Esguerra, Kiara
    Nasir, Muneeb
    Tang, Tong Boon
    Tumian, Afidalina
    Ho, Eric Tatt Wei
    IEEE ACCESS, 2023, 11 : 74165 - 74181
  • [3] Parallax: Sparsity-aware Data Parallel Training of Deep Neural Networks
    Kim, Soojeong
    Yu, Gyeong-In
    Park, Hojin
    Cho, Sungwoo
    Jeong, Eunji
    Ha, Hyeonmin
    Lee, Sanha
    Jeong, Joo Seong
    Chun, Byung-Gon
    PROCEEDINGS OF THE FOURTEENTH EUROSYS CONFERENCE 2019 (EUROSYS '19), 2019,
  • [4] SATA: Sparsity-Aware Training Accelerator for Spiking Neural Networks
    Yin, Ruokai
    Moitra, Abhishek
    Bhattacharjee, Abhiroop
    Kim, Youngeun
    Panda, Priyadarshini
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (06) : 1926 - 1938
  • [5] A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks
    Yuan, Haiying
    Zeng, Zhiyong
    Cheng, Junpeng
    Li, Minghao
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (08) : 4370 - 4389
  • [6] Sparsity-Aware Tensor Decomposition
    Kurt, Sureyya Emre
    Raje, Saurabh
    Sukumaran-Rajam, Aravind
    Sadayappan, P.
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022), 2022, : 952 - 962
  • [7] Sparsity-Aware Communication for Distributed Graph Neural Network Training
    Mukhopadhyay, Ujjaini
    Tripathy, Alok
    Selvitopi, Oguz
    Yelick, Katherine
    Buluc, Aydin
    53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024, 2024, : 117 - 126
  • [8] A Sparsity-Aware Convolutional Neural Network Accelerator with Flexible Parallelism
    Yuan H.-Y.
    Zeng Z.-Y.
    Cheng J.-P.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2022, 50 (08) : 1811 - 1818
  • [9] SpeqNets: Sparsity-aware Permutation-equivariant Graph Networks
    Morris, Christopher
    Rattan, Gaurav
    Kiefer, Sandra
    Ravanbakhsh, Siamak
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,