Theoretical Analysis of Inductive Biases in Deep Convolutional Networks

Cited by: 0

Authors:

Wang, Zihao [1]

Wu, Lei [1]

Affiliations:

[1] Peking Univ, Beijing, Peoples R China

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

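Funding:

National Key R&D Program of China;

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];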

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we provide a theoretical analysis of the inductive biases in convolutional neural networks (CNNs). We start by examining the universality of CNNs, i.e., the ability to approximate any continuous function. We prove that a depth of $O(\log d)$ suffices for deep CNNs to achieve this universality, where $d$ is the input dimension. Additionally, we establish that learning sparse functions with CNNs requires only $\tilde{O}(\log^2 d)$ samples, indicating that deep CNNs can efficiently capture long-range sparse correlations. These results are made possible through a novel combination of multichanneling and downsampling when increasing the network depth. We also delve into the distinct roles of weight sharing and locality in CNNs. To this end, we compare the performance of CNNs, locally-connected networks (LCNs), and fully-connected networks (FCNs) on a simple regression task, where LCNs can be viewed as CNNs without weight sharing. On the one hand, we prove that LCNs require $\Omega(d)$ samples while CNNs need only $\tilde{O}(\log^2 d)$ samples, highlighting the critical role of weight sharing. On the other hand, we prove that FCNs require $\Omega(d^2)$ samples, whereas LCNs need only $\tilde{O}(d)$ samples, underscoring the importance of locality. These provable separations quantify the difference between the two biases, and the major observation behind our proof is that weight sharing and locality break different symmetries in the learning process.
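The structural difference between the three architectures named in the abstract can be sketched with a simple weight count. This is only an illustration, not the paper's construction: it assumes a single-channel 1-D layer with non-overlapping patches of size `k`, and the function names `layer_param_counts` and `depth_to_cover` are hypothetical. Weight counts are not the sample-complexity bounds stated above; they only show what weight sharing and locality each remove.

```python
import math


def layer_param_counts(d: int, k: int = 2) -> dict:
    """Count weights in one single-channel 1-D layer over d inputs.

    Assumption (illustrative only): patches of size k with stride k,
    so the layer produces d // k outputs.
    """
    n_patches = d // k
    return {
        "cnn": k,              # one filter shared by every patch
        "lcn": n_patches * k,  # a separate filter per patch (no weight sharing)
        "fcn": n_patches * d,  # every output sees every input (no locality)
    }


def depth_to_cover(d: int, k: int = 2) -> int:
    """Number of stride-k downsampling layers until one output remains."""
    depth = 0
    while d > 1:
        d = math.ceil(d / k)
        depth += 1
    return depth


if __name__ == "__main__":
    d = 1024
    print(layer_param_counts(d))
    print(depth_to_cover(d))
```

Under these assumptions, the depth returned by `depth_to_cover` grows logarithmically in `d`, which matches the order of the $O(\log d)$ depth mentioned in the abstract, while the per-layer weight counts grow as a constant, linearly, and quadratically in `d` for the CNN, LCN, and FCN respectively.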

Pages: 50
