Spectral Batch Normalization: Normalization in the Frequency Domain

Cited by: 0
Authors
Cakaj, Rinor [1 ,2 ]
Mehnert, Jens [1 ]
Yang, Bin [3 ]
Affiliations
[1] Robert Bosch GmbH, Signal Proc, D-71229 Leonberg, Germany
[2] Univ Stuttgart, D-71229 Leonberg, Germany
[3] Univ Stuttgart, ISS, D-70550 Stuttgart, Germany
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
NEURAL-NETWORKS;
DOI
10.1109/IJCNN54540.2023.10191931
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Regularization comprises a set of techniques used to improve the generalization ability of deep neural networks. In this paper, we introduce spectral batch normalization (SBN), a novel and effective method that improves generalization by normalizing feature maps in the frequency (spectral) domain. At initialization, the activations of residual networks without batch normalization (BN) tend to explode exponentially with the depth of the network. This leads to extremely large feature map norms even though the parameters are relatively small, and these explosive dynamics can be very detrimental to learning. BN makes weight decay on the scaling and shifting parameters gamma and beta approximately equivalent to an additive penalty on the norm of the feature maps, which prevents extremely large feature map norms to a certain degree. It was previously shown that preventing explosive growth at the final layer of ResNets, both at initialization and during training, can recover much of BN's generalization boost. However, we show experimentally that, despite the approximate additive penalty of BN, feature maps in deep neural networks (DNNs) tend to explode at the beginning of training and contain large values throughout training. This phenomenon also occurs, in a weakened form, in non-residual networks. Intuitively, large values in feature maps are undesirable because they exert disproportionate influence on the prediction compared with the rest of the feature map. SBN addresses large feature maps by normalizing them in the frequency domain. In our experiments, we show empirically that SBN prevents exploding feature maps at initialization and large feature map values during training. Moreover, normalizing feature maps in the frequency domain leads to more uniformly distributed frequency components, which discourages DNNs from relying on single frequency components of a feature map. Together with other effects of SBN (e.g. noise injection and the scaling and shifting of the feature maps), this has a regularizing effect on the training of residual and non-residual networks. We show experimentally that using SBN in addition to standard regularization methods improves the performance of DNNs by a relevant margin, e.g. ResNet50 on CIFAR-100 by 2.31%, ResNet50 on ImageNet by 0.71% (from 76.80% to 77.51%), and VGG19 on CIFAR-100 by 0.66%.
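To make the idea in the abstract concrete, the following is a minimal PyTorch sketch of a spectral batch normalization layer, not the authors' reference implementation. The class name SpectralBatchNorm2d, the use of rfft2, per-frequency gamma/beta parameters, running statistics, and the choice to normalize the magnitude of the spectrum while preserving the phase are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class SpectralBatchNorm2d(nn.Module):
    """Sketch of spectral batch normalization (SBN): transform feature maps
    to the frequency domain, normalize each frequency component with batch
    statistics, scale and shift with learnable parameters, transform back.
    Details are assumptions, not the paper's reference implementation."""

    def __init__(self, num_features, height, width, eps=1e-5, momentum=0.1):
        super().__init__()
        w_freq = width // 2 + 1  # rfft2 keeps only non-redundant frequencies
        self.eps = eps
        self.momentum = momentum
        # Learnable scale/shift per channel and frequency component (assumed).
        self.gamma = nn.Parameter(torch.ones(num_features, height, w_freq))
        self.beta = nn.Parameter(torch.zeros(num_features, height, w_freq))
        self.register_buffer("running_mean",
                             torch.zeros(num_features, height, w_freq))
        self.register_buffer("running_var",
                             torch.ones(num_features, height, w_freq))

    def forward(self, x):
        # x: (N, C, H, W) real feature maps -> complex spectrum (N, C, H, W//2+1)
        spec = torch.fft.rfft2(x, norm="ortho")
        # Assumption: normalize the magnitude, keep the phase unchanged.
        mag = spec.abs()
        if self.training:
            mean = mag.mean(dim=0)
            var = mag.var(dim=0, unbiased=False)
            with torch.no_grad():  # standard BN-style running-statistic update
                self.running_mean.lerp_(mean, self.momentum)
                self.running_var.lerp_(var, self.momentum)
        else:
            mean, var = self.running_mean, self.running_var
        mag_hat = (mag - mean) / torch.sqrt(var + self.eps)
        mag_hat = self.gamma * mag_hat + self.beta
        # Recombine with the original phase, return to the spatial domain.
        spec_hat = torch.polar(mag_hat, spec.angle())
        return torch.fft.irfft2(spec_hat, s=x.shape[-2:], norm="ortho")
```

Under these assumptions, the module would be used like a BatchNorm2d layer for inputs of a fixed spatial size, e.g. SpectralBatchNorm2d(64, 32, 32) applied to a tensor of shape (N, 64, 32, 32); the fixed height/width arguments are needed because the scale/shift parameters are defined per frequency component.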
Pages: 10
Related Papers
50 items in total
  • [21] Face recognition based on illumination normalization in frequency-domain
    School of Computer Science, Sichuan University, Chengdu 610065, China
    Dianzi Keji Daxue Xuebao, 2009, (6): 1021-1025
  • [22] Cross-Iteration Batch Normalization
    Yao, Zhuliang
    Cao, Yue
    Zheng, Shuxin
    Huang, Gao
    Lin, Stephen
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 12326 - 12335
  • [23] Inference-Invariant Transformation of Batch Normalization for Domain Adaptation of Acoustic Models
    Suzuki, Masayuki
    Nagano, Tohru
    Kurata, Gakuto
    Thomas, Samuel
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2893 - 2897
  • [24] Representative Batch Normalization with Feature Calibration
    Gao, Shang-Hua
    Han, Qi
    Li, Duo
    Cheng, Ming-Ming
    Peng, Pai
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 8665 - 8675
  • [25] Leveraging Batch Normalization for Vision Transformers
    Yao, Zhuliang
    Cao, Yue
    Lin, Yutong
    Liu, Ze
    Zhang, Zheng
    Hu, Han
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 413 - 422
  • [26] D-BIN: A Generalized Disentangling Batch Instance Normalization for Domain Adaptation
    Chen, Yurong
    Zhang, Hui
    Wang, Yaonan
    Peng, Weixing
    Zhang, Wangdong
    Wu, Q. M. Jonathan
    Yang, Yimin
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (04) : 2151 - 2163
  • [27] Patch-Aware Batch Normalization for Improving Cross-Domain Robustness
    Qi, Lei
    Zhao, Dongjia
    Shi, Yinghuan
    Geng, Xin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 800 - 810
  • [28] SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG
    Kobler, Reinmar J.
    Hirayama, Jun-ichiro
    Zhao, Qibin
    Kawanabe, Motoaki
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [29] The Effect of Batch Normalization in the Symmetric Phase
    Takagi, Shiro
    Yoshida, Yuki
    Okada, Masato
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 229 - 240
  • [30] Adversarial Attacks and Batch Normalization: A Batch Statistics Perspective
    Muhammad, Awais
    Shamshad, Fahad
    Bae, Sung-Ho
    IEEE ACCESS, 2023, 11 : 96449 - 96459