Denoising in Representation Space via Data-Dependent Regularization for Better Representation

Authors
Chen, Muyi [1, 2]
Wang, Daling [1 ]
Feng, Shi [1 ]
Zhang, Yifei [1 ]
Affiliations
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110169, Peoples R China
[2] Shenyang Ligong Univ, Sch Automat & Elect Engn, Shenyang 110159, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
deep neural network; representation space; fully connected layer; feature extractor
DOI
10.3390/math11102327
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Despite the success of deep learning models, it remains challenging for over-parameterized models to learn good representations in small-sample-size settings. In this paper, motivated by previous work on out-of-distribution (OoD) generalization, we study the representation learning problem from an OoD perspective to identify the fundamental factors affecting representation quality. We formulate the notion of "out-of-feature subspace (OoFS) noise" for the first time, and we link the OoFS noise in the feature extractor to the OoD performance of the model by proving two theorems, which demonstrate that reducing OoFS noise in the feature extractor helps achieve better representations. Moreover, we identify two causes of OoFS noise and prove that the OoFS noise induced by random initialization can be filtered out via L2 regularization. Finally, we propose a novel data-dependent regularizer that acts on the weights of the fully connected layer to reduce noise in the representations, thus implicitly forcing the feature extractor, via back-propagation, to focus on informative features and to rely less on noise. Experiments on synthetic datasets show that our method can learn hard-to-learn features, filters out noise effectively, and outperforms GD, AdaGrad, and KFAC. Furthermore, experiments on benchmark datasets show that our method achieves the best performance on three of the four tasks.
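The claim that noise left over from random initialization can be removed by L2 regularization can be illustrated with a toy linear model. The sketch below is not the paper's construction: it assumes a linear regressor whose inputs lie in a low-dimensional coordinate subspace, and the names `train_linear` and `off_subspace_norm` are hypothetical helpers introduced here. Under plain gradient descent, the part of the randomly initialized weights orthogonal to the data receives zero gradient and never changes; adding an L2 penalty makes it decay geometrically.

```python
import numpy as np


def train_linear(X, y, w0, lr=0.1, weight_decay=0.0, steps=2000):
    """Gradient descent on 0.5*MSE + 0.5*weight_decay*||w||^2 (toy sketch)."""
    w = w0.copy()
    n = X.shape[0]
    for _ in range(steps):
        # Data-fit gradient lies in the span of the rows of X;
        # the weight-decay term acts on every coordinate of w.
        grad = X.T @ (X @ w - y) / n + weight_decay * w
        w -= lr * grad
    return w


rng = np.random.default_rng(0)
d, k, n = 10, 3, 20  # ambient dim, feature-subspace dim, sample count (assumed values)

# Inputs occupy only the first k coordinates; the remaining d - k are always zero.
X = np.zeros((n, d))
X[:, :k] = rng.normal(size=(n, k))
w_true = np.zeros(d)
w_true[:k] = rng.normal(size=k)
y = X @ w_true

w0 = rng.normal(size=d)  # random initialization, with mass outside the subspace


def off_subspace_norm(w):
    """Norm of the weight component orthogonal to the data subspace."""
    return float(np.linalg.norm(w[k:]))


w_plain = train_linear(X, y, w0)                  # no regularization
w_l2 = train_linear(X, y, w0, weight_decay=0.05)  # with L2 regularization

print("init :", off_subspace_norm(w0))
print("plain:", off_subspace_norm(w_plain))
print("L2   :", off_subspace_norm(w_l2))
```

With these assumed settings, each step multiplies the off-subspace component by 1 - lr * weight_decay = 0.995, so after 2000 steps it shrinks by a factor of roughly 4e-5, while the unregularized run leaves it at its initial value. The data-dependent regularizer proposed in the paper is not specified in this abstract and is not reproduced here.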
Pages: 33