Basic filters for convolutional neural networks applied to music: Training or design?

Cited by: 13
Authors
Doerfler, Monika [1 ]
Grill, Thomas [2 ]
Rammer, Roswitha [1 ]
Flexer, Arthur [2 ]
Affiliations
[1] Univ Vienna, Fac Math, A-1090 Vienna, Austria
[2] Austrian Res Inst Artificial Intelligence OFAI, Freyung 6-6, A-1010 Vienna, Austria
Source
NEURAL COMPUTING & APPLICATIONS | 2020, Vol. 32, No. 4
Keywords
Machine learning; Convolutional neural networks; Adaptive filters; Gabor multipliers; Mel-spectrogram; End-to-end learning; OPERATORS;
DOI
10.1007/s00521-018-3704-x
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When convolutional neural networks are used to tackle learning problems based on music or other time series, the raw one-dimensional data are commonly preprocessed into spectrogram or mel-spectrogram coefficients, which then serve as input to the actual neural network. In this contribution, we investigate, both theoretically and experimentally, the influence of this preprocessing step on the network's performance, and we ask whether replacing it by adaptive or learned filters applied directly to the raw data can improve learning success. The theoretical results show that approximately reproducing mel-spectrogram coefficients by applying adaptive filters and subsequently time-averaging the squared amplitudes is in principle possible. We also conducted extensive experimental work on the task of singing voice detection in music. These experiments show that, for classification based on convolutional neural networks, features obtained from adaptive filter banks followed by time-averaging the squared modulus of the filters' output outperform the canonical Fourier-transform-based mel-spectrogram coefficients. Alternative adaptive approaches, with center frequencies or time-averaging lengths learned from the training data, perform equally well.
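For intuition, the feature pipeline the abstract describes (filters applied directly to the raw audio, squared modulus, then time-averaging) can be sketched in NumPy as below. This is an illustrative reconstruction, not the authors' implementation: the complex Gabor atoms, the HTK mel formula for centre-frequency spacing, the Hann window, and the frame/hop lengths are all assumptions made for the sketch.

```python
import numpy as np

def hz_to_mel(f):
    """HTK-style mel scale (an assumed variant; others exist)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centres(sr, n_filters, fmin=50.0):
    """Mel-spaced centre frequencies between fmin and the Nyquist frequency."""
    mels = np.linspace(hz_to_mel(fmin), hz_to_mel(sr / 2.0), n_filters)
    return mel_to_hz(mels)

def adaptive_filter_features(signal, sr, n_filters=40, win_len=512, hop=256):
    """Mel-like features computed directly from raw audio:
    complex Gabor filters -> squared modulus -> frame-wise time-averaging."""
    centres = mel_centres(sr, n_filters)
    t = np.arange(win_len) / sr
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    feats = np.empty((n_frames, n_filters))
    for j, fc in enumerate(centres):
        # Windowed complex exponential = one Gabor filter at centre frequency fc
        atom = window * np.exp(2j * np.pi * fc * t)
        # Filter the raw signal and take the squared modulus of the output
        response = np.abs(np.convolve(signal, atom, mode="same")) ** 2
        # Time-average the squared modulus over hop-spaced frames
        for i in range(n_frames):
            feats[i, j] = response[i * hop : i * hop + win_len].mean()
    return feats
```

A pure tone at one of the centre frequencies then produces its strongest response in the matching filter, mimicking a single mel-spectrogram band; in the adaptive variants studied in the paper, the centre frequencies or averaging lengths would be learned from data rather than fixed as here.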
Pages: 941 - 954
Page count: 14
Related Papers
50 records in total
  • [21] LegoNet: Efficient Convolutional Neural Networks with Lego Filters
    Yang, Zhaohui
    Wang, Yunhe
    Chen, Hanting
    Liu, Chuanjian
    Shi, Boxin
    Xu, Chao
    Xu, Chunjing
    Xu, Chang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [22] Evaluating the Compression Efficiency of the Filters in Convolutional Neural Networks
    Osawa, Kazuki
    Yokota, Rio
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, PT II, 2017, 10614 : 459 - 466
  • [23] Dynamic Local Filters in Graph Convolutional Neural Networks
    Apicella, Andrea
    Isgro, Francesco
    Pollastro, Andrea
    Prevete, Roberto
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT II, 2023, 14234 : 406 - 417
  • [24] Building Correlations Between Filters in Convolutional Neural Networks
    Wang, Hanli
    Chen, Peiqiu
    Kwong, Sam
    IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (10) : 3218 - 3229
  • [25] Improving efficiency in convolutional neural networks with multilinear filters
    Dat Thanh Tran
    Iosifidis, Alexandros
    Gabbouj, Moncef
    NEURAL NETWORKS, 2018, 105 : 328 - 339
  • [26] Learning Versatile Filters for Efficient Convolutional Neural Networks
    Wang, Yunhe
    Xu, Chang
    Xu, Chunjing
    Xu, Chao
    Tao, Dacheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [27] Augmenting Graph Convolutional Neural Networks with Highpass Filters
    Ansarizadeh, Fatemeh
    Tay, David B.
    Thiruvady, Dhananjay
    Robles-Kelly, Antonio
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 77 - 86
  • [28] JOINT TRAINING OF CONVOLUTIONAL AND NON-CONVOLUTIONAL NEURAL NETWORKS
    Soltau, Hagen
    Saon, George
    Sainath, Tara N.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [29] Scaling up the training of Convolutional Neural Networks
    Snir, Marc
    2019 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2019, : 925 - 925
  • [30] Towards dropout training for convolutional neural networks
    Wu, Haibing
    Gu, Xiaodong
    NEURAL NETWORKS, 2015, 71 : 1 - 10