Basic filters for convolutional neural networks applied to music: Training or design?

Cited by: 13
Authors
Doerfler, Monika [1 ]
Grill, Thomas [2 ]
Rammer, Roswitha [1 ]
Flexer, Arthur [2 ]
Affiliations
[1] Univ Vienna, Fac Math, A-1090 Vienna, Austria
[2] Austrian Res Inst Artificial Intelligence OFAI, Freyung 6-6, A-1010 Vienna, Austria
Source
NEURAL COMPUTING & APPLICATIONS | 2020, Vol. 32, No. 4
Keywords
Machine learning; Convolutional neural networks; Adaptive filters; Gabor multipliers; Mel-spectrogram; End-to-end learning; OPERATORS;
DOI
10.1007/s00521-018-3704-x
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When convolutional neural networks are used to tackle learning problems based on music or other time series, the raw one-dimensional data are commonly preprocessed into spectrogram or mel-spectrogram coefficients, which then serve as input to the actual neural network. In this contribution, we investigate, both theoretically and experimentally, the influence of this preprocessing step on the network's performance, and we ask whether replacing it by adaptive or learned filters applied directly to the raw data can improve learning success. The theoretical results show that approximately reproducing mel-spectrogram coefficients by applying adaptive filters and subsequently time-averaging the squared amplitudes is in principle possible. We also conducted extensive experimental work on the task of singing voice detection in music. These experiments show that, for classification based on convolutional neural networks, features obtained from adaptive filter banks followed by time-averaging the squared modulus of the filters' output outperform the canonical Fourier-transform-based mel-spectrogram coefficients. Alternative adaptive approaches, with center frequencies or time-averaging lengths learned from the training data, perform equally well.
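For intuition, the feature pipeline the abstract describes (filters applied directly to the raw audio, squared modulus, then time-averaging) can be sketched in NumPy as below. This is an illustrative reconstruction, not the authors' implementation: the complex Gabor atoms, the HTK mel formula for centre-frequency spacing, the Hann window, and the frame/hop lengths are all assumptions made for the sketch.

```python
import numpy as np

def hz_to_mel(f):
    """HTK-style mel scale (an assumed variant; others exist)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centres(sr, n_filters, fmin=50.0):
    """Mel-spaced centre frequencies between fmin and the Nyquist frequency."""
    mels = np.linspace(hz_to_mel(fmin), hz_to_mel(sr / 2.0), n_filters)
    return mel_to_hz(mels)

def adaptive_filter_features(signal, sr, n_filters=40, win_len=512, hop=256):
    """Mel-like features computed directly from raw audio:
    complex Gabor filters -> squared modulus -> frame-wise time-averaging."""
    centres = mel_centres(sr, n_filters)
    t = np.arange(win_len) / sr
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    feats = np.empty((n_frames, n_filters))
    for j, fc in enumerate(centres):
        # Windowed complex exponential = one Gabor filter at centre frequency fc
        atom = window * np.exp(2j * np.pi * fc * t)
        # Filter the raw signal and take the squared modulus of the output
        response = np.abs(np.convolve(signal, atom, mode="same")) ** 2
        # Time-average the squared modulus over hop-spaced frames
        for i in range(n_frames):
            feats[i, j] = response[i * hop : i * hop + win_len].mean()
    return feats
```

A pure tone at one of the centre frequencies then produces its strongest response in the matching filter, mimicking a single mel-spectrogram band; in the adaptive variants studied in the paper, the centre frequencies or averaging lengths would be learned from data rather than fixed as here.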
Pages: 941 - 954
Page count: 14
Related Papers
50 records in total
  • [21] LegoNet: Efficient Convolutional Neural Networks with Lego Filters
    Yang, Zhaohui
    Wang, Yunhe
    Chen, Hanting
    Liu, Chuanjian
    Shi, Boxin
    Xu, Chao
    Xu, Chunjing
    Xu, Chang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [22] Evaluating the Compression Efficiency of the Filters in Convolutional Neural Networks
    Osawa, Kazuki
    Yokota, Rio
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, PT II, 2017, 10614 : 459 - 466
  • [23] Dynamic Local Filters in Graph Convolutional Neural Networks
    Apicella, Andrea
    Isgro, Francesco
    Pollastro, Andrea
    Prevete, Roberto
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT II, 2023, 14234 : 406 - 417
  • [24] Building Correlations Between Filters in Convolutional Neural Networks
    Wang, Hanli
    Chen, Peiqiu
    Kwong, Sam
    IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (10) : 3218 - 3229
  • [25] Improving efficiency in convolutional neural networks with multilinear filters
    Dat Thanh Tran
    Iosifidis, Alexandros
    Gabbouj, Moncef
    NEURAL NETWORKS, 2018, 105 : 328 - 339
  • [26] Learning Versatile Filters for Efficient Convolutional Neural Networks
    Wang, Yunhe
    Xu, Chang
    Xu, Chunjing
    Xu, Chao
    Tao, Dacheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [27] Augmenting Graph Convolutional Neural Networks with Highpass Filters
    Ansarizadeh, Fatemeh
    Tay, David B.
    Thiruvady, Dhananjay
    Robles-Kelly, Antonio
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 77 - 86
  • [28] JOINT TRAINING OF CONVOLUTIONAL AND NON-CONVOLUTIONAL NEURAL NETWORKS
    Soltau, Hagen
    Saon, George
    Sainath, Tara N.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [29] Scaling up the training of Convolutional Neural Networks
    Snir, Marc
    2019 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2019, : 925 - 925
  • [30] Towards dropout training for convolutional neural networks
    Wu, Haibing
    Gu, Xiaodong
    NEURAL NETWORKS, 2015, 71 : 1 - 10