Acoustic scene classification method based on multi-stream convolution and data augmentation

Cited by: 0

Authors:

Cao Y. [1,2]

Fei H. [1,2]

Li P. [1,2]

Zhang X. [1,2]

Affiliations:

[1] School of Mechanical Engineering, Jiangnan University, Wuxi

[2] Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Jiangnan University, Wuxi

Source:

Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition) | 2022 / Vol. 50 / No. 04

Keywords:

Acoustic scene classification; Multi-dimension mixup data augmentation; Multi-stream convolutional neural network; Multi-stream feature fusion; Over-fitting;

DOI:

10.13245/j.hust.220407

Abstract:

An acoustic scene classification method based on multi-stream convolution and data augmentation was proposed to address the limited classification accuracy and generalization ability of existing models that rely on a single input feature. First, the working principles of the convolutional neural network and mixup data augmentation were introduced. Then, based on the theory of parallel network inputs, a multi-stream convolutional neural network consisting of a feature extraction module and a feature fusion module was designed to achieve multi-stream feature fusion. To further improve model accuracy and reduce the risk of over-fitting, a multi-dimension mixup data generation method was applied to smooth the feature data. Finally, nine feature combination schemes were used in acoustic scene classification experiments on the UrbanSound8K, ESC-50 and ESC-10 datasets. Experimental results show that the model achieves accuracies of 88.29%, 77.75% and 96.25%, respectively, which indicates that the method yields higher accuracy and stronger generalization ability for acoustic scene classification. © 2022, Editorial Board of Journal of Huazhong University of Science and Technology. All rights reserved.
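
The abstract names mixup data augmentation but does not spell out the multi-dimension variant the authors use. As a point of reference only, the standard mixup rule blends two training examples and their one-hot labels with a weight drawn from a Beta distribution. The sketch below illustrates that standard rule, not the paper's method; the function name `mixup_pair`, the default `alpha=0.2`, and the use of flat Python lists for features are all assumptions made for illustration.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Blend two (feature, one-hot label) examples with standard mixup.

    Hypothetical helper for illustration; the paper's multi-dimension
    mixup is not described in the abstract and is not reproduced here.
    """
    rng = rng or random.Random()
    # Mixing weight lam lies in [0, 1] and is drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # The same convex combination is applied to features and to labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Usage: mix two toy feature vectors belonging to different classes.
rng = random.Random(0)
x, y, lam = mixup_pair([1.0, 2.0], [1.0, 0.0], [3.0, 4.0], [0.0, 1.0], rng=rng)
```

Because the label is blended with the same weight as the features, the mixed label still sums to 1 and can be used directly as a soft target.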

Pages: 40-46

Page count: 6
