Acoustic event recognition using cochleagram image and convolutional neural networks

Cited by: 42

Authors:

Sharan, Roneel V. [1]

Moir, Tom J. [2]

Affiliations:

[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia

[2] Auckland Univ Technol, Sch Engn, Private Bag 92006, Auckland 1142, New Zealand

Source:

APPLIED ACOUSTICS | 2019 / Vol. 148

Keywords:

Acoustic event recognition; Cochleagram; Convolutional neural network; Mel-spectrogram; Spectrogram; FEATURES; CLASSIFICATION;

DOI:

10.1016/j.apacoust.2018.12.006

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Convolutional neural networks (CNN) have produced encouraging results in image classification tasks and have been increasingly adopted in audio classification applications. However, in using CNN for acoustic event recognition, the first hurdle is finding the best image representation of an audio signal. In this work, we evaluate the performance of four time-frequency representations for use with CNN. Firstly, we consider the conventional spectrogram image. Secondly, we apply a moving average to the spectrogram along the frequency domain to obtain what we refer to as the smoothed spectrogram. Thirdly, we use the mel-spectrogram, which utilizes the mel-filter, as in mel-frequency cepstral coefficients. Finally, we propose the use of a cochleagram image, the frequency components of which are based on the frequency selectivity property of the human cochlea. We test the proposed techniques on an acoustic event database containing 50 sound classes. The results show that the proposed cochleagram time-frequency image representation gives the best classification performance when used with CNN. © 2018 Elsevier Ltd. All rights reserved.
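
The first two representations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, Hamming window, smoothing width, and the function names `spectrogram` and `smooth_along_frequency` are all assumptions chosen for illustration, since the abstract does not state them. The mel-spectrogram and cochleagram are not reproduced here because they need filterbank details the abstract does not give.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram, shape (frequency bins, frames).
    frame_len, hop and the Hamming window are illustrative assumptions."""
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping windowed frames, one frame per column.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    # One-sided FFT along the sample axis of each frame.
    return np.abs(np.fft.rfft(frames, axis=0))

def smooth_along_frequency(spec, width=5):
    """Moving average across frequency bins, applied to each frame (column).
    The width of 5 bins is an assumption, not a value from the paper."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="same"), 0, spec
    )

# Hypothetical test signal: one second of a 1 kHz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
smoothed = smooth_along_frequency(spec)
```

Either array could then be scaled and saved as an image for input to a CNN. Smoothing keeps the image size unchanged while spreading each spectral peak over neighbouring frequency bins.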

Pages: 62-66

Page count: 5
