Acoustic event recognition using cochleagram image and convolutional neural networks

Cited by: 42
Authors
Sharan, Roneel V. [1]
Moir, Tom J. [2]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Auckland Univ Technol, Sch Engn, Private Bag 92006, Auckland 1142, New Zealand
Keywords
Acoustic event recognition; Cochleagram; Convolutional neural network; Mel-spectrogram; Spectrogram; FEATURES; CLASSIFICATION
DOI
10.1016/j.apacoust.2018.12.006
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Convolutional neural networks (CNN) have produced encouraging results in image classification tasks and have been increasingly adopted in audio classification applications. However, in using CNN for acoustic event recognition, the first hurdle is finding the best image representation of an audio signal. In this work, we evaluate the performance of four time-frequency representations for use with CNN. Firstly, we consider the conventional spectrogram image. Secondly, we apply a moving average to the spectrogram along the frequency domain to obtain what we refer to as the smoothed spectrogram. Thirdly, we use the mel-spectrogram, which utilizes the mel-filter, as in mel-frequency cepstral coefficients. Finally, we propose the use of a cochleagram image, the frequency components of which are based on the frequency selectivity property of the human cochlea. We test the proposed techniques on an acoustic event database containing 50 sound classes. The results show that the proposed cochleagram time-frequency image representation gives the best classification performance when used with CNN. (C) 2018 Elsevier Ltd. All rights reserved.
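The first three representations named in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the frame length, hop size, Hamming window, smoothing width, number of mel filters, and all function names are assumptions, since the abstract does not give them. The cochleagram is not shown; it is commonly computed with a gammatone filterbank, but the abstract does not specify the filter design.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram, shape (frame_len // 2 + 1, n_frames)."""
    window = np.hamming(frame_len)  # assumed window, not from the abstract
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T


def smooth_along_frequency(spec, width=5):
    """Moving average over `width` neighbouring frequency bins, per frame."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="same"), 0, spec)


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, frame_len, sample_rate):
    """Triangular filters with edges equally spaced on the mel scale."""
    n_bins = frame_len // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = mel_to_hz(
        np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bank = np.zeros((n_filters, n_bins))
    for k in range(n_filters):
        lo, centre, hi = edges[k], edges[k + 1], edges[k + 2]
        rising = (bin_freqs - lo) / (centre - lo)
        falling = (hi - bin_freqs) / (hi - centre)
        bank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


# Usage on a synthetic 1 kHz tone (a stand-in for a real recording).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)                      # conventional spectrogram
smoothed = smooth_along_frequency(spec)       # smoothed spectrogram
bank = mel_filterbank(40, 256, sample_rate)
mel_spec = bank @ spec                        # mel-spectrogram
```

Each resulting array can be log-scaled and normalised before being treated as a single-channel image for a CNN; those steps are likewise a design choice that the abstract leaves open.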
Pages: 62-66
Page count: 5