Acoustic event recognition using cochleagram image and convolutional neural networks

Cited by: 42
Authors
Sharan, Roneel V. [1]
Moir, Tom J. [2]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Auckland Univ Technol, Sch Engn, Private Bag 92006, Auckland 1142, New Zealand
Keywords
Acoustic event recognition; Cochleagram; Convolutional neural network; Mel-spectrogram; Spectrogram; FEATURES; CLASSIFICATION;
DOI
10.1016/j.apacoust.2018.12.006
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Convolutional neural networks (CNN) have produced encouraging results in image classification tasks and have been increasingly adopted in audio classification applications. However, in using CNN for acoustic event recognition, the first hurdle is finding the best image representation of an audio signal. In this work, we evaluate the performance of four time-frequency representations for use with CNN. Firstly, we consider the conventional spectrogram image. Secondly, we apply a moving average to the spectrogram along the frequency domain to obtain what we refer to as the smoothed spectrogram. Thirdly, we use the mel-spectrogram, which utilizes the mel-filter, as in mel-frequency cepstral coefficients. Finally, we propose the use of a cochleagram image, the frequency components of which are based on the frequency selectivity property of the human cochlea. We test the proposed techniques on an acoustic event database containing 50 sound classes. The results show that the proposed cochleagram time-frequency image representation gives the best classification performance when used with CNN. © 2018 Elsevier Ltd. All rights reserved.
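The four representations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, smoothing width, filter counts, frequency range, and the gammatone parameterisation (fourth-order filters on the Glasberg and Moore ERB scale) are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import numpy as np

def spectrogram(x, n_fft=512, hop=256):
    """Power spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T

def smooth_along_frequency(spec, width=5):
    """Moving average over neighbouring frequency bins of each frame."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, spec)

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters with centres equally spaced on the mel scale."""
    hz_pts = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def erb_space(f_low, f_high, n):
    """Centre frequencies equally spaced on the ERB-rate scale (assumed)."""
    e = np.linspace(21.4 * np.log10(1 + 0.00437 * f_low),
                    21.4 * np.log10(1 + 0.00437 * f_high), n)
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def gammatone_ir(fc, sr, duration=0.032, order=4):
    """Unnormalised gammatone impulse response centred at fc (Hz)."""
    t = np.arange(int(duration * sr)) / sr
    b = 1.019 * 24.7 * (4.37 * fc / 1000.0 + 1.0)  # bandwidth from ERB
    return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

def cochleagram(x, sr, n_filters=32, n_fft=512, hop=256):
    """Frame energy at the output of each gammatone channel."""
    n_frames = 1 + (len(x) - n_fft) // hop
    out = np.zeros((n_filters, n_frames))
    for k, fc in enumerate(erb_space(50.0, 0.9 * sr / 2, n_filters)):
        y = np.convolve(x, gammatone_ir(fc, sr), mode="same")
        for i in range(n_frames):
            out[k, i] = np.sum(y[i * hop:i * hop + n_fft] ** 2)
    return out

# Demo on a synthetic 1 kHz tone (a stand-in for a real recording).
sr = 16000
x = np.sin(2 * np.pi * 1000.0 * np.arange(sr) / sr)
spec = spectrogram(x)
smoothed = smooth_along_frequency(spec)
mel_spec = mel_filterbank(40, 512, sr) @ spec
coch = cochleagram(x, sr)
```

Each result is a frequency-by-time array that could be log-scaled and resized into an image before being passed to a CNN; those later steps are omitted here because the abstract does not specify them.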
Pages: 62-66
Number of pages: 5
Related papers
50 in total
  • [1] Time-Frequency Image Resizing Using Interpolation for Acoustic Event Recognition with Convolutional Neural Networks
    Sharan, Roneel V.
    Moir, Tom J.
    2019 IEEE INTERNATIONAL CONFERENCE ON SIGNALS AND SYSTEMS (ICSIGSYS), 2019, : 8 - 11
  • [2] Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition
    Takahashi, Naoya
    Gygli, Michael
    Pfister, Beat
    Van Gool, Luc
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2982 - 2986
  • [3] ROBUST SOUND EVENT RECOGNITION USING CONVOLUTIONAL NEURAL NETWORKS
    Zhang, Haomin
    McLoughlin, Ian
    Song, Yan
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 559 - 563
  • [4] Image recognition in UAV videos using convolutional neural networks
    Quinonez, Yadira
    Lizarraga, Carmen
    Peraza, Juan
    Zatarain, Oscar
    IET SOFTWARE, 2020, 14 (02) : 176 - 181
  • [5] Prosodic Event Recognition using Convolutional Neural Networks with Context Information
    Stehwien, Sabrina
    Vu, Ngoc Thang
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2326 - 2330
  • [6] Acoustic Pornography Recognition Using Convolutional Neural Networks and Bag of Refinements
    Zhou, Lifeng
    Wei, Kaifeng
    Li, Yuke
    Hao, Yiya
    Yang, Weiqiang
    Zhu, Haoqi
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 840 - 845
  • [7] Improved Convolutional Neural Networks for Acoustic Event Classification
    Tang, Guichen
    Liang, Ruiyu
    Xie, Yue
    Bao, Yongqiang
    Wang, Shijia
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (12) : 15801 - 15816
  • [8] Food Image Recognition with Convolutional Neural Networks
    Zhang, Weishan
    Zhao, Dehai
    Gong, Wenjuan
    Li, Zhongwei
    Lu, Qinghua
    Yang, Su
    IEEE 12TH INT CONF UBIQUITOUS INTELLIGENCE & COMP/IEEE 12TH INT CONF ADV & TRUSTED COMP/IEEE 15TH INT CONF SCALABLE COMP & COMMUN/IEEE INT CONF CLOUD & BIG DATA COMP/IEEE INT CONF INTERNET PEOPLE AND ASSOCIATED SYMPOSIA/WORKSHOPS, 2015, : 690 - 693
  • [9] An Analysis of Convolutional Neural Networks for Image Recognition
    He, Jun
    Liu, Yue
    Li, Shuai
    Shen, Jin-ming
    2017 2ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL MODELING, SIMULATION AND APPLIED MATHEMATICS (CMSAM), 2017, : 524 - 528