Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network

Cited by: 35
Authors
Sharma, Jivitesh [1]
Granmo, Ole-Christoffer [1]
Goodwin, Morten [1]
Affiliations
[1] Univ Agder, Dept Informat & Commun Technol, Ctr Artificial Intelligence Res, Kristiansand, Norway
Source
Keywords
Convolutional Neural Networks; Attention; Multiple Feature Channels; Environment Sound Classification
DOI
10.21437/Interspeech.2020-1303
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
In this paper, we propose a model for the Environment Sound Classification (ESC) task in which multiple feature channels are given as input to a Deep Convolutional Neural Network (CNN) with an attention mechanism. The novelty of the paper lies in using multiple feature channels consisting of Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), the Constant Q-transform (CQT) and the Chromagram. We also employ a deeper CNN (DCNN) than previous models, built from spatially separable convolutions that operate on the time and feature domains separately. In addition, we use attention modules that perform channel and spatial attention together, and we apply the mix-up data augmentation technique to further boost performance. Our model achieves state-of-the-art performance on three benchmark environment sound classification datasets: UrbanSound8K (97.52%), ESC-10 (94.75%) and ESC-50 (87.45%).
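The mix-up augmentation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which two training examples and their one-hot labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. The abstract does not give the authors' settings, so the function name `mixup`, the value `alpha=0.2`, and the toy 2x3 feature maps below are illustrative assumptions, not the paper's implementation.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels.

    alpha=0.2 is an assumed value; the abstract does not state one.
    """
    # Mixing weight in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Element-wise convex combination of the two feature maps.
    x = [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(x_a, x_b)]
    # The same weight is applied to the labels.
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


# Toy stand-ins for two feature maps (hypothetical data).
rng = random.Random(0)
x_first = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
x_second = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
y_first = [1.0, 0.0]   # one-hot label of the first example
y_second = [0.0, 1.0]  # one-hot label of the second example

x_mix, y_mix, lam = mixup(x_first, y_first, x_second, y_second, rng=rng)
```

Because the same weight is applied to inputs and labels, the mixed label remains a valid probability distribution, which is what lets the blended example be trained on with an ordinary cross-entropy loss.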
Pages: 1186 - 1190
Number of pages: 5
Related papers
50 in total
  • [1] Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification
    Li, Minze
    Huang, Wu
    Zhang, Tao
    NEURAL PROCESSING LETTERS, 2023, 55 (04) : 4291 - 4306
  • [2] Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification
    Minze Li
    Wu Huang
    Tao Zhang
    Neural Processing Letters, 2023, 55 : 4291 - 4306
  • [3] Environment Sound Classification System Based on Hybrid Feature and Convolutional Neural Network
    Zhang K.
    Su Y.
    Wang J.
    Wang S.
    Zhang Y.
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2020, 38 (01) : 162 - 169
  • [4] Environment sound classification using an attention-based residual neural network
    Tripathi, Achyut Mani
    Mishra, Aakansha
    NEUROCOMPUTING, 2021, 460 : 409 - 423
  • [5] Attention based convolutional recurrent neural network for environmental sound classification
    Zhang, Zhichao
    Xu, Shugong
    Zhang, Shunqing
    Qiao, Tianhao
    Cao, Shan
    NEUROCOMPUTING, 2021, 453 : 896 - 903
  • [6] Sound Classification Using Convolutional Neural Network and Tensor Deep Stacking Network
    Khamparia, Aditya
    Gupta, Deepak
    Nhu Gia Nguyen
    Khanna, Ashish
    Pandey, Babita
    Tiwari, Prayag
    IEEE ACCESS, 2019, 7 : 7717 - 7727
  • [7] POLSAR IMAGE CLASSIFICATION USING ATTENTION BASED SHALLOW TO DEEP CONVOLUTIONAL NEURAL NETWORK
    Alkhatib, Mohammed Q.
    Al-Saad, Mina
    Aburaed, Nour
    Zitouni, M. Sami
    Al-Ahmad, Hussain
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023 : 8034 - 8037
  • [8] Environmental sound classification using temporal-frequency attention based convolutional neural network
    Mu, Wenjie
    Yin, Bo
    Huang, Xianqing
    Xu, Jiali
    Du, Zehua
    SCIENTIFIC REPORTS, 2021, 11 (01)
  • [9] Environmental sound classification using temporal-frequency attention based convolutional neural network
    Wenjie Mu
    Bo Yin
    Xianqing Huang
    Jiali Xu
    Zehua Du
    Scientific Reports, 2021, 11
  • [10] Singer Gender Classification using Feature-based and Spectrograms with Deep Convolutional Neural Network
    Jitendra, Mukkamala S. N. V.
    Radhika, Y.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (02) : 135 - 144