Environmental sound classification using temporal-frequency attention based convolutional neural network

Authors
Wenjie Mu
Bo Yin
Xianqing Huang
Jiali Xu
Zehua Du
Affiliations
[1] Ocean University of China, College of Information Science and Engineering
[2] Pilot National Laboratory for Marine Science and Technology
Abstract
Environmental sound classification is an important problem in audio recognition. Compared with structured sounds such as speech and music, environmental sounds have a more complicated time-frequency structure. To learn time and frequency features from the Log-Mel spectrogram more effectively, this paper proposes a temporal-frequency attention based convolutional neural network (TFCNN). First, an experiment that motivates the proposed method is designed to verify how a specific frequency band in the spectrogram affects model classification. Second, two new attention mechanisms are proposed: a temporal attention mechanism and a frequency attention mechanism. These mechanisms focus on key frequency bands and semantically related time frames in the spectrogram, reducing the influence of background noise and irrelevant frequency bands. The two mechanisms are then combined so that their feature information is complementary, which captures the critical time-frequency features more accurately and greatly improves the representation ability of the network. Finally, experiments on two public datasets, UrbanSound8K and ESC-50, demonstrate the effectiveness of the proposed method.
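The abstract does not specify how the two attention mechanisms are computed, so the following is only a minimal NumPy sketch of the general idea, not the authors' TFCNN. It assumes a feature map of shape (channels, frequency bins, time frames), derives one weight per frequency bin and one per time frame by average pooling followed by a softmax, and combines the two re-weighted maps by summation. The function names, the pooling choice, and the additive combination are all illustrative assumptions; a trained model would typically learn these weights with convolutional layers.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def frequency_attention(feat):
    """Re-weight frequency bins of a (channels, freq, time) feature map.

    Assumption: scores come from average pooling over channels and time.
    """
    scores = feat.mean(axis=(0, 2))        # one score per frequency bin
    weights = softmax(scores)              # weights sum to 1
    return feat * weights[None, :, None], weights


def temporal_attention(feat):
    """Re-weight time frames of a (channels, freq, time) feature map.

    Assumption: scores come from average pooling over channels and frequency.
    """
    scores = feat.mean(axis=(0, 1))        # one score per time frame
    weights = softmax(scores)              # weights sum to 1
    return feat * weights[None, None, :], weights


def temporal_frequency_attention(feat):
    """Combine both branches by summation (an assumed combination rule)."""
    freq_out, freq_w = frequency_attention(feat)
    time_out, time_w = temporal_attention(feat)
    return freq_out + time_out, freq_w, time_w


# Toy input: random features with one emphasized band and one emphasized frame.
rng = np.random.default_rng(0)
feat = rng.random((4, 8, 10))              # (channels, freq bins, time frames)
feat[:, 3, :] += 5.0                       # make frequency bin 3 stand out
feat[:, :, 6] += 5.0                       # make time frame 6 stand out
out, freq_w, time_w = temporal_frequency_attention(feat)
print(out.shape, int(freq_w.argmax()), int(time_w.argmax()))
```

In this toy input, the largest frequency weight falls on the emphasized band and the largest temporal weight on the emphasized frame, which is the behavior the abstract attributes to the two mechanisms: emphasizing key bands and relevant frames while down-weighting the rest.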