Environmental sound classification using temporal-frequency attention based convolutional neural network

Authors
Wenjie Mu
Bo Yin
Xianqing Huang
Jiali Xu
Zehua Du
Affiliations
[1] Ocean University of China, College of Information Science and Engineering
[2] Pilot National Laboratory for Marine Science and Technology
Abstract
Environmental sound classification is an important problem in audio recognition. Compared with structured sounds such as speech and music, environmental sounds have a more complicated time-frequency structure. To learn time and frequency features from the Log-Mel spectrogram more effectively, this paper proposes a temporal-frequency attention based convolutional neural network (TFCNN). First, a motivating experiment is designed to verify how a specific frequency band in the spectrogram affects model classification. Second, two new attention mechanisms are proposed: a temporal attention mechanism and a frequency attention mechanism. They focus on key frequency bands and semantically related time frames in the spectrogram, reducing the influence of background noise and irrelevant frequency bands. The two mechanisms are then combined so that their feature information is complementary, capturing the critical time-frequency features more accurately and improving the representation ability of the network. Finally, experiments on two public datasets, UrbanSound8K and ESC-50, demonstrate the effectiveness of the proposed method.
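The abstract does not give the layer design of the two attention mechanisms, so the following is only a minimal sketch of the general idea: one weight per frequency band, one weight per time frame, and a combination of the two. It assumes parameter-free mean pooling followed by a softmax, and a multiplicative combination. In the paper the weights are presumably learned inside the network. All function names here are hypothetical and not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def frequency_attention(spec):
    """One weight per frequency band (row), from its mean value over time.

    Assumption: mean pooling + softmax stands in for a learned layer.
    """
    return softmax([sum(row) / len(row) for row in spec])


def temporal_attention(spec):
    """One weight per time frame (column), from its mean value over frequency."""
    n_frames = len(spec[0])
    return softmax(
        [sum(row[t] for row in spec) / len(spec) for t in range(n_frames)]
    )


def apply_tf_attention(spec):
    """Rescale every bin by its band weight and its frame weight.

    Assumption: the two mechanisms are combined multiplicatively.
    """
    f_w = frequency_attention(spec)
    t_w = temporal_attention(spec)
    return [
        [spec[f][t] * f_w[f] * t_w[t] for t in range(len(t_w))]
        for f in range(len(f_w))
    ]


# Toy spectrogram-like patch: 3 bands x 4 frames,
# with most of the energy in band 1 at frame 2.
spec = [
    [0.1, 0.1, 0.2, 0.1],
    [0.2, 0.3, 3.0, 0.2],
    [0.1, 0.1, 0.3, 0.1],
]
f_w = frequency_attention(spec)
t_w = temporal_attention(spec)
attended = apply_tf_attention(spec)
```

In this toy patch the largest frequency weight falls on the band that carries the event and the largest temporal weight on the frame where it occurs, so the surrounding low-energy bins are scaled down relative to the event. This is the noise-suppression effect the abstract describes, shown here without any learned parameters.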