Environmental sound classification using temporal-frequency attention based convolutional neural network

Cited by: 0
Authors
Wenjie Mu
Bo Yin
Xianqing Huang
Jiali Xu
Zehua Du
Affiliations
[1] Ocean University of China, College of Information Science and Engineering
[2] Pilot National Laboratory for Marine Science and Technology
DOI: not available
Abstract
Environmental sound classification is an important problem in audio recognition. Compared with structured sounds such as speech and music, environmental sounds have a more complicated time-frequency structure. To learn time and frequency features from the Log-Mel spectrogram more effectively, this paper proposes a temporal-frequency attention based convolutional neural network model (TFCNN). First, an experiment that serves as the motivation for the proposed method is designed to verify the effect of a specific frequency band in the spectrogram on model classification. Second, two new attention mechanisms are proposed: a temporal attention mechanism and a frequency attention mechanism. These mechanisms focus on key frequency bands and semantically related time frames in the spectrogram, reducing the influence of background noise and irrelevant frequency bands. The two mechanisms are then combined so that their feature information is complementary, which captures the critical time-frequency features more accurately and improves the representation ability of the network model. Finally, experiments on two public data sets, UrbanSound8K and ESC-50, demonstrate the effectiveness of the proposed method.
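The idea of weighting frequency bands and time frames separately and then combining the two can be illustrated with a toy example. The sketch below is not the TFCNN architecture: the abstract does not specify how the attention weights are computed, so this version assumes a parameter-free scheme (mean pooling followed by a softmax) in place of learned layers, and the function names `frequency_weights`, `temporal_weights`, and `apply_tf_attention` are hypothetical. It takes a small spectrogram-like matrix (rows are Mel bands, columns are time frames) and averages a frequency-weighted copy with a time-weighted copy.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def frequency_weights(spec):
    """One weight per Mel band (row), from the band's mean over time.

    Assumption: mean pooling + softmax stands in for a learned layer.
    """
    return softmax([sum(row) / len(row) for row in spec])


def temporal_weights(spec):
    """One weight per time frame (column), from the frame's mean over bands."""
    n_mels = len(spec)
    return softmax([sum(col) / n_mels for col in zip(*spec)])


def apply_tf_attention(spec):
    """Average a frequency-reweighted and a time-reweighted copy of spec.

    Weights are multiplied by the axis length so that uniform attention
    leaves the input unchanged.
    """
    n_mels, n_frames = len(spec), len(spec[0])
    fw = frequency_weights(spec)
    tw = temporal_weights(spec)
    return [
        [0.5 * x * (fw[i] * n_mels + tw[j] * n_frames) for j, x in enumerate(row)]
        for i, row in enumerate(spec)
    ]


# Toy input: 3 Mel bands x 4 frames, with most energy in band 1 and frame 2.
toy_spec = [
    [0.1, 0.2, 0.3, 0.1],
    [0.5, 0.6, 2.0, 0.4],
    [0.1, 0.1, 0.4, 0.2],
]
attended = apply_tf_attention(toy_spec)
```

In this toy input the largest frequency weight falls on band 1 and the largest temporal weight on frame 2, so the high-energy cell is amplified while quieter bands and frames are attenuated. A trained model would learn these weights from data instead of deriving them from raw energy.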
Related papers
50 in total
  • [1] Environmental sound classification using temporal-frequency attention based convolutional neural network
    Mu, Wenjie
    Yin, Bo
    Huang, Xianqing
    Xu, Jiali
    Du, Zehua
    SCIENTIFIC REPORTS, 2021, 11 (01)
  • [2] Attention based convolutional recurrent neural network for environmental sound classification
    Zhang, Zhichao
    Xu, Shugong
    Zhang, Shunqing
    Qiao, Tianhao
    Cao, Shan
    NEUROCOMPUTING, 2021, 453 : 896 - 903
  • [3] Replay attack detection based on deformable convolutional neural network and temporal-frequency attention model
    Xie, Dang-en
    Hu, Hai-na
    Xu, Qiang
    JOURNAL OF INTELLIGENT SYSTEMS, 2023, 32 (01)
  • [4] A MULTI-CHANNEL TEMPORAL ATTENTION CONVOLUTIONAL NEURAL NETWORK MODEL FOR ENVIRONMENTAL SOUND CLASSIFICATION
    Wang, You
    Feng, Chuyao
    Anderson, David V.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 930 - 934
  • [5] High Accurate Environmental Sound Classification: Sub-Spectrogram Segmentation versus Temporal-Frequency Attention Mechanism
    Qiao, Tianhao
    Zhang, Shunqing
    Cao, Shan
    Xu, Shugong
    SENSORS, 2021, 21 (16)
  • [6] Polyphonic Sound Event Detection Using Temporal-Frequency Attention and Feature Space Attention
    Jin, Ye
    Wang, Mei
    Luo, Liyan
    Zhao, Dinghao
    Liu, Zhanqi
    SENSORS, 2022, 22 (18)
  • [7] Learning temporal-frequency features of physionet EEG signals using deep convolutional neural network
    Sorkhi, Maryam
    Jahed-Motlagh, Mohammad Reza
    Minaei-Bidgoli, Behrouz
    Daliri, Mohammad Reza
    INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 2023, 34 (04)
  • [8] Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification
    Li, Minze
    Huang, Wu
    Zhang, Tao
    NEURAL PROCESSING LETTERS, 2023, 55 (04) : 4291 - 4306
  • [9] Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification
    Minze Li
    Wu Huang
    Tao Zhang
    Neural Processing Letters, 2023, 55 : 4291 - 4306
  • [10] Global and Temporal-Frequency Attention Based Network in Audio Deepfake Detection
    Wang C.
    Yi J.
    Tao J.
    Ma H.
    Tian Z.
    Fu R.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2021, 58 (07) : 1466 - 1475