AtResNet: Residual Atrous CNN with Multi-scale Feature Representation for Low Complexity Acoustic Scene Classification

Cited by: 0
Authors
Madhu, Aswathy [1 ,3 ]
Suresh, K. [2 ,3 ]
Affiliations
[1] Coll Engn, Dept Elect & Commun, Thiruvananthapuram 695016, Kerala, India
[2] Govt Engn Coll, Dept Elect & Commun, Wayanad 670644, Kerala, India
[3] APJ Abdul Kalam Technol Univ, Thiruvananthapuram, Kerala, India
Keywords
Low complexity ASC; Wavelet transform; Atrous CNN; Residual CNN; DCASE; Convolutional neural networks; Data augmentation
DOI
10.1007/s00034-022-02107-2
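Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract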
Acoustic Scene Classification (ASC) aims to categorize real-world audio into one of the predetermined classes that identifies the recording environment of the audio. State-of-the-art ASC algorithms have excellent performance in terms of accuracy due to the emergence of deep learning algorithms. In particular, Convolutional Neural Networks (CNN) have set a new benchmark in ASC due to their promising performance. Despite the emergence of new frameworks, the interest in ASC is growing progressively with a shift of focus from enhancing accuracy to reducing model complexity. In this work, we introduce the AtResNet, a residual atrous CNN for low complexity acoustic scene classification. The AtResNet utilizes dilated convolutions and residual connections to reduce the number of model parameters. To further enhance the performance of AtResNet, we introduce a multi-scale feature representation method called multi-scale mel spectrogram (ms2). To compute the ms2, we evaluate the mel spectrogram on the wavelet subbands of the signal. We assessed AtResNet with ms2 on three benchmark datasets in ASC. The results suggest that our method significantly outperformed the CNN-based techniques in addition to a baseline system based on log mel spectrum for signal representation. AtResNet offers a 28.73% reduction in the model parameters against a baseline CNN. Furthermore, the AtResNet has a model size of 81 KB with post-training quantization of network weights. It makes AtResNet suitable for deployment in context-aware devices.
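The abstract names dilated (atrous) convolutions and residual connections as the two building blocks of AtResNet. The sketch below is a 1-D toy illustration of those two ideas only, not the authors' architecture (which operates on 2-D spectrogram features). The function names `dilated_conv1d`, `receptive_field`, and `residual_block` are hypothetical and chosen for this example. It shows why dilation is attractive for low-complexity models: the receptive field grows with the dilation rate while the number of kernel weights stays fixed.

```python
def receptive_field(kernel_size, dilation):
    """Input span covered by one dilated kernel: d * (k - 1) + 1 samples."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(x, kernel, dilation=1):
    """Zero-padded, same-length 1-D cross-correlation with a dilated kernel.

    The kernel taps are spaced `dilation` samples apart, so the number of
    weights (len(kernel)) does not change as the dilation grows.
    """
    k = len(kernel)
    span = dilation * (k - 1)              # distance between first and last tap
    pad_left = span // 2
    padded = [0.0] * pad_left + list(x) + [0.0] * (span - pad_left)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(x))
    ]


def residual_block(x, kernel, dilation):
    """Toy residual unit (an assumption for illustration): x + ReLU(conv(x))."""
    y = dilated_conv1d(x, kernel, dilation)
    y = [max(0.0, v) for v in y]           # ReLU non-linearity
    return [a + b for a, b in zip(x, y)]   # identity skip connection


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    kernel = [1, 0, -1]                    # 3 weights regardless of dilation
    for d in (1, 2, 4):
        print(d, receptive_field(len(kernel), d))
    print(residual_block(signal, kernel, dilation=2))
```

With a 3-tap kernel, dilation rates 1, 2, and 4 cover 3, 5, and 9 input samples respectively, all with the same three weights. The skip connection passes the input through unchanged wherever the convolution branch outputs zero.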
Pages: 7035 - 7056
Page count: 22
Related papers
50 in total
  • [1] AtResNet: Residual Atrous CNN with Multi-scale Feature Representation for Low Complexity Acoustic Scene Classification
    Aswathy Madhu
    K. Suresh
    Circuits, Systems, and Signal Processing, 2022, 41 : 7035 - 7056
  • [2] A CNN-Based Multi-Scale Pooling Strategy for Acoustic Scene Classification
    Huang, Rong
    Xie, Yue
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (01) : 153 - 156
  • [3] Multi-scale semantic feature fusion and data augmentation for acoustic scene classification
    Yang, Liping
    Tao, Lianjie
    Chen, Xinxing
    Gu, Xiaohua
    APPLIED ACOUSTICS, 2020, 163 (163)
  • [4] Multi-scale Bilateral-channels CNN for Scene Classification
    Yuan, Lei
    Hao, Kuangrong
    Tang, Xuesong
    Cai, Xin
    Ding, Yongsheng
    2018 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2018, 10836
  • [5] RQNet: Residual Quaternion CNN for Performance Enhancement in Low Complexity and Device Robust Acoustic Scene Classification
    Madhu, Aswathy
    Suresh, K.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8780 - 8792
  • [6] Acoustic Scene Classification using Convolutional Neural Networks and Multi-Scale Multi-Feature Extraction
    Dang, An
    Vu, Toan H.
    Wang, Jia-Ching
    2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2018
  • [7] A Multi-Scale Approach for Remote Sensing Scene Classification Based on Feature Maps Selection and Region Representation
    Zhang, Jun
    Zhang, Min
    Shi, Lukui
    Yan, Wenjie
    Pan, Bin
    REMOTE SENSING, 2019, 11 (21)
  • [8] Densely Connected CNN with Multi-scale Feature Attention for Text Classification
    Wang, Shiyao
    Huang, Minlie
    Deng, Zhidong
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018 : 4468 - 4474
  • [9] GLOBAL AND MULTI-SCALE FEATURE LEARNING FOR REMOTE SENSING SCENE CLASSIFICATION
    Xia, Ziying
    Gan, Guolong
    Liu, Siyu
    Cao, Wei
    Cheng, Jian
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022 : 655 - 658
  • [10] A multi-scale dense residual correlation network for remote sensing scene classification
    Dai, Wei
    Shi, Furong
    Wang, Xinyu
    Xu, Haixia
    Yuan, Liming
    Wen, Xianbin
    SCIENTIFIC REPORTS, 2024, 14 (01)