Robust DOA Estimation Using Multi-Scale Fusion Network with Attention Mask

Cited by: 1
Authors
Yan, Yuting [1]
Huang, Qinghua [1]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 11
Keywords
complex-valued neural network; direction-of-arrival; reverberant; multi-scale; attention; SPHERICAL MICROPHONE ARRAY; OF-ARRIVAL ESTIMATION; NEURAL-NETWORK; ACOUSTIC ANALYSIS; DIRECTION; LOCALIZATION; ALGORITHM; FRAMEWORK;
DOI
10.3390/app14114488
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
To overcome the limitations of traditional methods in reverberant and noisy environments, a robust multi-scale fusion neural network with an attention mask is designed to improve direction-of-arrival (DOA) estimation accuracy for acoustic sources. It combines the benefits of deep learning and complex-valued operations to handle the interference of reverberation and noise in speech signals. The properties of complex-valued signals are exploited to capture their inherent features, and the rich information they carry is preserved in the complex domain. An attention mask module generates distinct masks, conditioned on the input, for selective focusing and masking. The multi-scale fusion block then captures multi-scale spatial features by stacking complex-valued convolutional layers with small kernels, and reduces module complexity through special branching operations. Experimental results show that the model achieves significant improvements over other methods for speaker localization in reverberant and noisy environments. It provides a new solution for DOA estimation of acoustic sources in different scenarios, with both theoretical and practical implications.
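As a rough illustration of the idea of stacking complex-valued convolutions with small kernels, the NumPy sketch below builds a complex 1-D convolution from four real convolutions and checks that two stacked 3-tap layers cover the same span as one 5-tap kernel. This is not the authors' network: the function name `complex_conv1d`, the 1-D setting, the random data, and the absence of nonlinearities, masks, and branches are all assumptions made for the example.

```python
import numpy as np

def complex_conv1d(x, w):
    """'Valid' 1-D convolution of complex x with complex kernel w,
    assembled from four real convolutions:
    (xr + j*xi) * (wr + j*wi) = (xr*wr - xi*wi) + j*(xr*wi + xi*wr)."""
    xr, xi = x.real, x.imag
    wr, wi = w.real, w.imag
    real = np.convolve(xr, wr, mode="valid") - np.convolve(xi, wi, mode="valid")
    imag = np.convolve(xr, wi, mode="valid") + np.convolve(xi, wr, mode="valid")
    return real + 1j * imag

# Hypothetical toy data: one complex-valued sequence and two 3-tap kernels.
rng = np.random.default_rng(0)
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
w1 = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w2 = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# Two stacked 3-tap layers (no nonlinearity in between, for illustration only).
stacked = complex_conv1d(complex_conv1d(x, w1), w2)

# The equivalent single kernel has 3 + 3 - 1 = 5 taps.
w_eff = np.convolve(w1, w2)
single = complex_conv1d(x, w_eff)
```

Without a nonlinearity the two paths give identical outputs, which only shows that the receptive field grows with depth. In a trained network the activations between layers make the stacked version more expressive than a single larger kernel, while using 6 kernel weights instead of 5 in this toy case (the savings appear for larger kernels and in 2-D).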
Pages: 15