Robust DOA Estimation Using Multi-Scale Fusion Network with Attention Mask

Cited by: 1
Authors
Yan, Yuting [1]
Huang, Qinghua [1]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 11
Keywords
complex-valued neural network; direction-of-arrival; reverberant; multi-scale; attention; SPHERICAL MICROPHONE ARRAY; OF-ARRIVAL ESTIMATION; NEURAL-NETWORK; ACOUSTIC ANALYSIS; DIRECTION; LOCALIZATION; ALGORITHM; FRAMEWORK;
DOI
10.3390/app14114488
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
To overcome the limitations of traditional methods in reverberant and noisy environments, a robust multi-scale fusion neural network with attention mask is designed to improve direction-of-arrival (DOA) estimation accuracy for acoustic sources. It combines the benefits of deep learning and complex-valued operations to effectively deal with the interference of reverberation and noise in speech signals. The unique properties of complex-valued signals are exploited to fully capture inherent features and rich information is preserved in the complex field. An attention mask module is designed to generate distinct masks for selectively focusing and masking based on the input. After that, the multi-scale fusion block efficiently captures multi-scale spatial features by stacking complex-valued convolutional layers with small size kernels, and reduces the module complexity through special branching operations. Experimental results demonstrate that the model achieves significant improvements over other methods for speaker localization in reverberant and noisy environments. It provides a new solution for DOA estimation for acoustic sources in different scenarios, which has significant theoretical and practical implications.
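As a rough illustration of the complex-valued convolution with small kernels that the abstract mentions, the following is a minimal NumPy sketch, not the authors' network. The function name `complex_conv1d`, the signal length, and the random 3-tap kernels are assumptions made only for this example. It shows a complex convolution built from four real convolutions, and that two stacked 3-tap layers (with no nonlinearity in between) span the same input window as one 5-tap kernel.

```python
import numpy as np

def complex_conv1d(x, w):
    """Complex 1-D convolution assembled from four real convolutions.

    Uses (a + ib) * (c + id) = (ac - bd) + i(ad + bc).
    Hypothetical helper for illustration only.
    """
    xr, xi = x.real, x.imag
    wr, wi = w.real, w.imag
    real = np.convolve(xr, wr, mode="valid") - np.convolve(xi, wi, mode="valid")
    imag = np.convolve(xr, wi, mode="valid") + np.convolve(xi, wr, mode="valid")
    return real + 1j * imag

rng = np.random.default_rng(0)
# Assumed toy data: a 16-sample complex signal and two 3-tap complex kernels.
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
w1 = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w2 = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# Two stacked small-kernel layers (purely linear here).
stacked = complex_conv1d(complex_conv1d(x, w1), w2)

# The same linear map expressed as a single 5-tap kernel.
equivalent_kernel = np.convolve(w1, w2)
direct = np.convolve(x, equivalent_kernel, mode="valid")
```

In a trained network a nonlinearity would normally sit between the layers, so the equivalence above holds only for the linear case; the sketch is meant to show the receptive-field argument, not the model's actual behavior.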
Pages: 15
Related papers
50 items in total
  • [31] An attention-guided multi-scale fusion network for surgical instrument segmentation
    Song, Mengqiu
    Zhai, Chenxu
    Yang, Lei
    Liu, Yanhong
    Bian, Guibin
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 102
  • [32] MSFFA: a multi-scale feature fusion and attention mechanism network for crowd counting
    Li, Zhaoxin
    Lu, Shuhua
    Dong, Yishan
    Guo, Jingyuan
    VISUAL COMPUTER, 2023, 39 (03): 1045-1056
  • [33] Underwater Image Enhancement Based on Multi-Scale Feature Fusion and Attention Network
    Liu Y.
    Liu M.
    Lin S.
    Tao Z.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (05): 685-695
  • [34] Multi-scale attention fusion network for semantic segmentation of remote sensing images
    Wen, Zhiqiang
    Huang, Hongxu
    Liu, Shuai
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44 (24): 7909-7926
  • [35] Person Re-identification Based on Multi-scale Network Attention Fusion
    Wang Fenhua
    Zhao Bo
    Huang Chao
    Yan Youqi
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2020, 42 (12): 3045-3052
  • [36] Multi-scale feature fusion pyramid attention network for single image dehazing
    Liu, Jianlei
    Liu, Peng
    Zhang, Yuanke
    IET IMAGE PROCESSING, 2023, 17 (09): 2726-2735
  • [37] Multi-scale Spatial-Spectral Attention Guided Fusion Network for Pansharpening
    Yang, Yong
    Li, Mengzhen
    Huang, Shuying
    Lu, Hangyuan
    Tu, Wei
    Wan, Weiguo
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 3346-3354
  • [38] Small Object Detection using Multi-scale Feature Fusion and Attention
    Liu, Baokai
    Du, Shiqiang
    Li, Jiacheng
    Wang, Jianhua
    Liu, Wenjie
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022: 7246-7251
  • [39] Detecting herd pigs using multi-scale fusion attention mechanism
    Lin H.
    Zhang K.
    Li H.
    Liu Y.
    Chen Z.
    Ma Q.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2023, 39 (21): 188-195
  • [40] Multi-scale Attention Aided Multi-Resolution Network for Human Pose Estimation
    Selvam, Srinika
    Mishra, Deepak
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2019, PT I, 2019, 11941: 461-472