Speech Enhancement Performance Based on the MANNER Network Using Feature Fusion

Cited by: 0
Authors
Wang, Shijie [1]
Li, Ji [2]
Shao, Lei [2]
Liu, Hongli [2]
Zhu, Lihua [2]
Zhu, Xiaochen [1]
Affiliations
[1] Tianjin Univ Technol, Sch Elect Engn & Automat, Tianjin 300384, Peoples R China
[2] Tianjin Key Lab New Energy Power Convers Transmiss, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
speech enhancement; feature fusion; attention mechanisms; U-Net; MANNER
DOI
10.3390/electronics12081768
Chinese Library Classification (CLC) number
TP [Automation and Computer Technology]
Discipline classification code
0812
Abstract
The multi-view attention network for noise erasure (MANNER) cannot account for both local and global features when enhancing long speech sequences. To address this, an attention and feature fusion MANNER (AF-MANNER) network is proposed, which improves the multi-view attention (MA) module in MANNER and replaces the global and local attention in the module. AF-MANNER also adds a feature-weighted fusion module that fuses the features of flash attention and neighborhood attention to strengthen the network's feature representation. Ablation studies show that the network performs well in speech enhancement and that its structure helps improve the intelligibility and perceptual quality of speech.
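The abstract does not say how the feature-weighted fusion module combines the two attention branches. As a rough illustration only, one common form of weighted fusion is a convex combination of two feature vectors whose weights come from a softmax over learnable scalars. The sketch below assumes that form; the names `weighted_fusion`, `branch_a`, `branch_b`, and `logits` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of scalars."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(branch_a, branch_b, logits=(0.0, 0.0)):
    """Fuse two equally sized feature vectors with softmax-normalized weights.

    Assumption: branch_a and branch_b stand in for the outputs of the two
    attention branches; logits stand in for learnable fusion parameters.
    This is not the paper's actual module.
    """
    if len(branch_a) != len(branch_b):
        raise ValueError("feature vectors must have the same length")
    w_a, w_b = softmax(list(logits))
    return [w_a * a + w_b * b for a, b in zip(branch_a, branch_b)]


# Equal logits give equal weights, so the result is the element-wise mean.
fused = weighted_fusion([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because the weights are normalized to sum to one, the fused features stay on the same scale as the inputs, and raising one logit shifts the output toward that branch.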
Pages: 13