Speech Enhancement Performance Based on the MANNER Network Using Feature Fusion

Citations: 0

Authors:

Wang, Shijie [1]

Li, Ji [2]

Shao, Lei [2]

Liu, Hongli [2]

Zhu, Lihua [2]

Zhu, Xiaochen [1]

Affiliations:

[1] Tianjin Univ Technol, Sch Elect Engn & Automat, Tianjin 300384, Peoples R China

[2] Tianjin Key Lab New Energy Power Convers Transmiss, Tianjin 300384, Peoples R China

Source:

ELECTRONICS | 2023 / Vol. 12 / No. 08

Funding:

National Natural Science Foundation of China;

Keywords:

speech enhancement; feature fusion; attention mechanisms; U-Net; MANNER;

DOI:

10.3390/electronics12081768

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

The multi-view attention network for noise erasure (MANNER) cannot take both local and global features into account when enhancing long speech sequences. To address this problem, an attention and feature fusion MANNER (AF-MANNER) network is proposed. It improves the multi-view attention (MA) module in MANNER by replacing the module's global and local attention. AF-MANNER also introduces a feature-weighted fusion module that fuses the features of flash attention and neighborhood attention to strengthen the network's feature representation. Ablation studies show that the network performs well in speech enhancement and that its structure helps improve the intelligibility and perceptual quality of speech.

Pages: 13
