Design of a robust MVDR beamforming method with Low-Latency by reconstructing covariance matrix for speech enhancement

Cited by: 2

Authors:

Zhou, Jing [1]

Bao, Changchun [1]

Zhang, Xu [1]

Xiong, Wenmeng [1]

Jia, Maoshen [1]

Affiliation:

[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Source:

APPLIED ACOUSTICS | 2023 / Vol. 211

Funding:

National Natural Science Foundation of China;

Keywords:

Speech enhancement; Microphone array; Minimum variance distortionless response; Beamformer; Direction of arrival; STEERING VECTOR ESTIMATION; PERFORMANCE; ARRAY;

DOI:

10.1016/j.apacoust.2023.109464

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

To address the problems of the conventional minimum variance distortionless response (MVDR) beamformer in practical applications, such as sensitivity to steering vector mismatch and beam pattern distortion, this paper proposes a robust, low-latency broadband MVDR beamforming method based on covariance matrix reconstruction and applies it to speech enhancement with a linear microphone array. Several important steps are optimized, and the main contribution is the treatment of the correlation terms generated by the low latency. Firstly, the direction of arrival (DOA) is corrected and the steering vector is estimated based on the sparsity of the DOAs corresponding to the sound sources, which improves robustness to steering vector mismatch. Secondly, the correlation terms between the sound sources and noise are estimated and eliminated by the Capon power within the eigen-subspace, and the indirect dominant method is used to eliminate the correlation terms between the sound sources, so that the covariance matrix is reconstructed to obtain a more robust MVDR beamformer. Thirdly, the problem of white noise amplification at low frequency bins is analyzed, and a white noise gain (WNG) modification method is proposed to obtain a compromise between interference suppression and WNG. In the experiments, the TIMIT corpus is used to generate the multi-channel speech data set, and the performance of the proposed method is evaluated with different DOAs and input signal-to-interference-plus-noise ratios (SINRs). The experimental results show that the proposed method can effectively suppress the interference and reduce the noise with strong robustness. © 2023 Elsevier Ltd. All rights reserved.

Pages: 16

50 items in total

[21] Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Andreev, Pavel
Babaev, Nicholas
Shchekotov, Ivan
Saginbaev, Azat
Alanov, Aibek
INTERSPEECH 2023, 2023, : 2448 - 2452
[22] DNN-FREE LOW-LATENCY ADAPTIVE SPEECH ENHANCEMENT BASED ON FRAME-ONLINE BEAMFORMING POWERED BY BLOCK-ONLINE FASTMNMF
Nugraha, Aditya Arie
Sekiguchi, Kouhei
Fontaine, Mathieu
Bando, Yoshiaki
Yoshii, Kazuyoshi
2022 INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC 2022), 2022,
[23] Far Field Speech Enhancement at Low SNR in Presence of Nonstationary Noise Based on Spectral Masking and MVDR Beamforming
Astapov, Sergei
Lavrentyev, Aleksandr
Shuranov, Evgeniy
SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 21 - 31
[24] Low-complexity robust adaptive wideband beamforming based on covariance matrix reconstruction
Liu, Yaqi
Zhao, Yongjun
ELECTRONICS LETTERS, 2018, 54 (16) : 978 - 979
[25] LOW-COST ADAPTIVE MAXIMUM ENTROPY COVARIANCE MATRIX RECONSTRUCTION FOR ROBUST BEAMFORMING
Mohammadzadeh, Saeed
Nascimento, Vitor H.
de Lamare, Rodrigo C.
Kukrer, Osman
2020 54TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2020, : 1462 - 1466
[26] Robust adaptive beamforming via a novel subspace method for interference covariance matrix reconstruction
Yuan, Xiaolei
Gan, Lu
SIGNAL PROCESSING, 2017, 130 : 233 - 242
[27] DEEP LOW-LATENCY JOINT SPEECH TRANSMISSION AND ENHANCEMENT OVER A GAUSSIAN CHANNEL
Bokaeil, Mohammad
Jensen, Jesper
Doclo, Simon
Ostergaard, Jan
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024, 2024, : 525 - 529
[28] Improving Low-Latency Mono-Channel Speech Enhancement by Compensation Windows in STFT Analysis
Bui, Minh N.
Tran, Dung N.
Koishida, Kazuhito
Tran, Trac D.
Chin, Peter
COMPLEX NETWORKS & THEIR APPLICATIONS XII, VOL 1, COMPLEX NETWORKS 2023, 2024, 1141 : 363 - 373
[29] A Robust GSC Beamforming Method for Speech Enhancement using Linear Microphone Array
Ni, Feng
Zhou, Yi
Liu, Hongqing
2019 IEEE 21ST INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP 2019), 2019,
[30] A Simple RNN Model for Lightweight, Low-compute and Low-latency Multichannel Speech Enhancement in the Time Domain
Pandey, Ashutosh
Tan, Ke
Xu, Buye
INTERSPEECH 2023, 2023, : 2478 - 2482
