Factorized MVDR Deep Beamforming for Multi-Channel Speech Enhancement

Cited by: 4
Authors
Kim, Hansol [1]
Kang, Kyeongmuk [1]
Shin, Jong Won [1]
Affiliations
[1] Gwangju Inst Sci & Technol, Sch Elect Engn & Comp Sci, Gwangju 61005, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Speech enhancement; Estimation; Artificial neural networks; MISO communication; Array signal processing; Deep learning; Microphone arrays; Multi-channel speech enhancement; deep learning-based beamforming; factorized MVDR beamformer; NEURAL-NETWORK; SEPARATION; ATTENTION;
DOI
10.1109/LSP.2022.3200581
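Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code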
0808 ; 0809 ;
Abstract
Traditionally, adaptive beamformers such as the minimum-variance distortionless response (MVDR) beamformer and the generalized eigenvalue beamformer have been widely used for multi-channel speech enhancement, together with a single-channel postfilter. Recently, several approaches have been proposed that use deep neural networks (DNNs) to enhance the signals used to estimate the speech and noise spatial covariance matrices (SCMs) and to process the outputs of the beamformers. However, preprocessing the signals for SCM estimation may disrupt the phase relations among the input signals, and the time averages used to estimate the speech and noise SCMs may not be optimal for beamformer performance even when the estimated signals are close to the ground truth. In this letter, we propose a deep beamforming approach that estimates factors of the MVDR beamformer using a DNN, circumventing the difficulty of speech and noise SCM estimation. We formulate the MVDR beamformer in a factorized form involving two complex factors and estimate them using a DNN with a cost function that compares the beamformed signal with the original clean speech. Experimental results showed that the proposed factorized MVDR beamformer could mimic the characteristics of the MVDR beamformer with the true relative transfer function and noise SCM, and that it outperformed the MVDR beamformer with deep learning-based pre- and post-processing in terms of perceptual evaluation of speech quality scores.
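For orientation, the conventional MVDR beamformer that the abstract takes as its reference point can be sketched for a single time-frequency bin as w = Φ_n^{-1} h / (h^H Φ_n^{-1} h), where h is the relative transfer function and Φ_n the noise SCM. The sketch below is the textbook formula only, not the authors' factorized form: the abstract does not define the two complex factors, so they are not reproduced here. The function names and toy values are illustrative assumptions, and the code uses only the standard library to stay self-contained.

```python
def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def mvdr_weights(noise_scm, rtf):
    """Textbook MVDR weights: w = Phi_n^{-1} h / (h^H Phi_n^{-1} h)."""
    numerator = solve(noise_scm, rtf)  # Phi_n^{-1} h
    denominator = sum(h.conjugate() * v for h, v in zip(rtf, numerator))
    return [v / denominator for v in numerator]


def beamform(weights, mics):
    """Beamformer output for one time-frequency bin: y = w^H x."""
    return sum(w.conjugate() * x for w, x in zip(weights, mics))


# Toy two-microphone bin (values are assumptions chosen for illustration).
rtf = [1 + 0j, 0.5 - 0.5j]                  # h, relative to microphone 1
noise_scm = [[2 + 0j, 0.3 + 0.1j],
             [0.3 - 0.1j, 1 + 0j]]          # Hermitian, positive definite
weights = mvdr_weights(noise_scm, rtf)

clean = 2 + 1j                              # clean speech at the reference mic
noiseless_mics = [h * clean for h in rtf]   # x = h * s with no noise
output = beamform(weights, noiseless_mics)  # distortionless: recovers s
```

The distortionless constraint w^H h = 1 is what makes the noiseless output equal the clean reference signal; with noise present, the same weights minimize the residual noise power subject to that constraint.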
Pages: 1898-1902
Number of pages: 5
Related papers
50 in total
  • [41] Single channel speech enhancement using an MVDR filter in the frequency domain
    Sonay Kammi
    International Journal of Speech Technology, 2019, 22 : 383 - 389
  • [42] Single channel speech enhancement using an MVDR filter in the frequency domain
    Kammi, Sonay
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (02) : 383 - 389
  • [43] Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition
    Moritz, Niko
    Adiloglu, Kamil
    Anemueller, Joern
    Goetze, Stefan
    Kollmeier, Birger
    COMPUTER SPEECH AND LANGUAGE, 2017, 46 : 558 - 573
  • [44] A Novel Approach to Multi-Channel Speech Enhancement Based on Graph Neural Networks
    Chau, Hoang Ngoc
    Bui, Tien Dat
    Nguyen, Huu Binh
    Duong, Thanh Thi Hien
    Nguyen, Quoc Cuong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1133 - 1144
  • [45] A multi-channel subband generalized singular value decomposition approach to speech enhancement
    Spriet, A
    Moonen, M
    Wouters, J
    EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, 2002, 13 (02) : 149 - 158
  • [46] EXPLORING MULTI-CHANNEL FEATURES FOR DENOISING-AUTOENCODER-BASED SPEECH ENHANCEMENT
    Araki, Shoko
    Hayashi, Tomoki
    Delcroix, Marc
    Fujimoto, Masakiyo
    Takeda, Kazuya
    Nakatani, Tomohiro
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015 : 116 - 120
  • [47] Three-stage hybrid neural beamformer for multi-channel speech enhancement
    Kuang, Kelan
    Yang, Feiran
    Li, Junfeng
    Yang, Jun
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (06) : 3378 - 3389
  • [48] A dynamic multi-channel speech enhancement system for distributed microphones in a car environment
    Timo Matheja
    Markus Buck
    Tim Fingscheidt
    EURASIP Journal on Advances in Signal Processing, 2013
  • [49] A dynamic multi-channel speech enhancement system for distributed microphones in a car environment
    Matheja, Timo
    Buck, Markus
    Fingscheidt, Tim
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2013
  • [50] A SUPERVISED MULTI-CHANNEL SPEECH ENHANCEMENT ALGORITHM BASED ON BAYESIAN NMF MODEL
    Chung, Hanwook
    Plourde, Eric
    Champagne, Benoit
    2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018), 2018 : 221 - 225