Multichannel Linear Prediction-Based Speech Dereverberation Considering Sparse and Low-Rank Priors

Cited by: 1
Authors
Wang, Taihui [1 ,2 ]
Yang, Feiran [3 ]
Yang, Jun [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Key Lab Noise & Vibrat Res, Inst Acoust, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, State Key Lab Acoust, Inst Acoust, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Speech processing; Reverberation; Microphones; Time-frequency analysis; Shape; Indexes; Cost function; Speech dereverberation; multichannel linear prediction; weighted prediction error; complex generalized Gaussian; nonnegative matrix factorization; TIME; REVERBERATION; MASKING; DOMAIN; SYSTEM; NOISE;
DOI
10.1109/TASLP.2024.3369535
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This article addresses multichannel linear prediction (MCLP)-based speech dereverberation by jointly considering the sparse and low-rank priors of speech spectrograms. We use the complex generalized Gaussian (CGG) distribution as the source model and the generalized nonnegative matrix factorization (NMF) as the spectral model. The presented model differs from existing MCLP models in two ways. First, we adopt the CGG distribution with a time-frequency-varying scale parameter instead of a time-frequency-invariant one. Second, the time-frequency-varying scale parameter is approximated by NMF in a low-rank manner. Based on the maximum-likelihood criterion, speech dereverberation is formulated as an optimization problem that minimizes the prediction error weighted by the reciprocal of sparse and low-rank parameters. A convergence-guaranteed algorithm for estimating the parameters is derived using the majorization-minimization technique. The weighted prediction error (WPE) method, NMF-based WPE, and CGG-based WPE can be treated as special cases of the proposed method with different shape and domain parameters. As a byproduct, the proposed method provides a simple and elegant way to derive the CGG-based WPE algorithm. A series of experiments shows the superiority of the proposed method over the WPE, NMF-based WPE, and CGG-based WPE methods.
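The reweighting idea behind the weighted prediction error can be illustrated with a minimal sketch for a single frequency bin. This is not the algorithm of the article: it assumes a time-frequency-invariant scale parameter and omits the NMF low-rank model entirely, so it corresponds only to a sparse-prior (CGG-style) special case. The function names `stack_delayed` and `mclp_reweighted`, the weight rule `|d|^(2 - beta)`, and all default parameter values are assumptions made for illustration.

```python
import numpy as np


def stack_delayed(Y, delay, taps):
    """Stack `taps` delayed copies of all channels into an (M*taps, T) matrix."""
    M, T = Y.shape
    X = np.zeros((M * taps, T), dtype=Y.dtype)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            # Row block k holds the observations delayed by `shift` frames.
            X[k * M:(k + 1) * M, shift:] = Y[:, :T - shift]
    return X


def mclp_reweighted(Y, delay=2, taps=5, beta=0.5, n_iter=5, eps=1e-8):
    """Iteratively reweighted MCLP for one frequency bin (illustrative only).

    Y    : (M, T) complex STFT coefficients, M microphones, T frames.
    beta : assumed shape parameter; the weight |d|^(2 - beta) is an assumption
           of this sketch, not a formula taken from the article.
    Returns the (T,) prediction residual for the first (reference) channel.
    """
    Y = np.asarray(Y, dtype=complex)
    y_ref = Y[0]
    X = stack_delayed(Y, delay, taps)
    L = X.shape[0]
    d = y_ref.copy()
    for _ in range(n_iter):
        # Weights from the current residual estimate, floored for stability.
        lam = np.maximum(np.abs(d) ** (2.0 - beta), eps)
        Xw = X / lam                      # scale frame t by 1 / lam[t]
        R = Xw @ X.conj().T               # sum_t x_t x_t^H / lam_t
        p = Xw @ y_ref.conj()             # sum_t x_t conj(y_t) / lam_t
        # Small diagonal loading keeps the linear system well conditioned.
        R += 1e-10 * (np.trace(R).real / L + eps) * np.eye(L)
        g = np.linalg.solve(R, p)         # weighted least-squares filter
        d = y_ref - g.conj() @ X          # residual d_t = y_t - g^H x_t
    return d
```

A call such as `d = mclp_reweighted(Y)` on an `(M, T)` array of STFT coefficients for one frequency bin returns the residual for the first channel. Replacing the weight rule with one built from a low-rank model of the scale parameter is where a method of the kind described above would differ from this sketch.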
Pages: 1724-1735
Number of pages: 12
Related papers
50 items in total
  • [1] Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors
    Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg 26111, Germany
    Unknown
    3000, Belgium
    IEEE Trans. Audio Speech Lang. Process., 9 (1509-1520):
  • [2] Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors
    Jukic, Ante
    van Waterschoot, Toon
    Gerkmann, Timo
    Doclo, Simon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (09) : 1509 - 1520
  • [3] MULTI-CHANNEL LINEAR PREDICTION-BASED SPEECH DEREVERBERATION WITH LOW-RANK POWER SPECTROGRAM APPROXIMATION
    Jukic, Ante
    Mohammadiha, Nasser
    van Waterschoot, Toon
    Gerkmann, Timo
    Doclo, Simon
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 96 - 100
  • [4] Adaptive Speech Dereverberation Using Constrained Sparse Multichannel Linear Prediction
    Jukic, Ante
    van Waterschoot, Toon
    Doclo, Simon
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (01) : 101 - 105
  • [5] SPEECH DEREVERBERATION WITH MULTI-CHANNEL LINEAR PREDICTION AND SPARSE PRIORS FOR THE DESIRED SIGNAL
    Jukic, Ante
    van Waterschoot, Toon
    Gerkmann, Timo
    Doclo, Simon
    2014 4TH JOINT WORKSHOP ON HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS (HSCMA), 2014, : 23 - 26
  • [6] Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms
    Bando, Yoshiaki
    Itoyama, Katsutoshi
    Konyo, Masashi
    Tadokoro, Satoshi
    Nakadai, Kazuhiro
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    Okuno, Hiroshi G.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (02) : 215 - 230
  • [7] Online Speech Dereverberation Algorithm Based on Adaptive Multichannel Linear Prediction
    Yang, Jae-Mo
    Kang, Hong-Goo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (03) : 608 - 619
  • [8] A tensor decomposition based multichannel linear prediction approach to speech dereverberation
    Zeng, Xiaojin
    He, Hongsen
    Chen, Jingdong
    Benesty, Jacob
    APPLIED ACOUSTICS, 2023, 214
  • [9] Sparse Linear Prediction-based Dereverberation for Signal Enhancement in Distant Speaker Verification
    Witkowski, Marcin
    Rybicka, Magdalena
    Kowalczyk, Konrad
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 461 - 465
  • [10] Image Restoration: From Sparse and Low-Rank Priors to Deep Priors
    Zhang, Lei
    Zuo, Wangmeng
    IEEE SIGNAL PROCESSING MAGAZINE, 2017, 34 (05) : 172 - 179