Multichannel Linear Prediction-Based Speech Dereverberation Considering Sparse and Low-Rank Priors

Cited by: 1
Authors
Wang, Taihui [1, 2]
Yang, Feiran [3]
Yang, Jun [1, 2]
Affiliations
[1] Chinese Acad Sci, Key Lab Noise & Vibrat Res, Inst Acoust, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, State Key Lab Acoust, Inst Acoust, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Speech processing; Reverberation; Microphones; Time-frequency analysis; Shape; Indexes; Cost function; Speech dereverberation; multichannel linear prediction; weighted prediction error; complex generalized Gaussian; nonnegative matrix factorization; TIME; REVERBERATION; MASKING; DOMAIN; SYSTEM; NOISE;
DOI
10.1109/TASLP.2024.3369535
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This article addresses the multi-channel linear prediction (MCLP)-based speech dereverberation problem by jointly considering the sparsity and low-rank priors of speech spectrograms. We utilize the complex generalized Gaussian (CGG) distribution as the source model and the generalized nonnegative matrix factorization (NMF) as the spectral model. The difference between the presented model and existing ones for MCLP is twofold. First, we adopt the CGG distribution with a time-frequency-variant scale parameter instead of that with a time-frequency-invariant scale parameter. Second, the time-frequency-varying scale parameter is approximated by NMF in a low-rank manner. Based on the maximum-likelihood criterion, speech dereverberation is formulated as an optimization problem that minimizes the prediction error weighted by the reciprocal of sparse and low-rank parameters. A convergence-guaranteed algorithm is derived to estimate the parameters using the majorization-minimization technology. The WPE, NMF-based WPE and CGG-based WPE can be treated as special cases of the proposed method with different shape and domain parameters. As a byproduct, the proposed method provides a simple and elegant way to derive the CGG-based WPE algorithm. A series of experiments show the superiority of the proposed method over WPE, NMF-based WPE and CGG-based WPE methods.
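The weighted prediction-error criterion mentioned in the abstract can be illustrated with a minimal sketch for a single frequency bin. This is not the proposed algorithm: the record does not give the paper's CGG/NMF weight updates, so the loop below substitutes the classical WPE weight (the squared magnitude of the current estimate) as a stand-in. The function names, the default `delay` and `taps` values, and the small ridge term `eps` are all assumptions made for illustration.

```python
import numpy as np


def stack_delayed_frames(X, delay, taps):
    """Stack `taps` past frames of every channel, starting `delay` frames back.

    X has shape (channels, frames); the result has shape
    (channels * taps, frames). Frames before the signal start are zeros.
    """
    M, T = X.shape
    X_tilde = np.zeros((M * taps, T), dtype=X.dtype)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            X_tilde[k * M:(k + 1) * M, shift:] = X[:, :T - shift]
    return X_tilde


def weighted_prediction_step(X, weights, delay=2, taps=4, ref=0, eps=1e-6):
    """Solve for the filter g minimizing sum_t |x_t - g^H x~_t|^2 / w_t.

    Returns the prediction residual d (the dereverberated estimate for the
    reference channel) and the filter g. `eps` floors the weights and adds
    a small ridge term (an assumption, for numerical stability only).
    """
    X_tilde = stack_delayed_frames(X, delay, taps)
    inv_w = 1.0 / np.maximum(weights, eps)
    # Weighted correlation matrix and cross-correlation vector.
    R = (X_tilde * inv_w) @ X_tilde.conj().T
    r = (X_tilde * inv_w) @ X[ref].conj()
    R = R + eps * np.eye(R.shape[0])
    g = np.linalg.solve(R, r)
    d = X[ref] - g.conj() @ X_tilde
    return d, g


def dereverberate_bin(X, iterations=3, delay=2, taps=4, ref=0, eps=1e-6):
    """Alternate between weight and filter updates for one frequency bin.

    The weight update here is the classical WPE choice |d_t|^2, used as a
    placeholder for the sparse and low-rank weights described in the abstract.
    """
    d = X[ref].copy()
    for _ in range(iterations):
        weights = np.maximum(np.abs(d) ** 2, eps)
        d, _ = weighted_prediction_step(X, weights, delay, taps, ref, eps)
    return d


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic two-channel STFT frames for one frequency bin (illustrative only).
    X = rng.standard_normal((2, 200)) + 1j * rng.standard_normal((2, 200))
    d = dereverberate_bin(X)
    print(d.shape)
```

In this sketch, swapping the line that computes `weights` for a different rule is the only change needed to try another source model, which mirrors the abstract's remark that WPE, NMF-based WPE and CGG-based WPE arise as special cases of one weighted criterion.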
Pages: 1724-1735
Page count: 12
Related papers
50 in total
  • [21] SPEECH ENHANCEMENT BY SPARSE, LOW-RANK, AND DICTIONARY SPECTROGRAM DECOMPOSITION
    Chen, Zhuo
    Ellis, Daniel P. W.
    2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2013,
  • [22] Kronecker Product Multichannel Linear Filtering for Adaptive Weighted Prediction Error-Based Speech Dereverberation
    Huang, Gongping
    Benesty, Jacob
    Cohen, Israel
    Chen, Jingdong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1277 - 1289
  • [23] Pansharpening Based on Low-Rank and Sparse Decomposition
    Rong, Kaixuan
    Jiao, Licheng
    Wang, Shuang
    Liu, Fang
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2014, 7 (12) : 4793 - 4805
  • [24] Deep Neural Network Based Monaural Speech Enhancement with Sparse and Low-Rank Decomposition
    Shi, Wenhua
    Zhang, Xiongwei
    Sun, Meng
    Zou, Xia
    Wei, Yanmin
    Min, Gang
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT 2017), 2017, : 1644 - 1647
  • [25] A novel speech enhancement method based on constrained low-rank and sparse matrix decomposition
    Sun, Chengli
    Zhu, Qi
    Wan, Minghua
    SPEECH COMMUNICATION, 2014, 60 : 44 - 55
  • [26] Multicolor low-rank preconditioner for general sparse linear systems
    Zheng, Qingqing
    Xi, Yuanzhe
    Saad, Yousef
    NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2020, 27 (04)
  • [27] Dynamic Low-Rank and Sparse Priors Constrained Deep Autoencoders for Hyperspectral Anomaly Detection
    Lin, Sheng
    Zhang, Min
    Cheng, Xi
    Shi, Lei
    Gamba, Paolo
    Wang, Hai
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 18
  • [28] Blind speech dereverberation using sparse decomposition and multi-channel linear prediction
    Mousavi, Leila
    Razzazi, Farbod
    Haghbin, Afrooz
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (03) : 729 - 738
  • [29] Blind speech dereverberation using sparse decomposition and multi-channel linear prediction
    Leila Mousavi
    Farbod Razzazi
    Afrooz Haghbin
    International Journal of Speech Technology, 2019, 22 : 729 - 738
  • [30] SPIKE SORTING BASED ON LOW-RANK AND SPARSE REPRESENTATION
    Huang, Libo
    Ling, Bingo Wing-Kuen
    Zeng, Yan
    Gan, Lu
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020,