Jointly Gaussian PDF-Based Likelihood Ratio Test for Voice Activity Detection

Cited by: 16
Authors
Gorriz, Juan Manuel [1]
Ramirez, Javier [1]
Lang, Elmar W. [2]
Puntonet, Carlos G. [3]
Affiliations
[1] Univ Granada, Dept Signal Theory Networking & Commun, E-18071 Granada, Spain
[2] Univ Regensburg, Inst Biophys & Phys Biochem, D-93040 Regensburg, Germany
[3] Univ Granada, Dept Comp Architecture & Technol, E-18071 Granada, Spain
Keywords
Generalized complex Gaussian (GCG) probability distribution function; robust speech recognition; voice activity detection (VAD)
DOI
10.1109/TASL.2008.2004293
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a novel voice activity detector (VAD) for improving speech detection robustness in noisy environments and the performance of speech recognition systems in real-time applications. The algorithm is based on a generalized complex Gaussian (GCG) observation model and defines an optimal likelihood ratio test (LRT) involving multiple and correlated observations (MCO) based on jointly Gaussian probability distribution functions (jGpdf). An extensive analysis of the proposed methodology for a low-dimensional observation model demonstrates 1) the improved robustness of the proposed approach by means of a clear reduction of the classification error as the number of observations is increased, and 2) the tradeoff between the number of observations and the detection performance. The proposed strategy is also compared to different VAD methods, including the G.729, AMR, and AFE standards, as well as other recently reported algorithms, showing a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
Pages: 1565-1578
Number of pages: 14