Jointly Gaussian PDF-Based Likelihood Ratio Test for Voice Activity Detection

Cited by: 16

Authors:

Gorriz, Juan Manuel [1]

Ramirez, Javier [1]

Lang, Elmar W. [2]

Puntonet, Carlos G. [3]

Affiliations:

[1] Univ Granada, Dept Signal Theory Networking & Commun, E-18071 Granada, Spain

[2] Univ Regensburg, Inst Biophys & Phys Biochem, D-93040 Regensburg, Germany

[3] Univ Granada, Dept Comp Architecture & Technol, E-18071 Granada, Spain

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2008 / Vol. 16 / No. 8

Keywords:

Generalized complex Gaussian (GCG) probability distribution function; robust speech recognition; voice activity detection (VAD)

DOI:

10.1109/TASL.2008.2004293

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

This paper presents a novel voice activity detector (VAD) for improving speech detection robustness in noisy environments and the performance of speech recognition systems in real-time applications. The algorithm is based on a generalized complex Gaussian (GCG) observation model and defines an optimal likelihood ratio test (LRT) involving multiple and correlated observations (MCO) based on jointly Gaussian probability distribution functions (jGpdf). An extensive analysis of the proposed methodology for a low-dimensional observation model demonstrates 1) the improved robustness of the proposed approach by means of a clear reduction of the classification error as the number of observations is increased, and 2) the tradeoff between the number of observations and the detection performance. The proposed strategy is also compared to different VAD methods, including the G.729, AMR, and AFE standards as well as other recently reported algorithms, showing a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
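
As an illustration of the kind of test the abstract describes, the following is a minimal sketch of a likelihood ratio test on two correlated observations modeled as jointly Gaussian. It is not the paper's algorithm: it assumes a zero-mean, real-valued, two-dimensional observation (the paper works with a generalized complex Gaussian model), and the function names, the covariance inputs, and the `threshold` parameter are hypothetical names chosen for this example.

```python
import math


def log_likelihood_ratio_2d(x, cov_noise, cov_speech):
    """Return log p(x | H1) - log p(x | H0) for a zero-mean, real-valued,
    two-dimensional Gaussian observation x = (x1, x2).

    cov_noise  : 2x2 covariance under H0 (noise only), assumed known
    cov_speech : 2x2 covariance under H1 (speech plus noise), assumed known
    Both matrices are assumed symmetric; only the upper triangle is read.
    """
    x1, x2 = x

    def quad_form_and_logdet(cov):
        (a, b), (_, d) = cov
        det = a * d - b * b
        if a <= 0.0 or det <= 0.0:
            raise ValueError("covariance must be positive definite")
        # x^T C^{-1} x using the closed-form inverse of a 2x2 matrix
        quad = (d * x1 * x1 - 2.0 * b * x1 * x2 + a * x2 * x2) / det
        return quad, math.log(det)

    q0, logdet0 = quad_form_and_logdet(cov_noise)
    q1, logdet1 = quad_form_and_logdet(cov_speech)
    # The (2*pi) normalization constants cancel in the ratio.
    return 0.5 * (q0 - q1) + 0.5 * (logdet0 - logdet1)


def is_speech(x, cov_noise, cov_speech, threshold=0.0):
    """Decide H1 (speech) when the log-likelihood ratio exceeds the threshold."""
    return log_likelihood_ratio_2d(x, cov_noise, cov_speech) > threshold


# Usage: correlated observations, with higher variance when speech is present.
noise_cov = ((1.0, 0.3), (0.3, 1.0))
speech_cov = ((4.0, 2.0), (2.0, 4.0))
print(is_speech((2.5, 2.0), noise_cov, speech_cov))
```

The off-diagonal covariance terms are what distinguish this from a test that treats the observations as independent; setting them to zero reduces the statistic to a sum of per-observation terms.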


Pages: 1565-1578

Page count: 14
