Jointly Gaussian PDF-Based Likelihood Ratio Test for Voice Activity Detection

Cited by: 16
Authors
Gorriz, Juan Manuel [1]
Ramirez, Javier [1]
Lang, Elmar W. [2]
Puntonet, Carlos G. [3]
Affiliations
[1] Univ Granada, Dept Signal Theory Networking & Commun, E-18071 Granada, Spain
[2] Univ Regensburg, Inst Biophys & Phys Biochem, D-93040 Regensburg, Germany
[3] Univ Granada, Dept Comp Architecture & Technol, E-18071 Granada, Spain
Keywords
Generalized complex Gaussian (GCG) probability distribution function; robust speech recognition; voice activity detection (VAD)
DOI
10.1109/TASL.2008.2004293
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper presents a novel voice activity detector (VAD) for improving speech detection robustness in noisy environments and the performance of speech recognition systems in real-time applications. The algorithm is based on a generalized complex Gaussian (GCG) observation model and defines an optimal likelihood ratio test (LRT) involving multiple and correlated observations (MCO) based on jointly Gaussian probability distribution functions (jGpdf). An extensive analysis of the proposed methodology for a low dimensional observation model demonstrates 1) the improved robustness of the proposed approach by means of a clear reduction of the classification error as the number of observations is increased, and 2) the tradeoff between the number of observations and the detection performance. The proposed strategy is also compared to different VAD methods including the G.729, AMR, and AFE standards, as well as other recently reported algorithms showing a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
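The likelihood ratio test over correlated observations described in the abstract can be illustrated with a minimal sketch. This is not the authors' GCG formulation: it assumes two real-valued, zero-mean observations per decision instead of complex DFT coefficients, and the covariance matrices, the threshold, and the function names (`gaussian2_logpdf`, `log_likelihood_ratio`, `is_speech`) are hypothetical choices made only for illustration.

```python
import math


def gaussian2_logpdf(x, cov):
    """Log-density of a zero-mean bivariate real Gaussian with 2x2 covariance `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Quadratic form x^T cov^{-1} x, using the closed-form inverse of a 2x2 matrix.
    q = (d * x[0] ** 2 - (b + c) * x[0] * x[1] + a * x[1] ** 2) / det
    return -0.5 * q - math.log(2.0 * math.pi) - 0.5 * math.log(det)


def log_likelihood_ratio(x, cov_noise, cov_speech):
    """log p(x | speech present) - log p(x | noise only) for two correlated observations."""
    return gaussian2_logpdf(x, cov_speech) - gaussian2_logpdf(x, cov_noise)


def is_speech(x, cov_noise, cov_speech, threshold=0.0):
    """Decide 'speech' when the log-likelihood ratio exceeds the (assumed) threshold."""
    return log_likelihood_ratio(x, cov_noise, cov_speech) > threshold


# Hypothetical covariances: noise is weak and nearly uncorrelated,
# speech plus noise is stronger and correlated across the two observations.
COV_NOISE = [[1.0, 0.2], [0.2, 1.0]]
COV_SPEECH = [[5.0, 3.0], [3.0, 5.0]]

if __name__ == "__main__":
    for frame in [(0.1, -0.2), (3.0, 2.5)]:
        print(frame, is_speech(frame, COV_NOISE, COV_SPEECH))
```

In this toy setting the off-diagonal covariance terms are what let the test exploit correlation between observations; setting them to zero reduces it to a product of independent Gaussian likelihoods.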
Pages: 1565 - 1578
Number of pages: 14
Related papers
  • [1] Effective jointly pdf-based voice activity detector for real-time applications
    Gorriz, J. M.
    Ramirez, J.
    Puntonet, C. G.
    ELECTRONICS LETTERS, 2007, 43 (04) : 251 - 253
  • [2] Improved voice activity detection based on statistical likelihood ratio test
    School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
    J. Harbin Inst. Technol., 2007, SUPPL. 2 (64-67):
  • [3] Likelihood ratio sign test for voice activity detection
    Deng, S.
    Han, J.
    IET SIGNAL PROCESSING, 2012, 6 (04) : 306 - 312
  • [4] Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test
    Shiwen Deng
    Jiqing Han
    EURASIP Journal on Audio, Speech, and Music Processing, 2011
  • [5] Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test
    Deng, Shiwen
    Han, Jiqing
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2011, : 1 - 12
  • [6] Towards Robust Detection of PDF-based Malware
    Tay, Kai Yuan
    Chua, Shawn
    Chua, Melissa
    Balachandran, Vivek
    CODASPY'22: PROCEEDINGS OF THE TWELFTH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, 2022, : 370 - 372
  • [7] Statistical voice activity detection using a multiple observation likelihood ratio test
    Ramírez, J
    Segura, JC
    Benítez, C
    García, L
    Rubio, A
    IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (10) : 689 - 692
  • [8] Robust Statistical Voice Activity Detection Using a Likelihood Ratio Sign Test
    Deng, Shiwen
    Han, Jiqing
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 3126 - 3129
  • [9] Polishing the Classical Likelihood Ratio Test by Supervised Learning for Voice Activity Detection
    Xu, Tianjiao
    Zhang, Hui
    Zhang, Xueliang
    INTERSPEECH 2020, 2020, : 3675 - 3679
  • [10] VOICE ACTIVITY DETECTION USING HARMONIC FREQUENCY COMPONENTS IN LIKELIHOOD RATIO TEST
    Lee Ngee Tan
    Borgstrom, Bengt J.
    Alwan, Abeer
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4466 - 4469