Noise Robust Voice Activity Detection Using Features Extracted From the Time-Domain Autocorrelation Function

Cited by: 0
Authors
Ghaemmaghami, Houman [1]
Baker, Brendan [1]
Vogt, Robbie [1]
Sridharan, Sridha [1]
Affiliations
[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
Keywords
voice activity detection; high noise; autocorrelation; zero-crossing rate; time-domain analysis; SPEECH;
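DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification code
0808; 0809
Abstract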
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The scores output by the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. Accuracy of AZR was compared to state-of-the-art and standardised VAD methods and was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database created using real recordings from high-noise environments.
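The ingredients named in the abstract (peak of the normalised autocorrelation, zero-crossing rate of that autocorrelation, weighted sum fusion) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the CrossCorr feature, so the second score here is a hypothetical stand-in (one minus the zero-crossing rate of the autocorrelation), and the function names, lag range, frame length, and fusion weight are all assumptions chosen for illustration.

```python
import math


def normalised_autocorrelation(frame):
    """Return r[k] / r[0] for lags k = 0 .. len(frame) - 1 (mean removed)."""
    n = len(frame)
    mean = sum(frame) / n
    centred = [x - mean for x in frame]
    energy = sum(x * x for x in centred)
    if energy == 0.0:
        return [0.0] * n  # constant frame: no correlation structure
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def max_peak(acf, min_lag, max_lag):
    """Largest autocorrelation value inside an assumed pitch-lag range."""
    return max(acf[min_lag:max_lag + 1])


def zero_crossing_rate(values):
    """Fraction of adjacent pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(values, values[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / (len(values) - 1)


def weighted_sum_fusion(score_a, score_b, weight):
    """Convex combination of two scores; weight is an assumed tuning value."""
    return weight * score_a + (1.0 - weight) * score_b


def frame_score(frame, min_lag, max_lag, weight=0.5):
    """Hypothetical fused voicing score for one frame (higher = more periodic)."""
    acf = normalised_autocorrelation(frame)
    peak_score = max_peak(acf, min_lag, max_lag)
    # Stand-in for the paper's CrossCorr score, which the abstract does not define:
    # a periodic signal gives a slowly oscillating autocorrelation, hence a low ZCR.
    periodicity_score = 1.0 - zero_crossing_rate(acf)
    return weighted_sum_fusion(peak_score, periodicity_score, weight)


# Usage: a 30 ms frame of a 200 Hz tone sampled at 8 kHz (assumed values).
SAMPLE_RATE = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * i / SAMPLE_RATE) for i in range(240)]
# Lags 20..160 correspond to pitch between 400 Hz and 50 Hz at 8 kHz.
tone_score = frame_score(tone, min_lag=20, max_lag=160)
```

A decision would then follow by comparing the fused score against a threshold tuned on development data; the abstract does not state how the published system sets its threshold or weights.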
Pages: 3118-3121
Page count: 4
Related papers
50 in total
  • [41] Robust Voice Activity Detection Using Frequency Domain Long-Term Differential Entropy
    Ghosh, Debayan
    Muralishankar, R.
    Gurugopinath, Sanjeev
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1220 - 1224
  • [42] Robust voice activity detection using higher-order statistics in the LPC residual domain
    Nemer, E
    Goubran, R
    Mahmoud, S
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (03): : 217 - 231
  • [43] Efficient Method for Modeling of SSN Using Time-Domain Impedance Function and Noise Suppression Analysis
    Ding, Tong-Hao
    Li, Yu-Shan
    IEEE TRANSACTIONS ON COMPONENTS PACKAGING AND MANUFACTURING TECHNOLOGY, 2012, 2 (03): : 510 - 520
  • [44] Robust Voice Activity Detection Using Feature Combination
    Haghani, Sahar Khaksar
    Ahadi, Seyed Mohammad
    2013 21ST IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2013,
  • [45] Noise reduction in analysis of dielectric response function by time-domain reflectometry
    Artacho, JM
    Forniés-Marquina, JM
    García, M
    Letosa, J
    15EME COLLOQUE INTERNATIONAL OPTIQUE HERTZIENNE ET DIELECTRIQUES, OHD'99, 1999, : D1 - D4
  • [46] Classification of radar clutter using features extracted from the time-frequency domain
    Jouny, I
    Wu, C
    AUTOMATIC TARGET RECOGNITION VII, 1997, 3069 : 49 - 60
  • [47] Voice Activity Detection Using Entropy in Spectrum Domain
    Asgari, Meysam
    Sayadian, Abolghasem
    Farhadloo, Mohsen
    Mehrizi, Elahe Abouie
    ATNAC: 2008 AUSTRALASIAN TELECOMMUNICATION NETWORKS AND APPLICATIONS CONFERENCE, 2008, : 407 - +
  • [48] Linearly filtered estimation of the time-domain Green's function from measurements of ambient noise
    Albahrani, S. A.
    Frater, M. R.
    Huntington, E. H.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2008, 124 (05): : 2699 - 2701
  • [49] Detection and Classification of Noise Using Bark Domain Features
    Mohdiwale, Samrudhi
    Sahu, Tirath Prasad
    Chaurasia, Rahul K.
    Nagwani, Naresh Kumar
    Verma, Shrish
    PROCEEDINGS OF 2018 6TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND BROADBAND NETWORKING (ICCBN 2018), 2018, : 18 - 21
  • [50] Detection of vocal fold paralysis and oedema using time-domain features and Probabilistic Neural Network
    Hariharan, M.
    Paulraj, M. P.
    Yaacob, Sazali
    INTERNATIONAL JOURNAL OF BIOMEDICAL ENGINEERING AND TECHNOLOGY, 2011, 6 (01) : 46 - 57