Robust Voice Activity Detector for Real World Applications Using Harmonicity and Modulation frequency

Cited by: 0
Authors
Chuangsuwanich, Ekapol [1]
Glass, James [1]
Affiliations
[1] MIT Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Keywords
voice activity detection; modulation frequency; harmonicity; human-robot interaction; SPEECH;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The task of robustly detecting distant speech in low-SNR environments for automatic speech recognition (ASR) is examined using a two-stage approach based on two distinguishing features of speech, namely harmonicity and modulation frequency (MF). A modified metric for harmonicity is used as a gating function to a set of parallel classifiers that incorporate MFs computed on different frequency bands. Performance is evaluated on both frame-level discriminative power and system-level ASR results on a real-world robotic forklift task. Compared to previously proposed features such as relative spectral entropy, and to classification strategies involving MFs, the combined approach shows good generalization across different kinds of dynamic noise conditions and obtains a significant improvement in the false alarm rate at low speech miss rate settings. The overall ASR results also improved significantly compared to the ETSI AMR-VAD2, while reducing the number of false alarms by a factor of two.
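The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the "modified metric for harmonicity", so the peak of the normalized autocorrelation over plausible pitch lags is assumed here as a stand-in, and the modulation-frequency feature is assumed to be the DFT magnitude of a log-energy trajectory at a chosen modulation frequency. The function names, the 80-400 Hz pitch range, and the 0.5 gate threshold are all hypothetical, and the parallel per-band classifiers are left out.

```python
import math


def frames_of(x, frame_len, hop):
    """Split a sample list into overlapping frames."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def harmonicity(frame, sr, f0_min=80.0, f0_max=400.0):
    """Assumed stand-in metric: peak of the normalized autocorrelation
    over lags corresponding to pitches between f0_min and f0_max."""
    mean = sum(frame) / len(frame)
    x = [v - mean for v in frame]
    energy = sum(v * v for v in x)
    if energy <= 0.0:
        return 0.0
    lo, hi = int(sr / f0_max), int(sr / f0_min)
    best = 0.0
    for lag in range(lo, min(hi, len(x) - 1) + 1):
        r = sum(a * b for a, b in zip(x, x[lag:])) / energy
        best = max(best, r)
    return best


def gate_frames(x, sr, frame_len, hop, threshold=0.5):
    """Stage 1 (hypothetical threshold): True for frames harmonic enough
    to be passed on to the second-stage classifiers."""
    return [harmonicity(f, sr) > threshold for f in frames_of(x, frame_len, hop)]


def modulation_magnitude(log_energies, frame_rate, mf_hz):
    """DFT magnitude of the mean-removed log-energy trajectory at mf_hz,
    normalized by the number of frames."""
    mean = sum(log_energies) / len(log_energies)
    re = im = 0.0
    for n, e in enumerate(log_energies):
        phase = 2.0 * math.pi * mf_hz * n / frame_rate
        re += (e - mean) * math.cos(phase)
        im -= (e - mean) * math.sin(phase)
    return math.hypot(re, im) / len(log_energies)
```

In a full system along the lines the abstract describes, `modulation_magnitude` would be evaluated per frequency band at several modulation frequencies, and only frames passing `gate_frames` would reach the classifiers.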
Pages: 2656-2659
Page count: 4
Related papers
50 in total
  • [21] Robust Voice Activity Detector by Combining Sequentially Trained Deep Neural Networks
    Nahar, S. M. Raufun
    Kai, Atsuhiko
    2016 INTERNATIONAL CONFERENCE ON ADVANCED INFORMATICS - CONCEPTS, THEORY AND APPLICATION (ICAICTA), 2016,
  • [22] A robust polynomial regression-based voice activity detector for speaker verification
    Disken, Gokay
    Tufekci, Zekeriya
    Cevik, Ulus
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2017,
  • [23] A Robust Mel-Scale Subband Voice Activity Detector for a Car Platform
    Alvarez, A.
    Martinez, R.
    Gomez, P.
    Nieto, V.
    Rodellar, V.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 825 - 828
  • [24] A robust polynomial regression-based voice activity detector for speaker verification
    Gökay Dişken
    Zekeriya Tüfekci
    Ulus Çevik
    EURASIP Journal on Audio, Speech, and Music Processing, 2017
  • [25] Robust Voice Activity Detection Using Feature Combination
    Haghani, Sahar Khaksar
    Ahadi, Seyed Mohammad
    2013 21ST IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2013,
  • [26] A voice activity detector using the chi-square test
    Ahmed, B
    Holmes, WH
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 625 - 628
  • [27] Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency
    Arifianto, D
    Tanaka, T
    Masuko, T
    Kobayashi, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (12) : 2812 - 2820
  • [28] ChartStamp: Robust Chart Embedding for Real-World Applications
    Fu, Jiayun
    Zhu, Bin B.
    Zhang, Haidong
    Zou, Yayi
    Ge, Song
    Cui, Weiwei
    Wang, Yun
    Zhang, Dongmei
    Ma, Xiaojing
    Jin, Hai
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 2786 - 2795
  • [29] Real-Time Implementation of Voice Activity Detector on ARM Embedded Processor of Smartphones
    Sehgal, Abhishek
    Saki, Fatemeh
    Kehtarnavaz, Nasser
    2017 IEEE 26TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2017, : 1285 - 1290
  • [30] Robust voice activity detection using group delay functions
    Krishnan, Sree Hari P.
    Padmanabhan, R.
    Murthy, Hema A.
    2006 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY, VOLS 1-6, 2006, : 1704 - +