Robust Voice Activity Detector for Real World Applications Using Harmonicity and Modulation Frequency

Cited by: 0
Authors
Chuangsuwanich, Ekapol [1]
Glass, James [1]
Affiliations
[1] MIT Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Keywords
voice activity detection; modulation frequency; harmonicity; human-robot interaction; SPEECH;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The task of robustly detecting distant speech in low-SNR environments for automatic speech recognition (ASR) is examined using a two-stage approach based on two distinguishing features of speech: harmonicity and modulation frequency (MF). A modified harmonicity metric is used as a gating function for a set of parallel classifiers that incorporate MFs computed on different frequency bands. Performance is evaluated both on frame-level discriminative power and on system-level ASR results for a real-world robotic forklift task. Compared with previously proposed features such as relative spectral entropy, and with classification strategies involving MFs, the combined approach generalizes well across different kinds of dynamic noise conditions and significantly improves the false alarm rate at low speech miss rate settings. The overall ASR results also improved significantly compared with the ETSI AMR-VAD2, while the number of false alarms was reduced by a factor of two.
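The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the harmonicity measure below is a generic normalized-autocorrelation peak standing in for the paper's modified metric, the pitch range, the gate threshold, and the frame-by-frame gating are assumptions, and `second_stage` is a hypothetical caller-supplied function standing in for the MF-based parallel classifiers, which this record does not describe.

```python
import math


def harmonicity(frame, sample_rate, f0_min=80.0, f0_max=400.0):
    """Peak of the normalized autocorrelation over plausible pitch lags.

    Generic stand-in (assumption), not the paper's modified harmonicity metric.
    Returns a value near 1 for strongly periodic frames, near 0 for noise.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0  # silent frame: no periodicity to measure
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return best


def two_stage_vad(frames, sample_rate, second_stage, gate_threshold=0.5):
    """Gate each frame on harmonicity, then defer to a second-stage classifier.

    `second_stage` is a hypothetical callable (frame -> bool); `gate_threshold`
    is an assumed value chosen only for illustration.
    """
    decisions = []
    for frame in frames:
        if harmonicity(frame, sample_rate) < gate_threshold:
            decisions.append(False)  # rejected by the gate, second stage skipped
        else:
            decisions.append(bool(second_stage(frame)))
    return decisions
```

In this sketch the gate cheaply rejects frames with no periodic structure, so the second stage only has to separate speech from other harmonic sounds; whether the paper applies its gate per frame or over longer segments is not stated in this record.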
Pages: 2656-2659
Page count: 4
Related papers
50 in total
  • [41] Robust Real-Time Fire Detector Using CNN And LSTM
    Abdali, Al Maamoon Rasool
    Ghani, Rana Fareed
    2019 17TH IEEE STUDENT CONFERENCE ON RESEARCH AND DEVELOPMENT (SCORED), 2019, : 204 - 207
  • [42] A Weighted Feature Voting Approach for Robust and Real-Time Voice Activity Detection
    Moattar, Mohammad Hossein
    Homayounpour, Mohammad Mehdi
    ETRI JOURNAL, 2011, 33 (01) : 99 - 109
  • [44] A Robust, Real-Time Voice Activity Detection Algorithm for Embedded Mobile Devices
    Wu, Bian
    Ren, Xiaolin
    Liu, Chongqing
    Zhang, Yaxin
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2005, 8 (02) : 133 - 146
  • [45] Robust Voice Activity Detection Using the Spectral Peaks of Vowel Sounds
    Yoo, In-Chul
    Yook, Dongsuk
    ETRI JOURNAL, 2009, 31 (04) : 451 - 453
  • [46] Voice Activity Detector (VAD) Based on Long-Term Mel Frequency Band Features
    Salishev, Sergey
    Barabanov, Andrey
    Kocharov, Daniil
    Skrelin, Pavel
    Moiseev, Mikhail
    TEXT, SPEECH, AND DIALOGUE, 2016, 9924 : 352 - 358
  • [47] An improved noise-robust voice activity detector based on hidden semi-Markov models
    Liang, Yuan
    Liu, Xianglong
    Lou, Yihua
    Shan, Baosong
    PATTERN RECOGNITION LETTERS, 2011, 32 (07) : 1044 - 1053
  • [48] Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments
    Morita, Shota
    Unoki, Masashi
    Lu, Xugang
    Akagi, Masato
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 108 - +
  • [49] Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments
    Morita, Shota
    Unoki, Masashi
    Lu, Xugang
    Akagi, Masato
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2016, 82 (02) : 163 - 173