A Front-End Speech Enhancement System for Robust Automotive Speech Recognition

Authors
Wang, Haikun [1 ]
Ye, Zhongfu [1 ]
Chen, Jingdong [2 ]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Anhui, Peoples R China
[2] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian 710072, Shaanxi, Peoples R China
Keywords
Speech enhancement; model-based; voice activity detection; microphone array; relative transfer function estimation; generalized sidelobe cancellation; speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines model-based voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancellation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, training data are recorded in four typical driving scenarios, which together cover most of the noise types encountered in automobiles. The recorded data are then used to train Gaussian mixture models (GMMs) for both speech and noise. The trained GMMs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability then serves as the basic information for RTF estimation, adaptive beamforming, and post-filtering. Experiments conducted in real automotive environments show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
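The frame-wise speech presence probability described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes diagonal-covariance GMMs, a fixed speech prior, and a plain Bayes posterior over the two classes. The function names, the `prior_speech` parameter, and the toy model parameters are all hypothetical and chosen only for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def speech_presence_probability(x, speech_gmm, noise_gmm, prior_speech=0.5):
    """Posterior P(speech | x) from the speech and noise GMM likelihoods."""
    log_s = math.log(prior_speech) + gmm_log_likelihood(x, speech_gmm)
    log_n = math.log(1.0 - prior_speech) + gmm_log_likelihood(x, noise_gmm)
    # Logistic form of Bayes' rule; the difference is clamped so exp() cannot overflow.
    diff = max(min(log_n - log_s, 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(diff))

# Toy, hand-picked parameters (assumed for illustration; not trained, not from the paper).
speech_gmm = [(0.6, [4.0, 1.0], [1.0, 1.0]), (0.4, [6.0, -1.0], [2.0, 1.5])]
noise_gmm = [(1.0, [0.0, 0.0], [1.0, 1.0])]

# One two-dimensional feature vector per frame.
frames = [[5.0, 0.5], [0.2, -0.1], [2.0, 0.5]]
spp = [speech_presence_probability(f, speech_gmm, noise_gmm) for f in frames]
```

In a pipeline of the kind the abstract outlines, a per-frame value such as `spp` would feed the later stages (RTF estimation, adaptive beamforming, post-filtering). How the paper actually uses it is not specified in this record.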
Pages: 1-5
Page count: 5
Related papers
50 in total
  • [21] Investigation into a Mel subspace based front-end processing for robust speech recognition
    Selouani, SA
    O'Shaughnessy, D
    Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004, : 187 - 190
  • [22] Feature enhancement for a bitstream-based front-end in wireless speech recognition
    Kim, HK
    Cox, RV
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 241 - 244
  • [23] SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS AS A FRONT END FOR ROBUST SPEECH RECOGNISER
    Lena, D. S. K.
    Vijayalakshmi, P.
    2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2017, : 430 - 435
  • [24] Automatic Speech Recognition with a Cochlear Implant Front-End
    Nogueira, Waldo
    Harczos, Tamas
    Edler, Bernd
    Ostermann, Joern
    Buechner, Andreas
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1993 - +
  • [25] A Front-End Technique for Automatic Noisy Speech Recognition
    Naing, Hay Mar Soe
    Hidayat, Risanuri
    Hartanto, Rudy
    Miyanaga, Yoshikazu
    PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 49 - 54
  • [26] Using Twin-HMM-Based Audio-Visual Speech Enhancement as a Front-End for Robust Audio-Visual Speech Recognition
    Abdelaziz, Ahmed Hussen
    Zeiler, Steffen
    Kolossa, Dorothea
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 867 - 871
  • [27] A Unified Front-end Anti-interference Approach for Robust Automatic Speech Recognition
    Liang, Yunming
    Zhou, Yi
    Ma, Yongbao
    Liu, Hongqing
    2019 IEEE 19TH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2019), 2019,
  • [28] Efficient Noise-Robust Speech Recognition Front-End Based on the ETSI Standard
    Neves, Claudio
    Veiga, Arlindo
    Sa, Luis
    Perdigao, Fernando
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 609 - 612
  • [29] A new approach to variable frame rate front-end processing for robust speech recognition
    Epps, J
    ISSPA 2005: The 8th International Symposium on Signal Processing and its Applications, Vols 1 and 2, Proceedings, 2005, : 723 - 726
  • [30] Front-end speech enhancement for commercial speaker verification systems
    Eskimez, Sefik Emre
    Soufleris, Peter
    Duan, Zhiyao
    Heinzelman, Wendi
    SPEECH COMMUNICATION, 2018, 99 : 101 - 113