A Front-End Speech Enhancement System for Robust Automotive Speech Recognition

Cited by: 0

Authors:

Wang, Haikun [1]

Ye, Zhongfu [1]

Chen, Jingdong [2]

Affiliations:

[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Anhui, Peoples R China

[2] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian 710072, Shaanxi, Peoples R China

Source:

2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2018

Keywords:

Speech enhancement; model-based; voice activity detection; microphone array; relative transfer function estimation; generalized sidelobe cancellation; speech recognition

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines model-based voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancellation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, we choose four typical driving scenarios, which cover most of the noise types found in automobiles, to record training data. The recorded data are then used to train Gaussian mixture models (GMMs) for both speech and noise. The trained GMMs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability then serves as the basic information for RTF estimation, adaptive beamforming, and post-filtering. Experiments are conducted in real automotive environments, and the results show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
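The frame-by-frame speech presence probability described in the abstract can be illustrated with a minimal sketch. The abstract does not give the features, the number of mixture components, or the prior, so everything below is an assumption made for illustration: diagonal-covariance GMMs, an equal speech/noise prior, one-dimensional toy features, and hand-picked parameters in place of trained ones. The function names `diag_gmm_loglik` and `speech_presence_probability` are hypothetical and are not taken from the paper.

```python
import math


def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(ll)
    # Log-sum-exp over the mixture components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def speech_presence_probability(frames, speech_gmm, noise_gmm, prior_speech=0.5):
    """Posterior probability of speech for each frame via Bayes' rule.

    Each GMM is a (weights, means, variances) tuple. The equal prior is an
    assumption of this sketch, not a value taken from the paper.
    """
    spp = []
    for x in frames:
        log_s = diag_gmm_loglik(x, *speech_gmm) + math.log(prior_speech)
        log_n = diag_gmm_loglik(x, *noise_gmm) + math.log(1.0 - prior_speech)
        # P(speech | x) = 1 / (1 + exp(log_n - log_s)); clamp to avoid overflow.
        d = max(-700.0, min(700.0, log_n - log_s))
        spp.append(1.0 / (1.0 + math.exp(d)))
    return spp


# Toy, hand-picked parameters standing in for trained models (assumption).
speech_gmm = ([0.5, 0.5], [[4.0], [6.0]], [[1.0], [1.0]])
noise_gmm = ([1.0], [[0.0]], [[1.0]])

frames = [[0.0], [5.0]]  # first frame looks like noise, second like speech
spp = speech_presence_probability(frames, speech_gmm, noise_gmm)
print(spp)  # first value close to 0, second value close to 1
```

In a system like the one described, a per-frame probability of this kind would then gate the later stages, for example by updating the RTF estimate mainly when speech is likely present and adapting the noise canceller mainly when it is not. How the paper actually uses the probability in each stage is not specified in the abstract.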
