An In-Car Speech Recognition System for Disabled Drivers

Cited by: 0

Authors:

Ivanecky, Jozef [1]

Mehlhase, Stephan [1]

Affiliation:

[1] European Media Lab, D-69118 Heidelberg, Germany

Source:

TEXT, SPEECH AND DIALOGUE, TSD 2012 | 2012 / Vol. 7499

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic Speech Recognition (ASR) is becoming standard in today's cars. However, ASR in cars is usually restricted to activities that do not directly influence the driving process. The voice-controlled functions can therefore be classified as comfort functions, e.g., controlling the air conditioning, the navigation and entertainment system, or even the driver's mobile phone. This usage of an ASR system could be extended in two directions. On the one hand, the speech recognition system could be used to control secondary functions in the car such as lights, windscreen wipers, or windows. On the other hand, the comfort functions could be enriched with services such as weather inquiries, SMS dictation, or online traffic information. These extensions require a different approach from the one employed today. Controlling secondary functions in the car by voice demands a very reliable, real-time, local ASR. At the same time, a large-vocabulary ASR system is required for comfort functions such as dictation of messages. In this paper, we describe our efforts towards a hybrid speech recognition system to control secondary functions in the car. We also provide extended comfort functionality to the driver. The hybrid speech recognition system contains a fast, grammar-based, embedded recognizer and a remote, server-based, large-vocabulary ASR system built on a language model (LM). We analyze different aspects of such a design and its integration into a car. The main focus of the paper is on maximizing the reliability of the embedded recognizer and on designing an algorithm for switching dynamically between the embedded recognizer and the server-based ASR system.


Pages: 505-512

Page count: 8

