An In-Car Speech Recognition System for Disabled Drivers

Citations: 0
Authors
Ivanecky, Jozef [1]
Mehlhase, Stephan [1]
Affiliations
[1] European Media Lab, D-69118 Heidelberg, Germany
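Source
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract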
Automatic Speech Recognition (ASR) is becoming standard in today's cars. However, ASR in cars is usually restricted to activities that do not directly influence the driving process. The voice-controlled functions can therefore be classified as comfort functions, e.g. controlling the air conditioning, the navigation and entertainment system, or even the driver's mobile phone. This usage of an ASR system could be extended in two directions. On the one hand, the speech recognition system could be used to control secondary functions in the car such as lights, windscreen wipers or windows. On the other hand, the comfort functions could be enriched by services such as weather inquiries, SMS dictation or online traffic information. These extensions require a different approach from the one employed today. Controlling secondary functions in the car by voice demands a very reliable, real-time, local ASR system. At the same time, a large-vocabulary ASR system is required for comfort functions such as dictation of messages. In this paper, we describe our efforts towards a hybrid speech recognition system that controls secondary functions in the car and also provides extended comfort functionality to the driver. The hybrid speech recognition system contains a fast, grammar-based, embedded recognizer and a remote, server-based, language-model-based (LM-based), large-vocabulary ASR system. We analyze different aspects of such a design and its integration into a car. The main focus of the paper is on maximizing the reliability of the embedded recognizer and on designing an algorithm for switching dynamically between the embedded recognizer and the server-based ASR system.
Pages: 505-512
Page count: 8
Related papers
50 in total
  • [41] SPEECH RECOGNITION AND CONTROL-SYSTEM FOR THE SEVERELY DISABLED
    COHEN, A
    GRAUPE, D
    JOURNAL OF BIOMEDICAL ENGINEERING, 1980, 2 (02): : 97 - 107
  • [42] A Flexible Speech Recognition System for Cerebral Palsy Disabled
    Jamil, Mohd Hafidz Mohamad
    Al-Haddad, S. A. R.
    Ng, Chee Kyun
    INFORMATICS ENGINEERING AND INFORMATION SCIENCE, PT I, 2011, 251 : 42 - 55
  • [43] Robust Log-Energy Estimation and its Dynamic Change Enhancement for In-car Speech Recognition
    Li, Weifeng
    Wang, Longbiao
    Zhou, Yicong
    Bourlard, Hervé
    Liao, Qingmin
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (08): : 1689 - 1698
  • [44] On the joint use of noise reduction and MLLR adaptation for in-car hands-free speech recognition
    Matassoni, M
    Omologo, M
    Santarelli, A
    Svaizer, P
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 289 - 292
  • [45] Automatic music genre recognition for in-car infotainment
    Jakubec, Maros
    Chmulik, Michal
    13TH INTERNATIONAL SCIENTIFIC CONFERENCE ON SUSTAINABLE, MODERN AND SAFE TRANSPORT (TRANSCOM 2019), 2019, 40 : 1364 - 1371
  • [46] Improved traffic sign recognition for in-car cameras
    Lin, Huei-Yung
    Chang, Chin-Chen
    Van Luan Tran
    Shi, Jian-He
    JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 2020, 43 (03) : 300 - 307
  • [47] Decision fusion techniques for in-car driver recognition
    Eskil, Taner
    Erdogan, Hakan
    Ercil, Aytul
    Ozyagci, Ali Nazmi
    Rodoper, Mete
    2006 IEEE 14TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS, VOLS 1 AND 2, 2006, : 738 - +
  • [48] Characterizing in-car conversational speech of different dialogue modes
    Fujimura, Hiroshi
    Miyajima, Chiyomi
    Kawaguchi, Nobuo
    Itou, Katsunobu
    Takeda, Kazuya
    Itakura, Fumitada
    ICICIC 2006: FIRST INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING, INFORMATION AND CONTROL, VOL 2, PROCEEDINGS, 2006, : 552 - +
  • [49] CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition
    Dai, Wenliang
    Cahyawijaya, Samuel
    Yu, Tiezheng
    Barezi, Elham J.
    Xu, Peng
    Yiu, Cheuk Tung Shadow
    Frieske, Rita
    Lovenia, Holy
    Winata, Genta Indra
    Chen, Qifeng
    Ma, Xiaojuan
    Shi, Bertram E.
    Fung, Pascale
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6786 - 6793
  • [50] Adaptive log-spectral regression for in-car speech recognition using multiple distributed microphones
    Li, WF
    Takeda, K
    Itakura, F
    IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (04) : 340 - 343