Multilingual phone recognition of spontaneous telephone speech

Cited by: 0
Authors
Corredor-Ardoy, C [1]
Lamel, L [1]
Adda-Decker, M [1]
Gauvain, JL [1]
Affiliation
[1] BOUYGUES TELECOM, F-78944 Velizy, France
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper we report on experiments with phone recognition of spontaneous telephone speech. Phone recognizers were trained and assessed on IDEAL, a multilingual corpus containing telephone speech in French, British English, German and Castilian Spanish. We investigated the influence of the composition of the training material (size and linguistic content) on recognition performance, using context-independent (CI) Hidden Markov Models and phonotactic bigram models. We found that when testing on spontaneous speech data, training only on spontaneous speech gave the highest phone accuracies for all four languages, even though this data comprises only 14% of the available training data. The use of context-dependent (CD) HMMs reduced the phone error across the four languages, with the average error reduced to 51.9% from the 57.4% obtained with CI models. We suggest a straightforward way of detecting non-speech phenomena. The basic idea is to remove sequences of consonants between two silence labels from the recognized phone strings prior to scoring. This simple technique yields a relative reduction of 5.4% in the average phone error rate. The lowest phone error with CD models and filtering was obtained for Spanish (39.1%), with the four-language average being 49.1%.
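The filtering step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the silence label `sil`, the toy consonant inventory, and the function names are assumptions made for illustration, since the abstract does not specify the phone sets or label names. The sketch drops any run of phones that consists only of consonants and is enclosed by two silence labels, and leaves everything else untouched. A small helper also checks that a drop in average error from 51.9% to 49.1% corresponds to the reported relative reduction of about 5.4%.

```python
from typing import Iterable, List, Set

# Assumed label and inventory; the abstract does not give the real ones.
SILENCE = "sil"
CONSONANTS: Set[str] = {"p", "t", "k", "b", "d", "g", "f", "s", "z", "m", "n", "l", "r"}


def filter_consonant_runs(
    phones: Iterable[str],
    consonants: Set[str] = CONSONANTS,
    silence: str = SILENCE,
) -> List[str]:
    """Remove consonant-only runs that lie between two silence labels."""
    out: List[str] = []
    pending: List[str] = []   # phones seen since the last silence label
    seen_silence = False      # True once a left-hand silence exists
    for phone in phones:
        if phone == silence:
            enclosed = seen_silence and bool(pending)
            if enclosed and all(p in consonants for p in pending):
                pass          # consonant-only run between silences: drop it
            else:
                out.extend(pending)
            pending = []
            out.append(phone)
            seen_silence = True
        else:
            pending.append(phone)
    out.extend(pending)       # trailing phones have no closing silence: keep
    return out


def relative_reduction(before: float, after: float) -> float:
    """Relative reduction in percent when an error rate drops from before to after."""
    return (before - after) / before * 100.0


hyp = ["sil", "a", "b", "sil", "t", "k", "sil", "o", "sil"]
print(filter_consonant_runs(hyp))
print(round(relative_reduction(51.9, 49.1), 1))
```

The sketch keeps both silence labels around a removed run, so adjacent silences may remain in the output; whether these should be merged before scoring is not stated in the abstract.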
Pages: 413-416
Page count: 4