Turbo Processing for Speech Recognition

Cited by: 2
Authors
Moon, Todd K. [1, 2]
Gunther, Jacob H. [1, 2]
Broadus, Cortnie [3]
Hou, Wendy [4]
Nelson, Nils [3]
Affiliations
[1] Utah State Univ, Informat Dynam Lab, Logan, UT 84322 USA
[2] Utah State Univ, Dept Elect & Comp Engn, Logan, UT 84322 USA
[3] Utah State Univ, Dept Math, Logan, UT 84322 USA
[4] Yale Univ, Dept Math, New Haven, CT 06511 USA
Keywords
Human-machine interface; speech processing; turbo processing; HIDDEN MARKOV-MODELS; MAXIMIZATION;
DOI
10.1109/TCYB.2013.2247593
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Speech recognition is a classic example of a human/machine interface, typifying many of the difficulties and opportunities of human/machine interaction. In this paper, speech recognition is used as an example of applying turbo processing principles to the general problem of human/machine interface. Speech recognizers frequently involve a model representing phonemic information at a local level, followed by a language model representing information at a nonlocal level. This structure is analogous to the local (e.g., equalizer) and nonlocal (e.g., error correction decoding) elements common in digital communications. Drawing from the analogy of turbo processing for digital communications, turbo speech processing iteratively feeds back the output of the language model to be used as prior probabilities for the phonemic model. This analogy is developed here, and the performance of this turbo model is characterized by using an artificial language model. Using turbo processing, the relative error rate improves significantly, especially in high-noise settings.
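The feedback loop described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the phonemic model is assumed here to be a three-phoneme Markov chain handled by forward-backward, the language model is a hypothetical two-word vocabulary, and every name and number (`phonemic_posteriors`, `language_model_extrinsic`, `turbo_decode`, `lik`, `trans`, `vocab`, `FLOOR`) is invented for illustration. Each pass divides out the incoming prior so that only extrinsic information is exchanged, as in turbo decoding for digital communications.

```python
import numpy as np

FLOOR = 1e-12  # assumed floor that keeps fed-back priors strictly positive


def normalize(x):
    """Scale each row (one time step) so that it sums to one."""
    return x / x.sum(axis=1, keepdims=True)


def phonemic_posteriors(lik, prior, trans):
    """Local model: forward-backward over a phoneme chain with per-step priors."""
    T, K = lik.shape
    local = lik * prior
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = local[0] / local[0].sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ trans) * local[t]
        alpha[t] = a / a.sum()
    for t in range(T - 2, -1, -1):
        b = trans @ (local[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()
    return normalize(alpha * beta)


def language_model_extrinsic(ext, vocab, word_prior):
    """Nonlocal model: score whole words, then return per-step extrinsic priors."""
    T, K = ext.shape
    steps = np.arange(T)
    scores = np.array([p * np.prod(ext[steps, w]) for w, p in zip(vocab, word_prior)])
    word_post = scores / scores.sum()
    marg = np.zeros((T, K))
    for w, p in zip(vocab, word_post):
        marg[steps, w] += p
    # Divide out what the phonemic model already contributed at each step.
    return normalize(np.maximum(marg / ext, FLOOR))


def turbo_decode(lik, trans, vocab, word_prior, n_iter=3):
    """Alternate the two models, feeding language-model output back as priors."""
    T, K = lik.shape
    prior = np.full((T, K), 1.0 / K)
    for _ in range(n_iter):
        post = phonemic_posteriors(lik, prior, trans)
        ext = normalize(post / prior)  # extrinsic part of the phonemic output
        prior = language_model_extrinsic(ext, vocab, word_prior)
    return phonemic_posteriors(lik, prior, trans)


# Toy data (assumed): three time steps, three phonemes, two allowed words.
lik = np.array([[0.80, 0.10, 0.10],
                [0.20, 0.60, 0.20],
                [0.45, 0.15, 0.40]])  # the last step is acoustically ambiguous
trans = np.array([[0.2, 0.4, 0.4],
                  [0.4, 0.2, 0.4],
                  [0.4, 0.4, 0.2]])   # mildly discourages repeated phonemes
vocab = np.array([[0, 1, 2], [2, 1, 0]])
word_prior = np.array([0.5, 0.5])

local_only = phonemic_posteriors(lik, np.full(lik.shape, 1.0 / 3), trans)
turbo = turbo_decode(lik, trans, vocab, word_prior)
print("local decision:", local_only.argmax(axis=1))
print("turbo decision:", turbo.argmax(axis=1))
```

In this toy setting the phonemic model alone picks a phoneme sequence that is not a word in the vocabulary, while the fed-back priors pull the final decision onto an allowed word. The size of any such gain in a real recognizer depends on the acoustic and language models actually used.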
Pages: 83-91
Number of pages: 9
Related papers
50 in total
  • [21] Microphone Array Processing for Distant Speech Recognition
    Kumatani, Kenichi
    McDonough, John
    Raj, Bhiksha
    IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 127 - 140
  • [22] Speech recognition using Quantum signal processing
    Karthikeyan, S.
    Sasikumar, S.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2006, 1 : 290 - 294
  • [23] PROCESSING UNKNOWN WORDS IN CONTINUOUS SPEECH RECOGNITION
    KITA, K
    EHARA, T
    MORIMOTO, T
    IEICE TRANSACTIONS ON COMMUNICATIONS ELECTRONICS INFORMATION AND SYSTEMS, 1991, 74 (07) : 1811 - 1816
  • [24] BINAURAL PROCESSING FOR ROBUST RECOGNITION OF DEGRADED SPEECH
    Menon, Anjali
    Kim, Chanwoo
    Kurokawa, Umpei
    Stern, Richard M.
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 24 - 31
  • [25] Wavelet processing for speech and acoustic patterns recognition
    Lentz, M
    Tate, C
    Ludu, A
    6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL IX, PROCEEDINGS: IMAGE, ACOUSTIC, SPEECH AND SIGNAL PROCESSING II, 2002, : 354 - 360
  • [26] Parallel Processing Capabilities in the Process of Speech Recognition
    Fazliddinovich, Rakhimov Mekhriddin
    Abdumurodovich, Berdanov Ulug'bek
    2017 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND COMMUNICATIONS TECHNOLOGIES (ICISCT) - APPLICATIONS, TRENDS AND OPPORTUNITIES, 2017,
  • [27] Study on the integration of speech and language processing in recognition of Chinese continuous speech
    Zhao, L.
    Zhou, C.R.
    Wu, Z.Y.
    Shengxue Xuebao/Acta Acustica, 2001, 26 (01) : 73 - 78
  • [28] NIST speech processing evaluations: LVCSR, speaker recognition, language recognition
    Martin, Alvin F.
    Garofolo, John S.
    2007 IEEE WORKSHOP ON SIGNAL PROCESSING APPLICATIONS FOR PUBLIC SECURITY AND FORENSICS, 2007, : 32 - +
  • [29] Robust Speech Recognition Based on Binaural Auditory Processing
    Menon, Anjali
    Kim, Chanwoo
    Stern, Richard M.
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3872 - 3876
  • [30] Factorial Speech Processing Models for Noise-Robust Automatic Speech Recognition
    Khademian, Mahdi
    Homayounpour, Mohammad Mehdi
    2015 23RD IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2015, : 637 - 642