TRANSCRIPTION OF MULTI-GENRE MEDIA ARCHIVES USING OUT-OF-DOMAIN DATA

Authors
Bell, P. J. [1]
Gales, M. J. F.
Lanchantin, P.
Liu, X.
Long, Y.
Renals, S. [1]
Swietojanski, P. [1]
Woodland, P. C.
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
speech recognition; tandem; cross-domain adaptation; media archives
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.
Pages: 324-329
Page count: 6