TRANSCRIPTION OF MULTI-GENRE MEDIA ARCHIVES USING OUT-OF-DOMAIN DATA

Authors
Bell, P. J. [1]
Gales, M. J. F.
Lanchantin, P.
Liu, X.
Long, Y.
Renals, S. [1]
Swietojanski, P. [1]
Woodland, P. C.
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
speech recognition; tandem; cross-domain adaptation; media archives;
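DOI
Not available
Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract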
We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.
Pages: 324 - 329
Number of pages: 6