TRANSCRIPTION OF MULTI-GENRE MEDIA ARCHIVES USING OUT-OF-DOMAIN DATA

Cited by: 0

Authors:

Bell, P. J. ^{[1]}

Gales, M. J. F.

Lanchantin, P.

Liu, X.

Long, Y.

Renals, S. ^{[1]}

Swietojanski, P. ^{[1]}

Woodland, P. C.

Affiliations:

[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland

Source:

2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012) | 2012

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

speech recognition; tandem; cross-domain adaptation; media archives;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808; 0809;

Abstract:

We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.

Pages: 324-329

Page count: 6
