LOW-RESOURCE CONTEXTUAL TOPIC IDENTIFICATION ON SPEECH

Cited by: 0

Authors:

Liu, Chunxi [1]

Wiesner, Matthew [1]

Watanabe, Shinji [1]

Harman, Craig [1]

Trmal, Jan [1,2]

Dehak, Najim [1]

Khudanpur, Sanjeev [1,2]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018) | 2018

Keywords:

Topic identification; universal acoustic modeling; recurrent neural networks; attention

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified. We first present a general purpose method for topic ID on spoken segments in low-resource languages, using a cascade of universal acoustic modeling, translation lexicons to English, and English-language topic classification. Next, instead of classifying each segment independently, we demonstrate that exploring the contextual dependencies across sequential segments can provide large improvements. In particular, we propose an attention-based contextual model which is able to leverage the contexts in a selective manner. We test both our contextual and non-contextual models on four LORELEI languages, and on all but one our attention-based contextual model significantly outperforms the context-independent models.

Pages: 656-663

Number of pages: 8

50 records in total

[31] Low-Resource Speech Synthesis with Speaker-Aware Embedding
Yang, Li-Jen
Yeh, I-Ping
Chien, Jen-Tzung
2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 235 - 239
[32] Convolutional Maxout Neural Networks for Low-Resource Speech Recognition
Cai, Meng
Shi, Yongzhe
Kang, Jian
Liu, Jia
Su, Tengrong
2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 133 - +
[33] Low-resource Sinhala Speech Recognition using Deep Learning
Karunathilaka, Hirunika
Welgama, Viraj
Nadungodage, Thilini
Weerasinghe, Ruvan
2020 20TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER-2020), 2020, : 196 - 201
[34] DISTRIBUTION AUGMENTATION FOR LOW-RESOURCE EXPRESSIVE TEXT-TO-SPEECH
Lajszczak, Mateusz
Prasad, Animesh
van Korlaar, Arent
Bollepalli, Bajibabu
Bonafonte, Antonio
Joly, Arnaud
Nicolis, Marco
Moinet, Alexis
Drugman, Thomas
Wood, Trevor
Sokolova, Elena
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8307 - 8311
[35] AlloST: Low-resource Speech Translation without Source Transcription
Cheng, Yao-Fei
Lee, Hung-Shin
Wang, Hsin-Min
INTERSPEECH 2021, 2021, : 2252 - 2256
[36] MIXSPEECH: DATA AUGMENTATION FOR LOW-RESOURCE AUTOMATIC SPEECH RECOGNITION
Meng, Linghui
Xu, Jin
Tan, Xu
Wang, Jindong
Qin, Tao
Xu, Bo
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7008 - 7012
[37] Multilingual acoustic models for speech recognition in low-resource devices
Garcia, Enrique Gil
Mengusoglu, Erhan
Janke, Eric
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 981 - +
[38] Language fusion via adapters for low-resource speech recognition
Hu, Qing
Zhang, Yan
Zhang, Xianlei
Han, Zongyu
Liang, Xiuxia
SPEECH COMMUNICATION, 2024, 158
[39] Weighted Gradient Pretrain for Low-Resource Speech Emotion Recognition
Xie, Yue
Liang, Ruiyu
Zhao, Xiaoyan
Liang, Zhenlin
Du, Jing
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (07) : 1352 - 1355
[40] Meta adversarial learning improves low-resource speech recognition
Chen, Yaqi
Yang, Xukui
Zhang, Hao
Zhang, Wenlin
Qu, Dan
Chen, Cong
COMPUTER SPEECH AND LANGUAGE, 2024, 84