LOW-RESOURCE CONTEXTUAL TOPIC IDENTIFICATION ON SPEECH

Cited by: 0
Authors
Liu, Chunxi [1]
Wiesner, Matthew [1]
Watanabe, Shinji [1]
Harman, Craig [1]
Trmal, Jan [1, 2]
Dehak, Najim [1]
Khudanpur, Sanjeev [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Keywords
Topic identification; universal acoustic modeling; recurrent neural networks; attention
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified. We first present a general purpose method for topic ID on spoken segments in low-resource languages, using a cascade of universal acoustic modeling, translation lexicons to English, and English-language topic classification. Next, instead of classifying each segment independently, we demonstrate that exploring the contextual dependencies across sequential segments can provide large improvements. In particular, we propose an attention-based contextual model which is able to leverage the contexts in a selective manner. We test both our contextual and non-contextual models on four LORELEI languages, and on all but one our attention-based contextual model significantly outperforms the context-independent models.
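The selective use of context described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the record gives no architectural details beyond the keywords, so the code below assumes each segment is already represented as a fixed-length vector and uses plain dot-product attention with a softmax. The names `softmax` and `attend_context`, and the toy vectors, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_context(target, contexts):
    """Weight neighboring segment vectors by similarity to the target segment.

    Assumption: dot-product scoring stands in for whatever learned
    scoring function a real contextual model would use.
    Returns (attention weights, weighted sum of the context vectors).
    """
    scores = [sum(t * c for t, c in zip(target, ctx)) for ctx in contexts]
    weights = softmax(scores)
    dim = len(target)
    summary = [
        sum(w * ctx[i] for w, ctx in zip(weights, contexts))
        for i in range(dim)
    ]
    return weights, summary


# Toy example: one target segment and three neighboring segments.
target = [1.0, 0.0]
contexts = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, summary = attend_context(target, contexts)
```

In this toy case the neighbor most similar to the target receives the largest weight, which is the sense in which context is used selectively; a classifier would then consume the target vector together with `summary`.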
Pages: 656 - 663
Page count: 8
Related papers
50 in total
  • [41] LOW-RESOURCE SYSTEM FOR ALL-DIGITAL SPEECH SYNTHESIS
    HERMAN, G
    DUQUET, RT
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1977, 61 : S68 - S68
  • [42] Tackling Hate Speech in Low-resource Languages with Context Experts
    Nkemelu, Daniel
    Shah, Harshil
    Essa, Irfan
    Best, Michael L.
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES AND DEVELOPMENT, ICTD 2022, 2022
  • [43] STOCHASTIC POOLING MAXOUT NETWORKS FOR LOW-RESOURCE SPEECH RECOGNITION
    Cai, Meng
    Shi, Yongzhe
    Liu, Jia
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [44] EXPLORING EFFECTIVE DATA UTILIZATION FOR LOW-RESOURCE SPEECH RECOGNITION
    Zhou, Zhikai
    Wang, Wei
    Zhang, Wangyou
    Qian, Yanmin
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8192 - 8196
  • [45] META-LEARNING FOR LOW-RESOURCE SPEECH EMOTION RECOGNITION
    Chopra, Suransh
    Mathur, Puneet
    Sawhney, Ramit
    Shah, Rajiv Ratn
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6259 - 6263
  • [46] Acoustic Modeling for Hindi Speech Recognition in Low-Resource Settings
    Dey, Anik
    Zhang, Weibin
    Fung, Pascale
    2014 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), VOLS 1-2, 2014, : 891 - 894
  • [47] MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition
    Xie, Jiamin
    Hansen, John H. L.
    INTERSPEECH 2023, 2023, : 1304 - 1308
  • [48] Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition
    Xiao, Yubei
    Gong, Ke
    Zhou, Pan
    Zheng, Guolin
    Liang, Xiaodan
    Lin, Liang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14112 - 14120
  • [49] Review of Speech Synthesis Methods Under Low-Resource Condition
    Jialin, Zhang
    Wushouer, Mairidan
    Tuerhong, Gulanbaier
    Computer Engineering and Applications, 2023, 59 (15) : 1 - 16
  • [50] ANALYSIS OF X-VECTORS FOR LOW-RESOURCE SPEECH RECOGNITION
    Karafiat, Martin
    Vesely, Karel
    Cernocky, Jan Honza
    Profant, Jan
    Nytra, Jiri
    Hlavacek, Miroslav
    Pavlicek, Tomas
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6998 - 7002