Exploiting Context-Dependency and Acoustic Resolution of Universal Speech Attribute Models in Spoken Language Recognition

Cited: 0
Authors
Siniscalchi, Sabato Marco [1 ]
Reed, Jeremy [2 ]
Svendsen, Torbjorn [3 ]
Lee, Chin-Hui [2 ]
Affiliations
[1] Univ Enna Kore, Dept Telemat, Enna, Italy
[2] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA USA
[3] NTNU, Dept Elect & Telecommun, Trondheim, Norway
Keywords
language identification; latent semantic analysis;
DOI
Not available
CLC Classification
TM [electrical engineering]; TN [electronic and communication technology];
Discipline Codes
0808 ; 0809 ;
Abstract
This paper expands a previously proposed universal acoustic characterization approach to spoken language identification (LID) by studying different ways of modeling attributes to improve language recognition. The motivation is to describe any spoken language with a common set of fundamental units. A spoken utterance is first tokenized into a sequence of universal attributes; a vector space modeling approach then delivers the final LID decision. Context-dependent attribute models are introduced to better capture spectral and temporal characteristics, and an approach to expanding the attribute set to increase acoustic resolution is also studied. Our experiments show that tokenization accuracy positively affects LID results, yielding a 2.8% absolute improvement over our previous performance on the 30-second NIST 2003 task. This result also compares favorably with the best results known to the authors on the same task when the tokenizers are trained on language-dependent OGI-TS data.
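The pipeline described in the abstract (tokenize an utterance into universal attribute labels, then score the resulting token sequence in a vector space model) can be illustrated with a minimal sketch. This is not the authors' implementation: the attribute labels, the bigram term vectors, and the per-language centroids below are illustrative assumptions standing in for the paper's trained tokenizers and vector space model.

```python
from collections import Counter
import math

def attribute_ngrams(tokens, n=2):
    """Turn an attribute-token sequence into n-gram counts (the vector space 'terms')."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def identify(utterance_tokens, language_centroids, n=2):
    """Return the language whose centroid vector best matches the utterance."""
    query = attribute_ngrams(utterance_tokens, n)
    return max(language_centroids, key=lambda lang: cosine(query, language_centroids[lang]))

# Toy centroids built from hypothetical attribute sequences (manner-of-articulation classes).
centroids = {
    "en": attribute_ngrams(["stop", "vowel", "fricative", "vowel", "nasal"]),
    "no": attribute_ngrams(["fricative", "vowel", "stop", "stop", "vowel"]),
}
print(identify(["stop", "vowel", "fricative", "vowel"], centroids))  # → en
```

In the paper the tokenizer output comes from context-dependent attribute models and the vector space scoring uses latent semantic analysis rather than raw bigram cosine, but the shape of the decision (token sequence → term vector → nearest language model) is the same.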
Pages: 2726 / +
Page count: 2