Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks

Cited by: 88
Authors
Zazo, Ruben [1]
Lozano-Diez, Alicia [1]
Gonzalez-Dominguez, Javier [1]
Toledano, Doroteo T. [1]
Gonzalez-Rodriguez, Joaquin [1]
Affiliations
[1] Univ Autonoma Madrid, ATVS Biometr Recognit Grp, Madrid, Spain
Source
PLOS ONE | 2016 / Vol. 11 / Issue 01
Keywords
SPEAKER;
DOI
10.1371/journal.pone.0146917
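Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code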
07 ; 0710 ; 09 ;
Abstract
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vector and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (approximately 3 s). In this contribution we present an open-source, end-to-end LSTM RNN system running on limited computational resources (a single GPU) that outperforms a reference i-vector system on a subset of the NIST Language Recognition Evaluation (8 target languages, 3 s task) by up to 26%. This result is in line with previously published research using proprietary LSTM implementations and huge computational resources, which made those earlier results hard to reproduce. Further, we extend those previous experiments by modeling unseen languages (out-of-set, OOS, modeling), which is crucial in real applications. Results show that an LSTM RNN with OOS modeling is able to detect these languages and generalizes robustly to unseen OOS languages. Finally, we also analyze the effect of even more limited test data (from 2.25 s to 0.1 s), showing that with as little as 0.5 s an accuracy of over 50% can be achieved.
Pages: 17