A review into deep learning techniques for spoken language identification

Authors
Irshad Ahmad Thukroo
Rumaan Bashir
Kaiser J. Giri
Affiliations
[1] Islamic University of Science & Technology, Department of Computer Science
Source
Multimedia Tools and Applications, 2022, 81 (22): 32593-32624
Keywords
Spoken language identification; Gaussian mixture model; Support vector machine; Hidden Markov model; Deep neural networks; Artificial neural network; Feed-forward neural network; Recurrent neural network; Convolutional neural network; Ensemble learning; Hybridization approaches; Mel frequency cepstral coefficient features
DOI
Not available
Abstract
Over the past couple of decades, information technology has opened new vistas, mostly to simplify day-to-day human life. One of its key contributions is the application of artificial intelligence to achieve better results. The advent of artificial intelligence has given rise to computational linguistics, a branch of natural language processing (NLP) that builds frameworks for intelligently manipulating spoken language knowledge and has brought human-machine interaction to a new stage. In this context, speech, our basic and generally most preferred mode of communication, has become one of the essential forms of interface. Language identification, as the front end for various natural language processing tasks, plays an important role in language translation. Owing to this, attention has been given to the area of speech recognition concerned with the identification and recognition of languages by a machine. Spoken language identification is the identification of the language present in a speech segment regardless of its size (duration and speed), ambiance (topic and emotion), and speaker (gender, age, demographic region). This paper investigates existing spoken language identification models implemented with different deep learning approaches, along with the datasets and performance measures used to analyze them. It also highlights the main features of these models and the challenges they face. A comprehensive comparative study of deep learning techniques for spoken language identification is carried out. Moreover, this review analyzes the efficiency of spoken language identification models, which can help researchers propose new language identification models for speech signals.
Pages: 32593-32624
Page count: 31
Related papers
50 in total
  • [1] A review into deep learning techniques for spoken language identification
    Thukroo, Irshad Ahmad
    Bashir, Rumaan
    Giri, Kaiser J.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (22) : 32593 - 32624
  • [2] Spoken Language Identification for Native Indian Languages Using Deep Learning Techniques
    Kulkarni, Rushikesh
    Joshi, Aditi
    Kamble, Milind
    Apte, Shaila
    MACHINE LEARNING AND AUTONOMOUS SYSTEMS, 2022, 269 : 75 - 97
  • [3] Spoken Language Identification Using Deep Learning
    Singh, Gundeep
    Sharma, Sahil
    Kumar, Vijay
    Kaur, Manjit
    Baz, Mohammed
    Masud, Mehedi
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [4] Comparative Study on Spoken Language Identification Based on Deep Learning
    Heracleous, Panikos
    Takai, Kohichi
    Yasuda, Keiji
    Mohammad, Yasser
    Yoneyama, Akio
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018: 2265 - 2269
  • [5] Deep Bottleneck Features for Spoken Language Identification
    Jiang, Bing
    Song, Yan
    Wei, Si
    Liu, Jun-Hua
    McLoughlin, Ian Vince
    Dai, Li-Rong
    PLOS ONE, 2014, 9 (07)
  • [6] FuzzyGCP: A deep learning architecture for automatic spoken language identification from speech signals
    Garain, Avishek
    Singh, Pawan Kumar
    Sarkar, Ram
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 168
  • [7] Deep learning for spoken language identification: Can we visualize speech signal patterns?
    Mukherjee, Himadri
    Ghosh, Subhankar
    Sen, Shibaprasad
    Obaidullah, Sk Md
    Santosh, K. C.
    Phadikar, Santanu
    Roy, Kaushik
    NEURAL COMPUTING & APPLICATIONS, 2019, 31 (12) : 8483 - 8501
  • [9] DEEP MULTIMODAL LEARNING FOR EMOTION RECOGNITION IN SPOKEN LANGUAGE
    Gu, Yue
    Chen, Shuhong
    Marsic, Ivan
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5079 - 5083
  • [10] Incorporating Uncertainty into Deep Learning for Spoken Language Assessment
    Malinin, Andrey
    Ragni, Anton
    Knill, Kate M.
    Gales, Mark J. F.
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 2, 2017: 45 - 50