Chinese dialect speech recognition: a comprehensive survey

Cited by: 1
Authors
Li, Qiang [1]
Mai, Qianyu [1]
Wang, Mandou [1]
Ma, Mingjuan [2]
Affiliations
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan 750021, Peoples R China
[2] North Minzu Univ, Sch Econ, Yinchuan 750021, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Chinese dialect; Dialect corpus; Dialectal acoustic modeling; Automatic speech recognition; Deep neural network; End-to-end; EMOTION RECOGNITION; MODELS; IDENTIFICATION; FEATURES; CORPUS;
DOI
10.1007/s10462-023-10668-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a multi-ethnic country with a large population, China has many diverse dialects, which poses considerable challenges for speech recognition. Owing to geographical location, population migration, and other factors, the research progress and practical application of Chinese dialect speech recognition are currently at different stages. Exploring the significant regional heterogeneity in recognition approaches and their performance, in dialect corpora, and in other resources is therefore of vital importance for Chinese speech recognition. To this end, we first start from the regional classification of dialects and analyze their pivotal acoustic characteristics, including specific vowel and tone patterns. Second, we comprehensively summarize the existing dialect phonetic corpora in China, which helps in exploring general methods for constructing such corpora. Third, we expound on the general process of dialect recognition and summarize several critical recognition approaches in detail, especially the hybrid method that combines an Artificial Neural Network (ANN) with a Hidden Markov Model (HMM), as well as End-to-End (E2E) methods. Fourth, through an in-depth comparison of their principles, merits, disadvantages, and recognition performance on different dialects, we point out future development trends and challenges in dialect recognition. Finally, we collect and discuss some application examples of dialect speech recognition.
Pages: 39
Related papers
50 in total
  • [31] Speech Recognition of Moroccan Dialect Using Hidden Markov Models
    Mouaz, Bezoui
    Abderrahim, Beni Hssane
    Abdelmajid, Elmoutaouakkil
    10TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT 2019) / THE 2ND INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40 2019) / AFFILIATED WORKSHOPS, 2019, 151 : 985 - 991
  • [32] Arabic Speech Emotion Recognition From Saudi Dialect Corpus
    Aljuhani, Reem Hamed
    Alshutayri, Areej
    Alahdal, Shahd
    IEEE ACCESS, 2021, 9 : 127081 - 127085
  • [33] Automatic speech recognition for Moroccan dialect in noisy traffic environments
    Ezzine, Abderrahim
    Laaidi, Naouar
    Satori, Hassan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 141
  • [34] A Survey of Automatic Speech Recognition for Dysarthric Speech
    Qian, Zhaopeng
    Xiao, Kejing
    ELECTRONICS, 2023, 12 (20)
  • [35] A survey on automatic speech recognition
    Nakagawa, Seiichi
    IEICE Transactions on Information and Systems, 2002, E85-D (03) : 465 - 486
  • [36] Automatic speech recognition: a survey
    Malik, Mishaim
    Malik, Muhammad Kamran
    Mehmood, Khawar
    Makhdoom, Imran
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (06) : 9411 - 9457
  • [37] Automatic speech recognition: a survey
    Mishaim Malik
    Muhammad Kamran Malik
    Khawar Mehmood
    Imran Makhdoom
    Multimedia Tools and Applications, 2021, 80 : 9411 - 9457
  • [38] Domain Expansion for End-to-End Speech Recognition: Applications for Accent/Dialect Speech
    Ghorbani, Shahram
    Hansen, John H. L.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 762 - 774
  • [39] Scene recognition: A comprehensive survey
    Xie, Lin
    Lee, Feifei
    Liu, Li
    Kotani, Koji
    Chen, Qiu
    PATTERN RECOGNITION, 2020, 102
  • [40] The Chinese lexicon: a comprehensive survey
    Song, L
    BULLETIN OF THE SCHOOL OF ORIENTAL AND AFRICAN STUDIES-UNIVERSITY OF LONDON, 2002, 65 : 229 - 231