Chinese dialect speech recognition: a comprehensive survey

Cited by: 1
Authors
Li, Qiang [1]
Mai, Qianyu [1]
Wang, Mandou [1]
Ma, Mingjuan [2]
Affiliations
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan 750021, Peoples R China
[2] North Minzu Univ, Sch Econ, Yinchuan 750021, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Chinese dialect; Dialect corpus; Dialectal acoustic modeling; Automatic speech recognition; Deep neural network; End-to-end; EMOTION RECOGNITION; MODELS; IDENTIFICATION; FEATURES; CORPUS;
DOI
10.1007/s10462-023-10668-0
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a multi-ethnic country with a large population, China is endowed with diverse dialects, which bring considerable challenges to speech recognition work. In fact, due to geographical location, population migration, and other factors, the research progress and practical application of Chinese dialect speech recognition are currently at different stages. Therefore, exploring the significant regional heterogeneities in specific recognition approaches and effects, dialect corpora, and other resources is of vital importance for Chinese speech recognition work. Based on this, we first start with the regional classification of dialects and analyze the pivotal acoustic characteristics of dialects, including specific vowel and tone patterns. Secondly, we comprehensively summarize the existing dialect phonetic corpora in China, which is of some assistance in exploring general construction methods for dialect phonetic corpora. Moreover, we expound on the general process of dialect recognition. Several critical dialect recognition approaches are summarized and introduced in detail, especially the hybrid method of Artificial Neural Network (ANN) combined with the Hidden Markov Model (HMM), as well as the End-to-End (E2E) approach. Thirdly, through an in-depth comparison of their principles, merits, disadvantages, and recognition performance for different dialects, future development trends and challenges in dialect recognition are pointed out. Finally, some application examples of dialect speech recognition are collected and discussed.
Pages: 39
Related papers
50 in total
  • [41] The Chinese lexicon: A comprehensive survey
    Unknown
    FORUM FOR MODERN LANGUAGE STUDIES, 2007, 43 (04) : 480 - 480
  • [42] Thai Dialect Corpus and Transfer-based Curriculum Learning Investigation for Dialect Automatic Speech Recognition
    Suwanbandit, Artit
    Naowarat, Burin
    Sangpetch, Orathai
    Chuangsuwanich, Ekapol
    INTERSPEECH 2023, 2023, : 4069 - 4073
  • [43] Research on Tibetan Speech Recognition Based on the Am-do Dialect
    Khysru, Kuntharrgyal
    Wei, Jianguo
    Dang, Jianwu
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 73 (03) : 4897 - 4907
  • [44] An open speech resource for Tibetan multi-dialect and multitask recognition
    Zhao, Yue
    Xu, Xiaona
    Yue, Jianjian
    Song, Wei
    Li, Xiali
    Wu, Licheng
    Ji, Qiang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2020, 22 (2-3) : 297 - 304
  • [45] Emotion recognition from Moroccan dialect speech and energy band distribution
    Agrima, Abdellah
    Farchi, Abdelmajid
    Elmazouzi, Laila
    Mounir, Ilham
    Mounir, Badia
    2019 INTERNATIONAL CONFERENCE ON WIRELESS TECHNOLOGIES, EMBEDDED AND INTELLIGENT SYSTEMS (WITS), 2019,
  • [46] An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx
    Ouisaadane A.
    Safi S.
    Frikel M.
    International Journal of Speech Technology, 2024, 27 (02) : 329 - 339
  • [47] Global RNN Transducer Models For Multi-dialect Speech Recognition
    Fukuda, Takashi
    Thomas, Samuel
    Suzuki, Masayuki
    Kurata, Gakuto
    Saon, George
    Kingsbury, Brian
    INTERSPEECH 2022, 2022, : 3138 - 3142
  • [48] Improving Automatic Speech Recognition with Dialect-Specific Language Models
    Gothi, Raj
    Rao, Preeti
    SPEECH AND COMPUTER, SPECOM 2023, PT I, 2023, 14338 : 57 - 67
  • [49] Real-Time Continuous Tamil Dialect Speech Recognition and Summarization
    S. Saranya
    B. Bharathi
    S. Gomathy Dhanya
    Aishwarya Krishnakumar
    Circuits, Systems, and Signal Processing, 2025, 44 (4) : 2855 - 2881
  • [50] A Dialectal Chinese Speech Recognition Framework
    Jing Li
    Thomas Fang Zheng
    William Byrne
    Dan Jurafsky
    Journal of Computer Science and Technology, 2006, 21 : 106 - 115