A Survey of Grapheme-to-Phoneme Conversion Methods

Authors
Cheng, Shiyang [1]
Zhu, Pengcheng [2]
Liu, Jueting [2]
Wang, Zehua [3]
Affiliations
[1] China Univ Min & Technol, Sch Environm Sci & Spatial Informat, 1 Daxue Rd, Xuzhou 221000, Peoples R China
[2] China Univ Min & Technol, Sch Comp Sci & Technol, 1 Daxue Rd, Xuzhou 221000, Jiangsu, Peoples R China
[3] Univ British Columbia, Dept Elect & Comp Engn, 2332 Main Mall, Vancouver, BC V6T 1Z4, Canada
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 24
Keywords
grapheme-to-phoneme conversion; speech synthesis; machine learning; deep learning
DOI
10.3390/app142411790
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Grapheme-to-phoneme (G2P) conversion is the task of converting letters (grapheme sequences) into their pronunciations (phoneme sequences). It plays a crucial role in natural language processing, text-to-speech synthesis, and automatic speech recognition systems. This paper provides a systematic overview of G2P conversion from several perspectives. It first presents the conversion methods, with detailed discussion of those based on deep learning. For each method, the key ideas, advantages, disadvantages, and representative models are summarized. The paper then covers learning strategies and multilingual G2P conversion. Finally, it summarizes the commonly used monolingual and multilingual datasets, including Mandarin, Japanese, and Arabic, among others. Two tables present the performance of various methods on the relevant datasets. After this general overview of G2P conversion, the paper concludes with the current issues and future directions of deep learning-based G2P conversion.
Pages: 20