Northern Thai Dialect Text to Speech

Cited by: 0
Authors
Chao-angthong, Pannakorn [1]
Suchato, Atiwong [1]
Punyabukkana, Proadpran [1]
Affiliations
[1] Chulalongkorn Univ, Fac Engn, Dept Comp Engn, Spoken Language Syst Res Grp, Bangkok, Thailand
Keywords
Text to speech system; Grapheme to phoneme conversion; Northern Thai dialect; Speech corpus
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Each dialect of the Thai language has a distinct identity tied to its accent. Although these dialects share a common standard-language origin, conversation between native speakers of different dialects is unavoidable when visiting each region. The need to communicate with people who understand only the Northern Thai Dialect (NTD) motivated the development of the Northern Thai Dialect Text to Speech (NTD-TTS) system. The system follows the same concept as a translation program: given text input in the Central Thai Dialect (CTD), it translates the text and synthesizes speech output in NTD. The system is built on a TTS software structure in which two components were modified: the Grapheme to Phoneme (G2P) conversion and the speech models. The NTD-G2P conversion combines rule-based and dictionary-based approaches. It was evaluated on 100 sentences randomly selected from ORCHID. The NTD-G2P achieves a conversion accuracy of 83.19% at the syllable level and is used to build the NTD-corpus. A sentence selection is presented for training the NTD speech model. The chosen selection covers 95.32% in the first percentile of the phoneme distribution in the NTD-corpus. After the speech models were connected to the TTS system, the whole system was evaluated by native speakers using the Mean Opinion Score (MOS) and syllable-level comprehension. The NTD-MOS evaluations indicated that the accent, naturalness, and intelligibility of the synthetic speech ranged from "acceptable" to "good". On the test set, NTD native listeners gave MOS values of 3.73 for accent, 3.68 for naturalness, and 3.63 for intelligibility, with a comprehension percentage of 97.16%.
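As a rough illustration of the G2P step and its syllable-level accuracy metric, the sketch below combines a dictionary lookup with a rule fallback and scores the output against a reference transcription. It is not the authors' implementation: the dictionary-first ordering, the function names `g2p_convert` and `syllable_accuracy`, and the toy lexicon, rule function, and syllable labels are all hypothetical placeholders, and the metric assumes the predicted and reference syllable sequences are already aligned one to one.

```python
from typing import Callable, Dict, List


def g2p_convert(
    words: List[str],
    lexicon: Dict[str, List[str]],
    rule_fallback: Callable[[str], List[str]],
) -> List[str]:
    """Convert words to syllable-level phoneme strings.

    Assumption: the dictionary is consulted first and the rules handle
    only out-of-lexicon words. The source does not state the ordering.
    """
    syllables: List[str] = []
    for word in words:
        if word in lexicon:
            syllables.extend(lexicon[word])
        else:
            syllables.extend(rule_fallback(word))
    return syllables


def syllable_accuracy(predicted: List[str], reference: List[str]) -> float:
    """Percentage of position-aligned syllables that match the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    if len(predicted) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Hypothetical placeholder data, not real Thai words or phonemes.
TOY_LEXICON: Dict[str, List[str]] = {"w1": ["s1", "s2"], "w2": ["s3"]}


def toy_rules(word: str) -> List[str]:
    """Stand-in for a rule-based converter: one syllable per unknown word."""
    return [word.upper()]


predicted = g2p_convert(["w1", "w3", "w2"], TOY_LEXICON, toy_rules)
reference = ["s1", "s2", "s9", "s3"]
accuracy = syllable_accuracy(predicted, reference)
print(predicted, accuracy)
```

In this toy run three of the four syllables match, so the score is 75.0. A real evaluation would first need to align sequences whose syllable counts differ.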
Pages: 6
Related papers
50 in total
  • [11] State of the Art Review on Thai Text-to-Speech System
    Yimngam, Sukanya
    Premchaisawadi, Wichian
    Kreesuradej, Worapoj
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, 2008, : 194 - +
  • [12] A Parallel Corpus for Vietnamese Central-Northern Dialect Text Transfer
    Thanh Le
    Tuan, Luu Anh
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 13839 - 13855
  • [13] Cross-Dialect Adaptation Framework for Constructing Prosodic Models for Chinese Dialect Text-to-Speech Systems
    Chiang, Chen-Yu
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (01) : 108 - 121
  • [14] PaSCoNT - Parallel Speech Corpus of Northern-central Thai for automatic speech recognition
    Taerungruang, Supawat
    Taninpong, Phimphaka
    Chunwijitra, Vataya
    Thatphithakkul, Sumonmas
    Kasuriya, Sawit
    Inthanon, Viroj
    Paksaranuwat, Pawat
    Thumronglaohapun, Salinee
    Nakharutai, Nawapon
    Inkeaw, Papangkorn
    Bootkrajang, Jakramate
    COMPUTER SPEECH AND LANGUAGE, 2025, 89
  • [15] COLOMBIAN DIALECT RECOGNITION BASED ON INFORMATION EXTRACTED FROM SPEECH AND TEXT SIGNALS
    Escobar-Grisales, D.
    Rios-Urrego, C. D.
    Lopez-Santander, D. A.
    Gallo-Aristizabal, J. D.
    Vasquez-Correa, J. C.
    Noeth, E.
    Orozco-Arroyave, J. R.
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 556 - 563
  • [16] Thai syllable analysis for rule-based text to speech system
    Narupiyakul, L. (lalita.nar@mahidol.ac.th), 1600, IOS Press BV (253):
  • [17] End-to-End Thai Text-to-Speech with Linguistic Unit
    Wisetpaitoon, Kontawat
    Singkul, Sattaya
    Sakdejayont, Theerat
    Chalothorn, Tawunrat
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 951 - 959
  • [18] A stochastic knowledge-based Thai text-to-speech system
    Narupiyakul, L
    Khumya, A
    Sirinaovakul, B
    Cercone, N
    MATHEMATICAL AND COMPUTER MODELLING, 2005, 42 (1-2) : 1 - 16
  • [19] Implementing Thai Text-to-Speech Synthesis for Hand-held Devices
    Chinathimatmongkhon, Nipon
    Suchato, Atiwong
    Punyabukkana, Proadpran
    ECTI-CON 2008: PROCEEDINGS OF THE 2008 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2008, : 545 - 548
  • [20] Dialect speech and wages
    Yao, Yuxin
    van Ours, Jan C.
    ECONOMICS LETTERS, 2019, 177 : 35 - 38