A RESEARCH BED FOR UNIT SELECTION BASED TEXT TO SPEECH SYNTHESIS

Cited by: 0
Authors
Sarathy, K. Partha [1]
Ramakrishnan, A. G. [2]
Affiliations
[1] Ctr Dev Telemat, Bangalore 560100, Karnataka, India
[2] Indian Inst Sci, Dept Elect Engn, Bangalore 560100, Karnataka, India
Keywords
speech synthesis; speech codecs; intelligibility; naturalness; perception
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a modular, unit-selection-based text-to-speech (TTS) framework that can serve as a research bed for developing TTS in any new language and for studying the effect of changing any parameter during synthesis. Using this framework, a TTS system has been developed for Tamil, with a synthesis database of 1027 phonetically rich prerecorded sentences. The framework has already been tested for Kannada. The TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized for embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation, and we propose exploiting this misperception to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with the phones they are confused with.
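The idea of substituting a perceptually confusable phone when the required phone context is missing from the database can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `CONFUSABLE` map, the `(left, phone, right)` context key, the unit identifiers, and the `find_unit` helper are hypothetical and are not taken from the paper, and the confusion pairs shown are placeholders rather than measured Tamil results.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical confusion map: phone -> perceptually confusable phones, best first.
# These pairs are placeholders, not results reported in the paper.
CONFUSABLE: Dict[str, List[str]] = {
    "n": ["m"],
    "p": ["b"],
}

# Assumed database key: (left phone, target phone, right phone).
Context = Tuple[str, str, str]


def find_unit(db: Dict[Context, str], left: str, phone: str, right: str) -> Optional[str]:
    """Return a unit id for the phone in context, falling back to a confusable phone."""
    exact = db.get((left, phone, right))
    if exact is not None:
        return exact
    # No exact context in the database: try each confusable phone in the same context.
    for alt in CONFUSABLE.get(phone, []):
        unit = db.get((left, alt, right))
        if unit is not None:
            return unit
    return None


# Toy database with two contexts (unit ids are made up).
db = {("a", "m", "a"): "unit_017", ("a", "p", "a"): "unit_042"}

print(find_unit(db, "a", "p", "a"))  # exact context is present
print(find_unit(db, "a", "n", "a"))  # context missing, falls back to "m"
print(find_unit(db, "a", "k", "a"))  # no exact or confusable context
```

A real system would presumably rank candidates by a join or target cost instead of taking the first hit; the sketch only shows the fallback lookup itself.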
Pages: 229 / +
Page count: 2