Multilingual Text-to-Speech Software Component for Dynamic Language Identification and Voice Switching

Cited by: 0
Authors
Fogarassy-Neszly, Paul [1]
Pribeanu, Costin [2]
Affiliations
[1] BAUM Engn, 8 Str Traian Mosoiu, Arad 310175, Romania
[2] Natl Inst Res & Dev Informat ICI Bucharest, 8-10 Maresal Averescu Blvd, Bucharest 011455, Romania
Source
STUDIES IN INFORMATICS AND CONTROL | 2016, Vol. 25, No. 3
Keywords
multilingual text-to-speech; dynamic language identification; voice switching; accessibility; assistive technologies; visually impaired users; usability
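DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract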
Text-to-speech synthesis is a critical feature of applications developed for people with visual or reading disabilities. In recent years there has been increasing interest in multilingual text-to-speech synthesis, which requires multilingual text analysis and language-specific speech synthesis. In this setting, the synthetic voice must be switched dynamically to enhance usability and the user experience. This paper presents a software component for multilingual text-to-speech synthesis. The software was developed and tested in four steps: alpha version (proof of concept), functional version (beta), commercial version, and implementation. The beta testing results showed high accuracy of the language detection algorithms, which perform properly on texts with a variable degree of fragmentation. The commercial version was then successfully implemented in two applications for visually impaired people: an automatic reading machine and a personal organizer for blind and visually impaired users. Both implementations were tested with users for usability and acceptance. The evaluation results showed that a device with this component is easier for visually impaired people to use.
Pages: 335-342
Page count: 8
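
The abstract describes identifying the language of text fragments and switching the synthetic voice to match. The record gives no details of the authors' detection algorithms, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a simple stopword-counting detector; the stopword lists, the voice names in `VOICES`, and the functions `identify_language` and `plan_speech` are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword lists (assumption, not from the paper).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "for", "this"},
    "ro": {"și", "este", "de", "la", "în", "pentru", "acest", "un"},
    "de": {"der", "die", "und", "ist", "das", "für", "ein", "nicht"},
}

# Hypothetical voice identifiers, one per supported language.
VOICES = {"en": "voice-en", "ro": "voice-ro", "de": "voice-de"}


def identify_language(fragment, default):
    """Guess a fragment's language by counting stopword hits.

    Falls back to `default` when no stopword from any list is found.
    """
    words = re.findall(r"\w+", fragment.lower())
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def plan_speech(text, default="en"):
    """Split text into sentences and return a list of (voice, text) pairs.

    Consecutive sentences in the same language are merged, so the voice
    only switches where the detected language changes.
    """
    fragments = [f.strip() for f in re.split(r"(?<=[.!?])\s+", text) if f.strip()]
    plan = []
    current = default
    for fragment in fragments:
        # An undecidable fragment inherits the language of the previous one.
        current = identify_language(fragment, current)
        voice = VOICES[current]
        if plan and plan[-1][0] == voice:
            plan[-1] = (voice, plan[-1][1] + " " + fragment)
        else:
            plan.append((voice, fragment))
    return plan


if __name__ == "__main__":
    sample = (
        "This is the agenda for today. "
        "Acesta este un document pentru test. "
        "Das ist nicht der Plan."
    )
    for voice, chunk in plan_speech(sample):
        print(voice, "->", chunk)
```

Each `(voice, text)` pair would then be handed to whatever speech engine is in use. Letting an undecidable fragment inherit the previous language is one simple way to cope with short or fragmented text. A production detector would need far richer models than stopword counts.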