Multilingual Text-to-Speech Software Component for Dynamic Language Identification and Voice Switching

Cited by: 0

Authors:

Fogarassy-Neszly, Paul ^{[1]}

Pribeanu, Costin ^{[2]}

Affiliations:

[1] BAUM Engn, 8 Str Traian Mosoiu, Arad 310175, Romania

[2] Natl Inst Res & Dev Informat ICI Bucharest, 8-10 Maresal Averescu Blvd, Bucharest 011455, Romania

Source:

STUDIES IN INFORMATICS AND CONTROL | 2016 / Vol. 25 / Issue 03

Keywords:

multilingual text-to-speech; dynamic language identification; voice switching; accessibility; assistive technologies; visually impaired users; usability

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Text-to-speech synthesis is a critical feature of applications developed for people with visual or reading disabilities. In recent years, there has been increasing interest in multilingual text-to-speech synthesis, which requires multilingual text analysis and language-specific speech synthesis. In this case, dynamic switching of the synthetic voice is needed to enhance usability and user experience. This paper presents a software component for multilingual text-to-speech synthesis. The software was developed and tested in four steps: alpha version (proof of concept), functional version (beta), commercial version, and implementation. The beta testing results showed high accuracy of the language detection algorithms, which perform properly on texts with a variable degree of fragmentation. The commercial version was then successfully implemented in two applications for visually impaired people: an automatic reading machine and a personal organizer for blind and visually impaired users. Both implementations were tested with users for usability and acceptance. The evaluation results showed that a device with this component is easier for visually impaired people to use.

Citation

Pages: 335-342

Number of pages: 8
