Diction based prosody modeling in table-to-speech synthesis

Citations: 0
Authors
Spiliotopoulos, D [1]
Xydas, G [1]
Kouroupetroglou, G [1]
Affiliation
[1] Univ Athens, Dept Informat & Telecommun, GR-10679 Athens, Greece
Source
TEXT, SPEECH AND DIALOGUE, PROCEEDINGS | 2005 / Vol. 3658
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transferring a structure from the visual modality to the aural one is a difficult challenge. In this work we experiment with prosody modeling for the synthesized speech representation of tabulated structures. This is achieved by analyzing naturally spoken descriptions of data tables, followed by feedback from blind and sighted users. The derived placement and values of prosodic phrase accents and pause breaks are examined in terms of how successfully they convey semantically important visual information through prosody control in Table-to-Speech synthesis. Finally, the quality of information provision of synthesized tables using the proposed prosody specification is compared against plain synthesis.
Pages: 294-301
Page count: 8