Investigating accuracy of pitch-accent annotations in neural network-based

Authors
Luong, Hieu-Thi [1]
Wang, Xin [1]
Yamagishi, Junichi [1]
Nishizawa, Nobuyuki [2]
Affiliations
[1] Natl Inst Informat, Tokyo, Japan
[2] KDDI Res Inc, Saitama, Japan
Keywords
speech synthesis; deep neural network; Japanese prosody; WaveNet
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We investigated the impact of noisy linguistic features on the performance of a neural-network-based Japanese speech synthesis system that uses a WaveNet vocoder. We compared an ideal system, which uses manually corrected linguistic features (including phoneme and prosodic information) in both the training and test sets, against a few other systems that use corrupted linguistic features. Both subjective and objective results show that corrupted linguistic features, especially those in the test set, had a statistically significant effect on the ideal system's performance because of the mismatch between the training and test conditions. Interestingly, while an utterance-level Turing test showed that listeners had difficulty distinguishing synthetic speech from natural speech, it also indicated that adding noise to the linguistic features in the training set can partially reduce the effect of the mismatch, regularize the model, and help the system perform better when the linguistic features of the test set are noisy.
Pages: 37-41
Page count: 5