Investigating accuracy of pitch-accent annotations in neural network-based

Authors
Luong, Hieu-Thi [1]
Wang, Xin [1]
Yamagishi, Junichi [1]
Nishizawa, Nobuyuki [2]
Affiliations
[1] Natl Inst Informat, Tokyo, Japan
[2] KDDI Res Inc, Saitama, Japan
Keywords
speech synthesis; deep neural network; Japanese prosody; WaveNet
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We investigated the impact of noisy linguistic features on the performance of a neural network-based Japanese speech synthesis system that uses a WaveNet vocoder. We compared an ideal system, which uses manually corrected linguistic features (including phoneme and prosodic information) in both the training and test sets, against a few other systems that use corrupted linguistic features. Both subjective and objective results demonstrate that corrupted linguistic features, especially those in the test set, had a statistically significant effect on the ideal system's performance because of the mismatch between the training and test conditions. Interestingly, while an utterance-level Turing test showed that listeners had difficulty differentiating synthetic speech from natural speech, it further indicated that adding noise to the linguistic features in the training set can partially reduce the effect of the mismatch, regularize the model, and help the system perform better when the linguistic features of the test set are noisy.
Pages: 37-41
Page count: 5