Cross-Lingual Speaker Discrimination Using Natural and Synthetic Speech

Cited by: 0
Authors
Wester, Mirjam [1 ]
Liang, Hui [1 ]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9YL, Midlothian, Scotland
Keywords
speaker discrimination; speaker adaptation; HMM-based speech synthesis; adaptation
DOI
Not available
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
This paper describes speaker discrimination experiments in which native English listeners were presented with natural speech stimuli in English and Mandarin, synthetic speech stimuli in English and Mandarin, or natural Mandarin speech paired with synthetic English speech. In each experiment, listeners were asked to judge whether the two sentences in a pair were spoken by the same person. The results for Mandarin/English speaker discrimination were very similar to those of previous work on German/English and Finnish/English speaker discrimination. From this and previous work we conclude that listeners can discriminate between speakers across languages or across speech types (natural versus synthetic), but the combination of both factors makes the task too difficult for listeners to perform successfully, given that the quality of cross-language speaker-adapted synthetic speech still needs to be improved.
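The abstract describes a same/different judgment task over sentence pairs. By way of illustration only (this is not the authors' analysis code; the function name score_trials and the toy responses are invented for the sketch), such responses are commonly scored with percent correct and the detection-theoretic sensitivity index d', as in this minimal Python sketch:

```python
# Minimal sketch (illustration only, not the paper's analysis code):
# scoring a same/different speaker discrimination test with percent
# correct and the detection-theoretic sensitivity index d'.
from statistics import NormalDist

def score_trials(trials):
    """trials: list of (is_same_speaker, judged_same) boolean pairs."""
    n_same = sum(1 for same, _ in trials if same)
    n_diff = len(trials) - n_same
    # Hit = a same-speaker pair correctly judged "same";
    # false alarm = a different-speaker pair judged "same".
    hits = sum(1 for same, judged in trials if same and judged)
    false_alarms = sum(1 for same, judged in trials if not same and judged)

    # Clamp rates away from 0 and 1 so the inverse-normal transform stays finite.
    def rate(k, n):
        return min(max(k / n, 0.5 / n), 1.0 - 0.5 / n)

    z = NormalDist().inv_cdf
    d_prime = z(rate(hits, n_same)) - z(rate(false_alarms, n_diff))
    percent_correct = (hits + (n_diff - false_alarms)) / len(trials)
    return percent_correct, d_prime

# Toy data: four trials from one hypothetical listener.
trials = [(True, True), (True, False), (False, False), (False, True)]
pc, dp = score_trials(trials)
print(f"percent correct = {pc:.2f}, d' = {dp:.2f}")
```

Unlike raw percent correct, d' separates a listener's sensitivity from any bias toward answering "same", which is why it is a standard summary for this kind of paired discrimination test.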
Pages: 2492-2495
Page count: 4