HIGH-FIDELITY NEURAL PHONETIC POSTERIORGRAMS

Authors
Churchwell, Cameron [1]
Morrison, Max [1]
Pardo, Bryan [1]
Affiliations
[1] Northwestern Univ, Evanston, IL 60208 USA
Keywords
interpretable; ppg; pronunciation; representation
DOI
10.1109/ICASSPW62465.2024.10669905
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A phonetic posteriorgram (PPG) is a time-varying categorical distribution over acoustic units of speech (e.g., phonemes). PPGs are a popular representation in speech generation due to their ability to disentangle pronunciation features from speaker identity, allowing accurate reconstruction of pronunciation (e.g., voice conversion) and coarse-grained pronunciation editing (e.g., foreign accent conversion). In this paper, we demonstrably improve the quality of PPGs to produce a state-of-the-art interpretable PPG representation. We train an off-the-shelf speech synthesizer using our PPG representation and show that high-quality PPGs yield independent control over pitch and pronunciation. We further demonstrate novel uses of PPGs, such as an acoustic pronunciation distance and fine-grained pronunciation control.
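The definition in the abstract can be made concrete with a minimal sketch. It assumes a PPG is stored as a `(frames, phonemes)` array whose rows are categorical distributions obtained by a softmax over per-frame classifier logits. The logits here are random placeholders, and the frame-wise Jensen-Shannon divergence is only one plausible choice of pronunciation distance for illustration; the abstract does not specify which distance the authors use. All function names are hypothetical.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def framewise_js_distance(ppg_a, ppg_b, eps=1e-12):
    """Mean per-frame Jensen-Shannon divergence (base 2, so roughly in [0, 1]).

    Assumption: both PPGs are time-aligned and have shape (frames, phonemes).
    This is an illustrative distance, not necessarily the one in the paper.
    """
    p = np.clip(ppg_a, eps, 1.0)  # avoid log(0)
    q = np.clip(ppg_b, eps, 1.0)
    m = 0.5 * (p + q)             # per-frame mixture distribution
    kl_pm = np.sum(p * np.log2(p / m), axis=-1)
    kl_qm = np.sum(q * np.log2(q / m), axis=-1)
    return float(np.mean(0.5 * (kl_pm + kl_qm)))


# Placeholder logits standing in for a phoneme classifier's output:
# 100 frames, 40 phoneme categories (both sizes are arbitrary).
rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 40))
ppg = softmax(logits)  # each row is a categorical distribution over phonemes

# Two toy one-frame PPGs that put all mass on different phonemes.
one_hot_a = np.array([[1.0, 0.0]])
one_hot_b = np.array([[0.0, 1.0]])
```

Under this sketch, identical PPGs have distance 0, and frames that place all their probability on different phonemes approach the maximum distance of 1.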
Pages: 823-827
Page count: 5