High-Fidelity and Pitch-Controllable Neural Vocoder Based on Unified Source-Filter Networks

Cited by: 2
Authors
Yoneyama, Reo [1]
Wu, Yi-Chiao [1]
Toda, Tomoki [2]
Affiliations
[1] Nagoya Univ, Grad Sch Informat, Nagoya 4648601, Japan
[2] Nagoya Univ, Informat Technol Ctr, Nagoya 4648601, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Vocoders; Controllability; Speech processing; Neural networks; Training; Mathematical models; Acoustics; Speech synthesis; neural vocoder; source-filter model; unified source-filter networks; WAVE-FORM GENERATION; SPEECH SYNTHESIS; MODEL;
DOI
10.1109/TASLP.2023.3313410
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We introduce unified source-filter generative adversarial networks (uSFGAN), a waveform generative model conditioned on acoustic features, which represents the source-filter architecture in a generator network. Unlike previous neural source-filter models, in which parametric signal processing modules are combined with neural networks, our approach enables unified optimization of both the source excitation generation and resonance filtering parts to achieve higher sound quality. In the uSFGAN framework, several specific regularization losses are proposed to enable the source excitation generation part to output reasonable source excitation signals. Both objective and subjective experiments are conducted, and the results demonstrate that the proposed uSFGAN achieves sound quality comparable to HiFi-GAN in the speech reconstruction task and outperforms WORLD in the F0 transformation task. Moreover, we argue that the F0-driven mechanism and the inductive bias obtained by source-filter modeling improve robustness to F0 values unseen in training, as shown by the results of the experimental evaluations. Audio samples are available on our demo site at https://chomeyama.github.io/PitchControllableNeuralVocoder-Demo/.
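For readers unfamiliar with the source-filter idea the abstract builds on, the following is a minimal sketch of the classical signal-processing view only, not of uSFGAN itself: an F0-driven sine excitation (the source) is passed through a single two-pole resonator (the filter). In uSFGAN both parts are neural networks optimized jointly; here they are fixed hand-written functions. All function names, the 16 kHz sample rate, and the 500 Hz / 100 Hz resonance settings are illustrative assumptions, not values from the paper.

```python
import math


def sine_excitation(f0_per_sample, sample_rate):
    """F0-driven source: a sine whose instantaneous frequency follows F0.

    Samples with F0 == 0 are treated as unvoiced and set to zero
    (an assumption of this sketch).
    """
    phase = 0.0
    out = []
    for f0 in f0_per_sample:
        if f0 > 0:
            phase += 2.0 * math.pi * f0 / sample_rate
            out.append(math.sin(phase))
        else:
            out.append(0.0)
    return out


def resonance_filter(signal, center_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator standing in for one vocal-tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1
    theta = 2.0 * math.pi * center_hz / sample_rate      # pole angle
    a1, a2 = -2.0 * r * math.cos(theta), r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def count_upward_zero_crossings(signal):
    """Rough cycle counter used to check the pitch of a signal."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a < 0.0 <= b)


def synthesize(f0_per_sample, sample_rate=16000,
               center_hz=500.0, bandwidth_hz=100.0):
    """Return (source, output) for the hypothetical source-filter chain."""
    source = sine_excitation(f0_per_sample, sample_rate)
    output = resonance_filter(source, center_hz, bandwidth_hz, sample_rate)
    return source, output


sample_rate = 16000
f0 = [100.0] * sample_rate                      # one second at 100 Hz
source, output = synthesize(f0, sample_rate)
shifted_source, shifted_output = synthesize([2.0 * f for f in f0], sample_rate)
silent_source, silent_output = synthesize([0.0] * 100, sample_rate)
```

Because pitch enters only through the excitation, scaling the F0 contour changes the number of cycles per second in the source while the resonance settings stay untouched. That separation is the inductive bias the abstract refers to when it discusses F0 transformation.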
Pages: 3717-3729
Number of pages: 13