Direct F0 Estimation with Neural-Network-based Regression

Cited by: 5
Authors
Xu, Shuzhuang [1]
Shimodaira, Hiroshi [2]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Source
Keywords
fundamental frequency; pitch tracking; neural network; PITCH; TRACKING;
DOI
10.21437/Interspeech.2019-3267
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Pitch tracking, the continuous extraction of fundamental frequency (F0) from speech waveforms, is vital to many applications in speech analysis and synthesis. Speech pitch tracking has been investigated extensively, both with conventional trackers such as Praat, RAPT and YIN and with more recent neural-network-based ones such as DNN-CLS, CREPE and RNN-REG. This work developed a different end-to-end regression model based on neural networks, in which a voice detector and a newly proposed value estimator work jointly to highlight the trajectory of fundamental frequency. Experiments on the PTDB-TUG corpus showed that the system surpasses canonical neural networks in terms of gross error rate. It further outperformed conventional trackers under clean conditions and neural-network classifiers under noisy conditions using noise from the NOISEX-92 corpus.
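As a rough illustration of the two ideas named in the abstract, the sketch below combines a per-frame voicing decision with a per-frame regressed F0 value, then scores the result with a gross error rate. This is not the authors' implementation: the function names, the 0.5 voicing threshold, the use of 0 Hz to mark unvoiced frames, and the 20% relative tolerance (a common convention for gross pitch error) are all assumptions made for the example.

```python
import numpy as np


def combine_outputs(voicing_prob, f0_values, threshold=0.5):
    """Keep the regressed F0 where a frame is judged voiced, else output 0 Hz.

    `threshold` is an assumed decision boundary, not a value from the paper.
    """
    voicing_prob = np.asarray(voicing_prob, dtype=float)
    f0_values = np.asarray(f0_values, dtype=float)
    return np.where(voicing_prob >= threshold, f0_values, 0.0)


def gross_error_rate(f0_ref, f0_est, tolerance=0.2):
    """Fraction of frames voiced in both tracks whose relative error exceeds `tolerance`.

    Unvoiced frames are assumed to be encoded as 0 Hz in both tracks.
    """
    f0_ref = np.asarray(f0_ref, dtype=float)
    f0_est = np.asarray(f0_est, dtype=float)
    both_voiced = (f0_ref > 0) & (f0_est > 0)
    if not both_voiced.any():
        return 0.0
    rel_err = np.abs(f0_est[both_voiced] - f0_ref[both_voiced]) / f0_ref[both_voiced]
    return float(np.mean(rel_err > tolerance))


# Hypothetical four-frame example (values in Hz, invented for illustration).
reference = [100.0, 200.0, 0.0, 150.0]
voicing = [0.9, 0.8, 0.1, 0.7]
regressed = [105.0, 300.0, 120.0, 150.0]

estimate = combine_outputs(voicing, regressed)
rate = gross_error_rate(reference, estimate)
```

In this toy input, three frames are voiced in both tracks and one of them (300 Hz against a 200 Hz reference) falls outside the tolerance, so one frame in three counts as a gross error.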
Pages: 1995-1999
Number of pages: 5
Related papers
50 in total
  • [41] Effects of F0 Estimation Algorithms on Ultrasound-Based Silent Speech Interfaces
    Dai, Pengyu
    Al-Radhi, Mohammed Salah
    Csapo, Tamas Gabor
    2021 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2021, : 47 - 51
  • [42] Hear Your Face: Face-based voice conversion with F0 estimation
    Lee, Jaejun
    Oh, Yoori
    Hwang, Injune
    Lee, Kyogu
    INTERSPEECH 2024, 2024, : 4378 - 4382
  • [43] Neural-Network-Based Estimation Method for Ultraviolet Scattering Channel Under Turbulence
    Zhao Taifei
    Lu Xinzhe
    Sun Yuxin
    Zhang Shuang
    ACTA OPTICA SINICA, 2021, 41 (24)
  • [44] Neural-Network-Based DOA Estimation in the Presence of Non-Gaussian Interference
    Feintuch, Stefan
    Tabrikian, Joseph
    Bilik, Igal
    Permuter, Haim
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2024, 60 (01) : 119 - 132
  • [45] Neural-network-based real-time human body posture estimation
    Takahashi, K
    Uemura, T
    Ohya, J
    NEURAL NETWORKS FOR SIGNAL PROCESSING X, VOLS 1 AND 2, PROCEEDINGS, 2000, : 477 - 486
  • [46] Neural-network-based regression model of ground surface settlement induced by deep excavation
    Leu, SS
    Lo, HC
    AUTOMATION IN CONSTRUCTION, 2004, 13 (03) : 279 - 289
  • [47] Neural-network-based real-time human body posture estimation
    Takahashi, Kazuhiko
    Uemura, Tetsuya
    Ohya, Jun
    Neural Networks for Signal Processing - Proceedings of the IEEE Workshop, 2000, 2 : 477 - 486
  • [48] On Evaluation of the F0 estimation based on time-varying complex speech analysis
    Funaki, Keiichi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 637 - 640
  • [49] Neural-Network-Based Optimal Mode Estimation for Adaptive Affine Motion Compensation
    Kitamura, Takahiro
    Yoshida, Toshiyuki
    INTERNATIONAL WORKSHOP ON ADVANCED IMAGING TECHNOLOGY (IWAIT) 2021, 2021, 11766
  • [50] Neural-Network-Based Estimation Method for Ultraviolet Scattering Channel Under Turbulence
    Zhao T.
    Lü X.
    Sun Y.
    Zhang S.
    Guangxue Xuebao/Acta Optica Sinica, 2021, 41 (24):