Improving the Accuracy and the Robustness of Harmonic Model for Pitch Estimation

Cited by: 0
Authors
Asgari, Meysam [1]
Shafran, Izhak [1]
Affiliations
[1] Oregon Hlth & Sci Univ, Ctr Spoken Language Understanding, Portland, OR 97201 USA
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Funding
National Science Foundation (USA);
Keywords
fundamental frequency estimation; robust pitch estimation;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurate and robust estimation of pitch plays a central role in speech processing. Various methods in the time, frequency, and cepstral domains have been proposed for generating pitch candidates. Most algorithms excel when the background noise is minimal or for specific types of background noise. In this work, our aim is to improve the robustness and accuracy of pitch estimation across a wide variety of background noise conditions. For this, we have chosen to adopt the harmonic model of speech, a model that has gained considerable attention recently. We address two major weaknesses of this model: the problem of pitch halving and doubling, and the need to specify the number of harmonics. We exploit the energy of neighboring frequencies to alleviate halving and doubling. Using a model complexity term based on the Bayesian information criterion (BIC), we choose the optimal number of harmonics. We evaluated our proposed pitch estimation method against other state-of-the-art techniques on the Keele data set in terms of gross pitch error and fine pitch error. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments.
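To make the idea of choosing the number of harmonics with BIC more concrete, here is a minimal sketch rather than the authors' implementation. It assumes a single frame modeled as a constant term plus cosine and sine terms at multiples of a known candidate pitch, fitted by ordinary least squares, and scored with the standard Gaussian-noise form of BIC. The function names, the `max_harmonics` default, and the residual floor are hypothetical choices made for illustration. The halving and doubling correction described in the abstract is not covered.

```python
import numpy as np


def harmonic_basis(f0, n_harmonics, n_samples, fs):
    """Design matrix: a constant column plus cos/sin columns at k * f0."""
    t = np.arange(n_samples) / fs
    k = np.arange(1, n_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    return np.hstack([np.ones((n_samples, 1)), np.cos(phase), np.sin(phase)])


def bic_score(frame, f0, n_harmonics, fs):
    """BIC of a least-squares harmonic fit, assuming white Gaussian noise."""
    n = len(frame)
    basis = harmonic_basis(f0, n_harmonics, n, fs)
    coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    rss = float(np.sum((frame - basis @ coef) ** 2))
    rss = max(rss, 1e-12)  # assumed floor so log() stays finite on exact fits
    n_params = 2 * n_harmonics + 1
    return n * np.log(rss / n) + n_params * np.log(n)


def select_num_harmonics(frame, f0, fs, max_harmonics=10):
    """Return the harmonic count in 1..max_harmonics with the lowest BIC."""
    scores = [bic_score(frame, f0, h, fs) for h in range(1, max_harmonics + 1)]
    return int(np.argmin(scores)) + 1
```

Calling `select_num_harmonics(frame, f0, fs)` on a voiced frame returns the harmonic count that best trades residual energy against the `n_params * log(n)` penalty. In a full estimator this selection would be repeated for each pitch candidate, with `max_harmonics * f0` kept below the Nyquist frequency `fs / 2`.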
Pages: 1935-1939
Page count: 5
Related papers
50 in total
  • [41] Fundamental frequency estimation based on pitch-scaled harmonic filtering
    Roa, Sergio
    Bennewitz, Maren
    Behnke, Sven
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 397 - +
  • [42] Pitch Estimation Using Harmonic Product Spectrum derived from DCT
    Sripriya, N.
    Nagarajan, T.
    2013 IEEE INTERNATIONAL CONFERENCE OF IEEE REGION 10 (TENCON), 2013,
  • [43] Partial magnitude rearrangement and harmonic relation confirmation for multiple pitch estimation
    Chen, Xuemei
    Liu, Ruolun
    IET SIGNAL PROCESSING, 2015, 9 (08) : 611 - 617
  • [44] Targeted Data Augmentation for Improving Model Robustness
    Mikolajczyk-Barela, Agnieszka
    Ferlin, Maria
    Grochowski, Michal
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2025, 35 (01) : 143 - 155
  • [45] Signal reshaping using dominant harmonic for pitch estimation of noisy speech
    Hasan, MK
    Hussain, S
    Setu, MTH
    Nazrul, MNI
    SIGNAL PROCESSING, 2006, 86 (05) : 1010 - 1018
  • [46] Improving pitch estimation for efficient multiband excitation coding of speech
    Chan, CF
    Yu, EWM
    ELECTRONICS LETTERS, 1996, 32 (10) : 870 - 872
  • [47] Improving Accuracy and Robustness of Self-Tuning Histograms by Subspace Clustering
    Khachatryan, Andranik
    Mueller, Emmanuel
    Boehm, Klemens
    Stier, Christian
    2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016, : 1544 - 1545
  • [48] Sensor fusion for improving the estimation of roll and pitch for an agricultural sprayer
    Khot, L. R.
    Tang, L.
    Steward, B. L.
    Han, S.
    BIOSYSTEMS ENGINEERING, 2008, 101 (01) : 13 - 20
  • [49] Improving pitch estimation for efficient multiband excitation coding of speech
    City Univ of Hong Kong, Kowloon, Hong Kong
    Electron Lett, 10 (870-872):
  • [50] Improving biometric recognition accuracy and robustness using DWT and SVM watermarking
    Vatsa, Mayank
    Singh, Richa
    Noore, Afzel
    IEICE ELECTRONICS EXPRESS, 2005, 2 (12) : 362 - 367