A SPECTRO-TEMPORAL TECHNIQUE FOR ESTIMATING APERIODICITY AND VOICED/UNVOICED DECISION BOUNDARIES OF SPEECH SIGNALS

Cited by: 0
Authors
Dhiman, Jitendra Kumar [1]
Seelamantula, Chandra Sekhar [1]
Affiliations
[1] Indian Inst Sci, Dept Elect Engn, Bangalore 560012, Karnataka, India
Keywords
aperiodicity in 2-D; band-wise aperiodicity parameters; carrier spectrogram; coherence map; DEMODULATION; SYSTEM
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In contrast to a 1-D short-time analysis of speech, 2-D approaches aim to characterize the attributes of the speech signal jointly in time and frequency. In this paper, we focus on the quasi-periodicity of a voiced spectro-temporal patch and quantify it by proposing an aperiodicity measure defined using the underlying frequency modulations in the patch. We further propose a time-frequency aperiodicity map obtained by overlapping and adding the aperiodicity measures across patches. The proposed aperiodicity map is used to obtain band-wise aperiodicity parameters, which are essential for high-quality speech synthesis. The aperiodicity in unvoiced patches is addressed by identifying them using the coherence of the patch. In addition, the proposed technique provides the voiced/unvoiced decision boundaries of a speech signal. The effectiveness of the proposed band-wise aperiodicity parameters and voiced/unvoiced decisions is verified by incorporating them into an existing state-of-the-art vocoder for speech synthesis. Subjective listening tests show that the quality of the reconstructed speech is on par with that of the state-of-the-art WORLD vocoder in terms of mean opinion score, indicating that spectro-temporal approaches are highly promising for speech analysis and synthesis applications.
Pages: 6510-6514
Page count: 5
Related papers
50 in total
  • [1] Aging and Spectro-Temporal Integration of Speech
    Grose, John H.
    Porter, Heather L.
    Buss, Emily
    TRENDS IN HEARING, 2016, 20
  • [2] Voiced/Unvoiced Decision for Speech Signals Based on Zero-Crossing Rate and Energy
    Bachu, R. G.
    Kopparthi, S.
    Adapa, B.
    Barkana, B. D.
    ADVANCED TECHNIQUES IN COMPUTING SCIENCES AND SOFTWARE ENGINEERING, 2010, : 279 - 282
  • [3] A multifeature voiced/unvoiced decision algorithm for noisy speech
    Shahnaz, C.
    Zhu, W. -P.
    Ahmad, M. O.
    2006 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, PROCEEDINGS, 2006, : 2525 - +
  • [4] Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling
    Karjigi, V.
    Rao, P.
    SPEECH COMMUNICATION, 2012, 54 (10) : 1104 - 1120
  • [5] Proposed a new approach for voiced / Unvoiced decision of speech file using Lagrange technique
    Hassan, N.F. (nidaaalalousi@yahoo.com), 1600, Begell House Inc. (72):
  • [6] Development of spectro-temporal features of speech in children
    Gautam S.
    Singh L.
    Gautam, Sumanlata (suman.gautam82@gmail.com), 1600, Springer Science and Business Media, LLC (20): : 543 - 551
  • [7] SPECTRO-TEMPORAL NEURAL FACTORIZATION FOR SPEECH DEREVERBERATION
    Chien, Jen-Tzung
    Kuo, Kuan-Ting
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5449 - 5453
  • [8] Modeling of Voiced and Unvoiced Speech Signals Using Fractional Calculus
    Abraham, Anju
    Harsha, A.
    2016 INTERNATIONAL CONFERENCE ON NEXT GENERATION INTELLIGENT SYSTEMS (ICNGIS), 2016, : 35 - 39
  • [9] A pitch determination and voiced/unvoiced decision algorithm for noisy speech
    Rouat, J
    Liu, YC
    Morissette, D
    SPEECH COMMUNICATION, 1997, 21 (03) : 191 - 207
  • [10] Two-speaker Voiced/Unvoiced Decision for Monaural Speech
    Zeremdini, Jihen
    Ben Messaoud, Mohamed Anouar
    Bouzid, Aicha
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2020, 39 (09) : 4399 - 4415