A SPECTRO-TEMPORAL TECHNIQUE FOR ESTIMATING APERIODICITY AND VOICED/UNVOICED DECISION BOUNDARIES OF SPEECH SIGNALS

Cited by: 0
Authors
Dhiman, Jitendra Kumar [1 ]
Seelamantula, Chandra Sekhar [1 ]
Affiliations
[1] Indian Inst Sci, Dept Elect Engn, Bangalore 560012, Karnataka, India
Keywords
aperiodicity in 2-D; band-wise aperiodicity parameters; carrier spectrogram; coherence map; DEMODULATION; SYSTEM
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In contrast to a 1-D short-time analysis of speech, 2-D approaches aim at characterizing the speech signal attributes jointly in time and frequency. In this paper, we focus on the quasi-periodicity of a voiced spectro-temporal patch and quantify it by proposing an aperiodicity measure defined using the underlying frequency modulations in the patch. We further propose a time-frequency aperiodicity map obtained by overlapping and adding the aperiodicity measures across patches. The proposed aperiodicity map is utilized to obtain band-wise aperiodicity parameters, which are essential for high-quality speech synthesis. The aperiodicity in unvoiced patches is addressed by identifying them using the coherence of the patch. In addition, the proposed technique provides voiced/unvoiced decision boundaries of a speech signal. The effectiveness of the proposed band-wise aperiodicity parameters and voiced/unvoiced decisions is verified by incorporating them in an existing state-of-the-art vocoder for speech synthesis. Subjective listening tests show that the quality of the reconstructed speech is on par with that of the state-of-the-art WORLD vocoder in terms of mean opinion score, indicating that spectro-temporal approaches are highly promising for speech analysis and synthesis applications.
Pages: 6510 - 6514
Number of pages: 5
Related papers
50 in total
  • [31] Spectro-temporal modulation glimpsing for speech intelligibility prediction
    Edraki, Amin
    Chan, Wai-Yip
    Jensen, Jesper
    Fogerty, Daniel
    HEARING RESEARCH, 2022, 426
  • [32] Spectro-temporal modulation transfer functions and speech intelligibility
    Chi, TS
    Gao, YJ
    Guyton, MC
    Ru, PW
    Shamma, S
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1999, 106 (05) : 2719 - 2732
  • [33] Spectro-temporal weighting of interaural time differences in speech
    Baltzell, Lucas S.
    Cho, Adrian Y.
    Swaminathan, Jayaganesh
    Best, Virginia
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2020, 147 (06) : 3883 - 3894
  • [34] Speech discrimination based on multiscale spectro-temporal modulations
    Mesgarani, N
    Shamma, S
    Slaney, M
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004 : 601 - 604
  • [35] NON-INTRUSIVE QUALITY ASSESSMENT FOR ENHANCED SPEECH SIGNALS BASED ON SPECTRO-TEMPORAL FEATURES
    Li, Qiaohong
    Fang, Yuming
    Lin, Weisi
    Thalmann, Daniel
    2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2014
  • [36] Second generation wavelet transform-based pitch period estimation and voiced/unvoiced decision for speech signals
    Erçelebi, E
    APPLIED ACOUSTICS, 2003, 64 (01) : 25 - 41
  • [37] Estimating sparse spectro-temporal receptive fields with natural stimuli
    David, Stephen V.
    Mesgarani, Nima
    Shamma, Shihab A.
    NETWORK-COMPUTATION IN NEURAL SYSTEMS, 2007, 18 (03) : 191 - 212
  • [38] SPECTRO-TEMPORAL ANALYSIS OF SPEECH AFFECTED BY DEPRESSION AND PSYCHOMOTOR RETARDATION
    Cummins, Nicholas
    Epps, Julien
    Ambikairajah, Eliathamby
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013 : 7542 - 7546
  • [39] A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
    Elhilali, M
    Chi, T
    Shamma, SA
    SPEECH COMMUNICATION, 2003, 41 (2-3) : 331 - 348
  • [40] Spectro-temporal processing of speech - An information-theoretic framework
    Christiansen, Thomas U.
    Dau, Torsten
    Greenberg, Steven
    HEARING - FROM SENSORY PROCESSING TO PERCEPTION, 2007 : 517 - 523