A SPECTRO-TEMPORAL TECHNIQUE FOR ESTIMATING APERIODICITY AND VOICED/UNVOICED DECISION BOUNDARIES OF SPEECH SIGNALS

Cited by: 0

Authors:

Dhiman, Jitendra Kumar [1]

Seelamantula, Chandra Sekhar [1]

Affiliation:

[1] Indian Inst Sci, Dept Elect Engn, Bangalore 560012, Karnataka, India

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Keywords:

aperiodicity in 2-D; band-wise aperiodicity parameters; carrier spectrogram; coherence map; DEMODULATION; SYSTEM;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

In contrast to 1-D short-time analysis of speech, 2-D approaches aim to characterize the attributes of the speech signal jointly in time and frequency. In this paper, we focus on the quasi-periodicity of a voiced spectro-temporal patch and quantify it by proposing an aperiodicity measure defined using the underlying frequency modulations in the patch. We further propose a time-frequency aperiodicity map obtained by overlapping and adding the aperiodicity measures across patches. The proposed aperiodicity map is used to obtain band-wise aperiodicity parameters, which are essential for high-quality speech synthesis. Aperiodicity in unvoiced patches is addressed by identifying those patches using the coherence of the patch. In addition, the proposed technique provides the voiced/unvoiced decision boundaries of a speech signal. The effectiveness of the proposed band-wise aperiodicity parameters and voiced/unvoiced decisions is verified by incorporating them in an existing state-of-the-art vocoder for speech synthesis. Subjective listening tests show that the quality of the reconstructed speech is on par with that of the state-of-the-art WORLD vocoder in terms of mean opinion score, indicating that spectro-temporal approaches are highly promising for speech analysis and synthesis applications.
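
The overlap-add step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the patch size, hop, window, or normalization, so the function names `overlap_add_map` and `bandwise_mean`, the tapered window, and the weight normalization below are all assumptions made for illustration. The per-patch aperiodicity values are taken as given inputs, since the abstract does not define the measure itself.

```python
import numpy as np

def overlap_add_map(patch_values, shape, patch_size, hop):
    """Spread one scalar per patch over a time-frequency grid by
    windowed overlap-add, normalized by the accumulated window weight.
    (Hypothetical helper; the window and normalization are assumptions.)"""
    n_freq, n_time = shape
    pf, pt = patch_size
    hf, ht = hop
    # Strictly positive 2-D taper: Hann window with its zero endpoints dropped.
    window = np.outer(np.hanning(pf + 2)[1:-1], np.hanning(pt + 2)[1:-1])
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    for i, f0 in enumerate(range(0, n_freq - pf + 1, hf)):
        for j, t0 in enumerate(range(0, n_time - pt + 1, ht)):
            acc[f0:f0 + pf, t0:t0 + pt] += window * patch_values[i, j]
            weight[f0:f0 + pf, t0:t0 + pt] += window
    out = np.full(shape, np.nan)      # bins not covered by any patch stay NaN
    covered = weight > 0
    out[covered] = acc[covered] / weight[covered]
    return out

def bandwise_mean(ap_map, band_edges):
    """Average the map over frequency bins within each band (assumed
    reduction; band edges are bin indices, one output row per band)."""
    return np.stack([
        np.nanmean(ap_map[lo:hi], axis=0)
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ])

# Toy example: an 8x8 grid tiled by 4x4 patches with a hop of 2 in each axis.
values = np.full((3, 3), 0.5)         # placeholder per-patch measures
ap_map = overlap_add_map(values, (8, 8), (4, 4), (2, 2))
bands = bandwise_mean(ap_map, [0, 4, 8])
```

Dividing by the accumulated window weight keeps the map on the same scale as the per-patch values wherever patches overlap, which is one common design choice for overlap-add averaging; other normalizations are possible.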

Pages: 6510-6514

Number of pages: 5
