Analysis and Synthesis of Speech Using an Adaptive Full-Band Harmonic Model

Cited by: 35

Authors:

Degottex, Gilles [1,2]

Stylianou, Yannis [1,2]

Affiliations:

[1] Univ Crete, Dept Comp Sci, GR-71003 Iraklion, Greece

[2] FORTH, Inst Comp Sci, GR-70013 Iraklion, Greece

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2013 / Vol. 21 / No. 10

Funding:

Swiss National Science Foundation;

Keywords:

Voice model; sinusoidal model; harmonic model; non-stationary;

DOI:

10.1109/TASL.2013.2266772

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Voice models often use frequency limits to split the speech spectrum into two or more voiced/unvoiced frequency bands. However, from the standpoint of voice production, the amplitude spectrum of the voiced source decreases smoothly without any abrupt frequency limit. Accordingly, multiband models struggle to estimate these limits and, as a consequence, artifacts can degrade the perceived quality. Using a linear frequency basis adapted to the non-stationarities of the speech signal, the Fan Chirp Transformation (FChT) has demonstrated harmonicity at frequencies higher than usually observed from the DFT, which motivates full-band modeling. The previously proposed Adaptive Quasi-Harmonic model (aQHM) offers even more flexibility than the FChT by using a non-linear frequency basis. In the current paper, exploiting the properties of aQHM, we describe a full-band Adaptive Harmonic Model (aHM) along with detailed descriptions of its corresponding algorithms for the estimation of harmonics up to the Nyquist frequency. Formal listening tests show that the speech reconstructed using aHM is nearly indistinguishable from the original speech. Experiments with synthetic signals also show that the proposed aHM globally outperforms previous sinusoidal and harmonic models in terms of precision in estimating the sinusoidal parameters. Looking ahead, such precision is useful for building higher-level models on top of the sinusoidal parameters, such as spectral envelopes for speech synthesis.
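The full-band idea in the abstract can be illustrated with a minimal synthesis sketch. It assumes the common harmonic form in which every harmonic k follows k times the integrated fundamental phase, and it keeps all harmonics strictly below the Nyquist frequency at each sample. The function names `harmonic_count` and `synth_harmonic` and the 1/k amplitude rule are hypothetical choices made for illustration only. The sketch covers synthesis, not the estimation algorithms described in the paper.

```python
import math


def harmonic_count(f0, fs):
    """Number of harmonics k with k * f0 strictly below the Nyquist frequency fs / 2."""
    return int(math.ceil((fs / 2.0) / f0)) - 1


def synth_harmonic(f0_track, fs, amp=lambda k: 1.0 / k):
    """Synthesize a full-band harmonic signal from a per-sample f0 track (Hz).

    Assumption: harmonic k has phase k * phi0[n], where phi0 is the running
    integral of the instantaneous fundamental frequency. The amplitude rule
    `amp` (default 1/k) is an illustrative placeholder, not an estimated value.
    """
    phase = 0.0  # fundamental phase phi0[n] in radians
    out = []
    for f0 in f0_track:
        n_harm = harmonic_count(f0, fs)  # the harmonic count adapts to the current f0
        sample = sum(amp(k) * math.cos(k * phase) for k in range(1, n_harm + 1))
        out.append(sample)
        phase += 2.0 * math.pi * f0 / fs  # integrate the instantaneous frequency
    return out


if __name__ == "__main__":
    fs = 16000
    n = 1600  # 0.1 s of signal
    # Non-stationary fundamental: linear glide from 100 Hz to 200 Hz.
    glide = [100.0 + 100.0 * i / (n - 1) for i in range(n)]
    signal = synth_harmonic(glide, fs)
    print(len(signal), harmonic_count(glide[0], fs), harmonic_count(glide[-1], fs))
```

Because the phase is accumulated sample by sample, the harmonics stay locked to the fundamental even while f0 changes, which is the property a frequency basis adapted to the signal is meant to capture.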


Pages: 2085-2095

Page count: 11

