Long-term flexible 2D cepstral modeling of speech spectral amplitudes

Cited by: 1

Authors:

Firouzmand, Mohammad [1]

Girin, Laurent [1]

Affiliations:

[1] INPG, Grenoble Lab Images Speech Signal & Automat, Grenoble, France

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Keywords:

speech analysis; speech processing; speech coding; speech modeling; speech synthesis

DOI:

10.1109/ICASSP.2008.4518515

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper presents a method for modeling the envelope of spectral amplitude parameters of speech signals in "two dimensions" (2D). It cascades two models: the first, along the frequency axis, is the usual cepstrum technique, which models the log-scaled spectral envelope with a Discrete Cosine Model (DCM). The second, along the time axis, models the trajectory of the envelope DCM coefficients with another, similar DCM. An iterative algorithm is proposed to optimally fit this 2D model to the data according to a perceptual criterion based on frequency masking. This approach is shown to provide an efficient and flexible representation of spectral amplitude parameters in terms of coefficient rates while maintaining good signal quality, opening new perspectives in very-low-bit-rate sinusoidal speech coding.
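
The two cascaded fits described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the iterative, masking-based perceptual criterion is replaced here by plain unweighted least squares, the common factor of 2 on the higher-order cosine terms is omitted, and every name (`dcm_basis`, `fit_dcm`, `fit_2d_dcm`, `eval_2d_dcm`), the array shapes, and the mapping of frame indices onto [0, π] are assumptions made for illustration only.

```python
import numpy as np


def dcm_basis(positions, order):
    """Cosine basis matrix: column k holds cos(k * position), k = 0..order."""
    positions = np.asarray(positions, dtype=float)
    return np.cos(np.outer(positions, np.arange(order + 1)))


def fit_dcm(positions, values, order):
    """Least-squares DCM coefficients for `values` sampled at `positions`."""
    coeffs, *_ = np.linalg.lstsq(dcm_basis(positions, order), values, rcond=None)
    return coeffs


def fit_2d_dcm(omegas, log_amps, freq_order, time_order):
    """Fit the cascaded model (hypothetical helper, plain least squares).

    omegas   : (n_frames, n_harmonics) frequencies in radians, within (0, pi)
    log_amps : (n_frames, n_harmonics) log-scaled spectral amplitudes
    Returns a (time_order + 1, freq_order + 1) coefficient matrix.
    """
    n_frames = log_amps.shape[0]
    # Step 1 (frequency axis): one cepstral-style DCM fit per frame.
    ceps = np.stack(
        [fit_dcm(omegas[t], log_amps[t], freq_order) for t in range(n_frames)]
    )
    # Step 2 (time axis): model each coefficient trajectory with another DCM.
    # Assumption: frame indices are mapped linearly onto [0, pi].
    tau = np.pi * np.arange(n_frames) / max(n_frames - 1, 1)
    coeffs_2d, *_ = np.linalg.lstsq(dcm_basis(tau, time_order), ceps, rcond=None)
    return coeffs_2d


def eval_2d_dcm(coeffs_2d, omegas_t, tau):
    """Rebuild the log-amplitude envelope at normalized time `tau`."""
    time_order = coeffs_2d.shape[0] - 1
    freq_order = coeffs_2d.shape[1] - 1
    ceps_t = dcm_basis([tau], time_order) @ coeffs_2d  # shape (1, freq_order + 1)
    return dcm_basis(omegas_t, freq_order) @ ceps_t[0]
```

In this sketch a whole segment of `n_frames` frames is summarized by `(time_order + 1) * (freq_order + 1)` numbers, which is one way to read the abstract's point about flexibility in coefficient rates: the two orders can be traded off against each other and against the segment length.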

Pages: 3937-3940

Number of pages: 4

