Long-term flexible 2D cepstral modeling of speech spectral amplitudes

Cited by: 1
Authors
Firouzmand, Mohammad [1]
Girin, Laurent [1]
Affiliations
[1] INPG, Grenoble Lab Images Speech Signal & Automat, Grenoble, France
Keywords
speech analysis; speech processing; speech coding; speech modeling; speech synthesis
DOI
10.1109/ICASSP.2008.4518515
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper presents a method for modeling the envelope of the spectral amplitude parameters of speech signals in "two dimensions" (2D). The method cascades two models. The first, along the frequency axis, is the usual cepstrum technique: the log-scaled spectral envelope is modeled with a Discrete Cosine Model (DCM). The second, along the time axis, models the trajectory of the envelope DCM coefficients with a second, similar DCM. An iterative algorithm is proposed to optimally fit this 2D model to the data according to a perceptual criterion based on frequency masking. The approach is shown to provide an efficient and flexible representation of spectral amplitude parameters in terms of coefficient rates while preserving good signal quality, opening new perspectives for very-low-bit-rate sinusoidal speech coding.
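The two cascaded stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it replaces the iterative, masking-based perceptual fit with ordinary least squares, and it assumes that frequency and time are both normalized to [0, 1]. The function names (`dcm_basis`, `fit_dcm`, `fit_2d`, `synth_2d`) and the array layout are hypothetical, chosen only for illustration.

```python
import numpy as np


def dcm_basis(x, order):
    """Cosine basis cos(pi * k * x) for k = 0..order; shape (len(x), order + 1).

    Assumption: positions x are normalized to [0, 1].
    """
    k = np.arange(order + 1)
    return np.cos(np.pi * np.outer(x, k))


def fit_dcm(x, y, order):
    """Least-squares DCM coefficients for samples y taken at positions x.

    Assumption: plain least squares stands in for the perceptual criterion.
    """
    coeffs, *_ = np.linalg.lstsq(dcm_basis(x, order), y, rcond=None)
    return coeffs


def fit_2d(freqs, log_amps, freq_order, time_order):
    """Fit the cascaded 2D model.

    freqs, log_amps: arrays of shape (T, H) holding, for each of T frames,
    the normalized frequencies and log-amplitudes of H spectral components.
    Returns a (time_order + 1, freq_order + 1) coefficient matrix.
    """
    n_frames = log_amps.shape[0]
    # Stage 1 (frequency axis): one cepstrum-like DCM fit per frame.
    ceps = np.stack(
        [fit_dcm(freqs[t], log_amps[t], freq_order) for t in range(n_frames)]
    )
    # Stage 2 (time axis): model each coefficient trajectory with a DCM.
    t_norm = np.linspace(0.0, 1.0, n_frames)
    coeffs2d, *_ = np.linalg.lstsq(dcm_basis(t_norm, time_order), ceps, rcond=None)
    return coeffs2d


def synth_2d(coeffs2d, freqs):
    """Rebuild the (T, H) log-amplitudes from the 2D coefficient matrix."""
    n_frames = freqs.shape[0]
    time_order = coeffs2d.shape[0] - 1
    freq_order = coeffs2d.shape[1] - 1
    t_norm = np.linspace(0.0, 1.0, n_frames)
    # Per-frame frequency-axis coefficients recovered from the time-axis model.
    ceps = dcm_basis(t_norm, time_order) @ coeffs2d
    return np.stack(
        [dcm_basis(freqs[t], freq_order) @ ceps[t] for t in range(n_frames)]
    )
```

In this sketch, a segment of T frames is summarized by (time_order + 1) × (freq_order + 1) coefficients regardless of T, which is one way to read the abstract's remark about flexibility in terms of coefficient rates.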
Pages: 3937-3940
Page count: 4