Anthropomorphic coding of speech and audio: A model inversion approach

Cited by: 10
Authors
Feldbauer, C [1]
Kubin, G
Kleijn, WB
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria
[2] Royal Inst Technol, KTH, Dept Signal Sensors & Syst, S-10044 Stockholm, Sweden
Keywords
speech and audio coding; auditory representation; auditory model inversion; auditory synthesis; perceptual domain coding; multiple description coding
DOI
10.1155/ASP.2005.1334
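Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification code
0808; 0809
Abstract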
Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.
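The encode/decode chain described in the abstract (transform to a perceptual domain, quantize there, then invert) can be illustrated with a minimal sketch. This is not the authors' auditory model: the sign-preserving power-law compression, the exponent 0.3, the step size, and all function names below are assumptions chosen only to show why an invertible transform lets a simple uniform quantizer bound the error in the perceptual domain.

```python
import math

EXPONENT = 0.3  # assumed compression exponent, for illustration only


def to_perceptual(x):
    """Toy invertible stand-in for an auditory model: power-law compression."""
    return [math.copysign(abs(v) ** EXPONENT, v) for v in x]


def from_perceptual(y):
    """Exact inverse of to_perceptual (up to floating-point rounding)."""
    return [math.copysign(abs(v) ** (1.0 / EXPONENT), v) for v in y]


def quantize(y, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in y]


def dequantize(indices, step):
    """Map integer indices back to reconstruction levels."""
    return [i * step for i in indices]


def encode(x, step):
    """Acoustic domain -> perceptual domain -> quantizer indices."""
    return quantize(to_perceptual(x), step)


def decode(indices, step):
    """Quantizer indices -> perceptual domain -> acoustic domain."""
    return from_perceptual(dequantize(indices, step))


# Hypothetical test signal: a 440 Hz tone sampled at 8 kHz.
signal = [0.8 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(64)]
step = 0.05

indices = encode(signal, step)
reconstructed = decode(indices, step)

# The error measured in the perceptual domain is bounded by half a step,
# regardless of the signal level in the acoustic domain.
perceptual_error = max(
    abs(a - b)
    for a, b in zip(to_perceptual(signal), to_perceptual(reconstructed))
)
```

In this sketch the distortion measure in the transformed domain is a plain maximum absolute error, which is the kind of simplification the abstract attributes to coding in the auditory representation. A realistic model would involve filterbanks and adaptation stages that this toy transform omits.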
Pages: 1334-1349
Page count: 16
Related papers
50 in total
  • [1] Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach
    Christian Feldbauer
    Gernot Kubin
    W. Bastiaan Kleijn
    EURASIP Journal on Advances in Signal Processing, 2005
  • [2] A new sinusoidal modelling approach for parametric speech and audio coding
    Vera-Candeas, P
    Ruiz-Reyes, N
    Curpián-Alonso, J
    Rosa-Zurera, M
    ISPA 2003: PROCEEDINGS OF THE 3RD INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS, PTS 1 AND 2, 2003: 134-139
  • [3] Technologies for Speech and Audio Coding
    Moriya, Takehiro
    ISCE: 2009 IEEE 13TH INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS, VOLS 1 AND 2, 2009: 20-21
  • [4] Special issue on anthropomorphic processing of audio and speech - Editorial
    Verhelst, W
    Herre, J
    Kubin, G
    Hermansky, H
    Jensen, SH
    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2005, 2005 (09): 1289-1291
  • [5] Joint speech/audio coding based scalable perceptual audio coding
    Gao, Li
    Hu, Ruimin
    Yang, Yuhong
    2014 IEEE/ACIS 13TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2014: 419-424
  • [6] Advances in Speech and Audio Processing and Coding
    Spanias, Andreas
    2015 6TH INTERNATIONAL CONFERENCE ON INFORMATION, INTELLIGENCE, SYSTEMS AND APPLICATIONS (IISA), 2015
  • [7] MPEG Unified Speech and Audio Coding
    Quackenbush, Schuyler
    IEEE MULTIMEDIA, 2013, 20 (02): 72-78
  • [8] Combined speech and audio coding by discrimination
    Tancerel, L
    Ragot, S
    Ruoppila, VT
    Lefebvre, R
    2000 IEEE WORKSHOP ON SPEECH CODING, PROCEEDINGS: MEETING THE CHALLENGES OF THE NEW MILLENNIUM, 2000: 154-156
  • [9] COMPARISON OF WINDOWING IN SPEECH AND AUDIO CODING
    Baeckstroem, Tom
    2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2013
  • [10] Speech/audio coding technologies and their applications
    Kaneko, Takao, 2000, NTT, Tokyo, Japan (49):