Speech and audio coding using temporal masking

Cited by: 0
Authors
Gunawan, TS [1]
Ambikairajah, E [1]
Senn, D [1]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Keywords
temporal masking model; simultaneous masking model; Gammatone filters; wavelet packet; PESQ; subjective listening test;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a comparison of three auditory temporal masking models for speech and audio coding applications. The first model was developed from existing forward masking psychoacoustic data, assuming a forward masking duration of approximately 200 ms; the model's dynamic parameters were derived from this data. The second, previously developed model is based on the principle of an exponential decay following higher-energy stimuli, where the masking effects have a relatively short duration. The third, existing model best matches the previously reported forward masking data using an exponential curve, but the effects of forward masking are restricted to 100-200 ms. Objective assessments employing the PESQ measure reveal that these three temporal models have potential for removing perceptually redundant information in speech and audio coding applications. Results show that incorporating temporal masking along with simultaneous masking into a speech/audio coding algorithm yields a further bit rate reduction of approximately 17% compared with simultaneous masking alone, while preserving perceptual quality.
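The exponential-decay idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the function names, the 10 ms frame length, the 50 ms time constant, the masker-to-threshold ratio, and the use of a simple maximum to combine thresholds are all assumptions chosen for illustration. Only the exponential decay shape and the roughly 200 ms limit on forward masking come from the abstract.

```python
import math


def forward_masking_threshold(energies, frame_ms=10.0, tau_ms=50.0,
                              masker_ratio=0.1, max_ms=200.0):
    """Per-frame forward masking threshold in linear power units.

    All parameter values are hypothetical. Each earlier frame acts as a
    masker whose threshold decays exponentially with the time lag, and
    maskers older than max_ms are ignored.
    """
    decay = math.exp(-frame_ms / tau_ms)   # decay factor per frame
    max_lag = int(max_ms // frame_ms)      # longest lag considered, in frames
    thresholds = []
    for n in range(len(energies)):
        best = 0.0
        # Look back over the preceding frames only (forward masking).
        for lag in range(1, min(max_lag, n) + 1):
            candidate = masker_ratio * energies[n - lag] * decay ** lag
            best = max(best, candidate)
        thresholds.append(best)
    return thresholds


def combined_threshold(simultaneous, temporal):
    """Combine two thresholds by taking the larger one per frame.

    Taking the maximum is an assumed combination rule, not one stated
    in the abstract.
    """
    return [max(s, t) for s, t in zip(simultaneous, temporal)]


if __name__ == "__main__":
    frame_energies = [100.0, 1.0, 1.0, 1.0]
    temporal = forward_masking_threshold(frame_energies)
    simultaneous = [0.5] * len(frame_energies)  # placeholder values
    print(combined_threshold(simultaneous, temporal))
```

A coder could skip or coarsely quantize components that fall below the combined threshold, which is the mechanism by which a temporal model may lower the bit rate beyond simultaneous masking alone.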
Pages: 31-42
Page count: 12