Speech and audio coding using temporal masking

Cited by: 0

Authors:

Gunawan, TS [1]

Ambikairajah, E [1]

Senn, D [1]

Affiliations:

[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia

Source:

SIGNAL PROCESSING FOR TELECOMMUNICATIONS AND MULTIMEDIA | 2005, Vol. 27

Keywords:

temporal masking model; simultaneous masking model; Gammatone filters; wavelet packet; PESQ; subjective listening test;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This paper presents a comparison of three auditory temporal masking models for speech and audio coding applications. The first model was developed from existing forward masking psychoacoustic data, assuming a forward masking duration of approximately 200 ms. The model's dynamic parameters were derived from this data. The previously developed second model was based upon the principle of an exponential decay following higher energy stimuli, where the masking effects have a relatively short duration. The existing third model best matches the previously reported forward masking data using an exponential curve, but the effects of the forward masking are restricted to 100-200 ms. Objective assessments employing the PESQ measure reveal that these three temporal models have potential for removing perceptually redundant information in speech and audio coding applications. Results show that incorporating temporal masking along with simultaneous masking into a speech/audio coding algorithm yields a further bit rate reduction of approximately 17% compared with simultaneous masking alone, while preserving perceptual quality.
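The exponential-decay principle mentioned in the abstract can be illustrated with a minimal sketch. It is not any of the three models compared in the paper: the function name, the linear-energy domain, the time constant `tau_ms`, the frame length, and the 200 ms cutoff are all assumptions chosen only to show the idea that a high-energy frame raises the audibility threshold of the frames that follow it.

```python
import math

def forward_masking_threshold(energies, frame_ms=10.0, tau_ms=50.0, max_ms=200.0):
    """Illustrative forward-masking threshold per frame (hypothetical parameters).

    energies: per-frame energies in linear units.
    Returns, for each frame, the threshold left behind by the most recent
    dominant earlier frame, decayed exponentially and cut off after max_ms.
    """
    decay = math.exp(-frame_ms / tau_ms)      # per-frame decay factor
    max_frames = int(max_ms / frame_ms)       # assumed limit of the masking effect
    thresholds = []
    masker = 0.0                              # energy of the current masker
    age = 0                                   # frames elapsed since that masker
    for e in energies:
        age += 1
        decayed = masker * decay ** age if age <= max_frames else 0.0
        thresholds.append(decayed)
        if e >= decayed:                      # audible frame becomes the new masker
            masker, age = e, 0
    return thresholds

if __name__ == "__main__":
    frames = [100.0, 1.0, 1.0, 50.0]
    thr = forward_masking_threshold(frames, frame_ms=10.0, tau_ms=10.0)
    masked = [e < t for e, t in zip(frames, thr)]
    print(thr, masked)
```

In a coder, frames (or sub-band components) whose energy falls below such a threshold would be candidates for coarser quantization, which is the kind of perceptual redundancy the abstract refers to.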

Pages: 31-42

Page count: 12
