Speech and audio coding using temporal masking

Cited by: 0

Authors:

Gunawan, TS [1]

Ambikairajah, E [1]

Senn, D [1]

Affiliation:

[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia

Source:

SIGNAL PROCESSING FOR TELECOMMUNICATIONS AND MULTIMEDIA | 2005 / Vol. 27

Keywords:

temporal masking model; simultaneous masking model; Gammatone filters; wavelet packet; PESQ; subjective listening test;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This paper presents a comparison of three auditory temporal masking models for speech and audio coding applications. The first model was developed based upon the existing forward masking psychoacoustic data with an assumption of approximately 200 ms. The model's dynamic parameters were derived from this data. The previously developed second model was based upon the principle of an exponential decay following higher energy stimuli, where the masking effects have a relatively short duration. The existing third model best matches the previously reported forward masking data using an exponential curve, but the effects of the forward masking are restricted to 100-200 ms. Objective assessments employing the PESQ measure reveal that these three temporal models have potential for removing perceptually redundant information in speech and audio coding applications. Results show that the incorporation of temporal masking along with simultaneous masking into a speech/audio coding algorithm results in a further bit rate reduction of approximately 17% compared with simultaneous masking alone, while preserving perceptual quality.
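The exponential-decay principle mentioned in the abstract can be illustrated with a minimal sketch. This is not any of the three models compared in the paper: the function name, the 10 ms frame length, the 50 ms time constant, and the 0.1 masker ratio are all hypothetical values chosen only for illustration.

```python
import math

def forward_masking_thresholds(energies, frame_ms=10.0, tau_ms=50.0, masker_ratio=0.1):
    """Illustrative forward masking threshold per frame (linear energy units).

    Assumptions (hypothetical, not taken from the paper): each frame raises a
    masking level equal to masker_ratio times its energy, and that level
    decays exponentially with time constant tau_ms in the following frames.
    """
    decay = math.exp(-frame_ms / tau_ms)  # per-frame decay factor
    thresholds = []
    state = 0.0  # masking level carried over from earlier frames
    for energy in energies:
        state *= decay                    # masking from the past fades over time
        thresholds.append(state)          # threshold depends on earlier frames only
        state = max(state, masker_ratio * energy)  # a louder frame becomes the new masker
    return thresholds

# One loud frame followed by three quiet frames.
energies = [100.0, 1.0, 1.0, 1.0]
thresholds = forward_masking_thresholds(energies)
# Frames whose energy falls below the decaying threshold are candidates for coarser coding.
masked = [e < t for e, t in zip(energies, thresholds)]
```

In this toy run the three quiet frames fall below the threshold left by the loud frame, which is the kind of perceptually redundant content a coder could represent with fewer bits.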


Pages: 31-42

Page count: 12

