A perceptual model for sinusoidal audio coding based on spectral integration

Cited by: 49

Authors:

van de Par, S [1]

Kohlrausch, A

Heusdens, R

Jensen, J

Jensen, SH

Affiliations:

[1] Philips Res Labs, Digital Signal Proc Grp, NL-5656 AA Eindhoven, Netherlands

[2] Eindhoven Univ Technol, Dept Technol Management, NL-5600 MB Eindhoven, Netherlands

[3] Delft Univ Technol, Dept Mediamat, NL-2600 GA Delft, Netherlands

[4] Aalborg Univ, Inst Electron Syst, Dept Commun Technol, DK-9220 Aalborg, Denmark

Source:

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING | Year 2005 / Volume 2005 / Issue 09

Keywords:

audio coding; psychoacoustical modelling; auditory masking; spectral masking; sinusoidal modelling; psychoacoustical matching pursuit

DOI:

10.1155/ASP.2005.1292

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.


Pages: 1292-1304

Page count: 13

50 entries in total

[21] Fine grain scalable perceptual and lossless audio coding based on IntMDCT
Geiger, R
Herre, J
Schuller, G
Sporer, T
2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO AND ELECTROACOUSTICS MULTIMEDIA SIGNAL PROCESSING, 2003, : 445 - 448
[22] Fine grain scalable perceptual and lossless audio coding based on IntMDCT
Geiger, R
Schuller, G
Sporer, T
Herre, J
2003 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS PROCEEDINGS, 2003, : 50 - 50
[23] Neural-Based Approach to Perceptual Sparse Coding of Audio Signals
Pichevar, Ramin
Najaf-Zadeh, Hossein
Mustiere, Frederic
2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
[24] Audio Classification Based on Sinusoidal Model: A New Feature
Shirazi, Jalil
Ghaemmaghami, Shahrokh
2008 IEEE REGION 10 CONFERENCE: TENCON 2008, VOLS 1-4, 2008, : 1978 - +
[25] Wideband speech and audio coding in the perceptual domain
Lin, L
Ambikairajah, E
Holmes, WH
ADVANCED SIGNAL PROCESSING FOR COMMUNICATION SYSTEMS, 2002, 703 : 15 - 30
[26] Transparent Bitrate Estimation for Perceptual Audio Coding
Li, Te
Rahardja, Susanto
ICIEA: 2009 4TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOLS 1-6, 2009, : 2114 - 2118
[27] Bark scale-based perceptual matching pursuit for improving sinusoidal audio modeling
Vera-Candeas, P.
Ruiz-Reyes, N.
Lopez-Ferreras, F.
DIGITAL SIGNAL PROCESSING, 2009, 19 (02) : 229 - 240
[28] Sinusoidal analysis-synthesis of audio using perceptual criteria
Painter, T
Spanias, A
2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, PROCEEDINGS, 2002, : 177 - 180
[29] Perceptual segmentation and component selection in compact sinusoidal representations of audio
Painter, T
Spanias, A
2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 3289 - 3292
[30] Amplitude modulated sinusoidal signal decomposition for audio coding
Christensen, Mads Graesboll
Jakobsson, Andreas
Andersen, Soren Vang
Jensen, Soren Holdt
IEEE SIGNAL PROCESSING LETTERS, 2006, 13 (07) : 389 - 392
