Informed source separation through spectrogram coding and data embedding

Cited by: 41

Authors:

Liutkus, Antoine [1]

Pinel, Jonathan [2]

Badeau, Roland [1]

Girin, Laurent [2]

Richard, Gael [1]

Affiliations:

[1] Telecom ParisTech, CNRS LTCI, Inst Telecom, F-75014 Paris, France

[2] Grenoble Inst Technol, F-38402 Grenoble, France

Source:

SIGNAL PROCESSING | 2012 / Vol. 92 / Issue 08

Keywords:

Audio source separation; Wiener filtering; Data embedding; NTF; NONNEGATIVE MATRIX FACTORIZATION; WATERMARKING-BASED METHOD; AUDIO; ALGORITHMS; MIXTURES;

DOI:

10.1016/j.sigpro.2011.09.016

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

We address the issue of underdetermined source separation in a particular informed configuration where both the sources and the mixtures are known during a so-called encoding stage. This knowledge enables the computation of side information that is small enough to be inaudibly embedded into the mixtures. At the decoding stage, the sources are no longer assumed to be known; only the mixtures and the extracted side information are processed for source separation. The proposed system models the sources as independent and locally stationary Gaussian processes (GP) and the mixing process as linear filtering. This model allows reliable estimation of the sources through generalized Wiener filtering, provided their spectrograms are known. As these spectrograms are too large to be embedded in the mixtures, we show how they can be efficiently approximated using either Nonnegative Tensor Factorization (NTF) or image compression. A high-capacity embedding method is used by the system to inaudibly embed the separation side information into the mixtures. This method applies the Quantization Index Modulation technique to the time-frequency coefficients of the mixtures and reaches embedding rates of about 250 kbps. Finally, a study of the performance of the full system is presented. © 2011 Elsevier B.V. All rights reserved.
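The embedding step mentioned in the abstract relies on Quantization Index Modulation (QIM). The sketch below illustrates only the generic scalar form of QIM on a handful of real-valued coefficients; it is not the paper's embedding scheme, which operates on time-frequency coefficients of the mixtures and is tuned for inaudibility. The function names `qim_embed` and `qim_extract`, the fixed step size, and the sample values are all assumptions made for illustration.

```python
def qim_embed(coeff, bit, step):
    """Quantize coeff onto the lattice associated with bit (0 or 1).

    Bit 0 uses multiples of step; bit 1 uses the same lattice
    shifted by step / 2. (Hypothetical helper, generic scalar QIM.)
    """
    offset = bit * step / 2.0
    return round((coeff - offset) / step) * step + offset


def qim_extract(coeff, step):
    """Recover the bit whose lattice lies closest to coeff."""
    dist0 = abs(coeff - qim_embed(coeff, 0, step))
    dist1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if dist0 <= dist1 else 1


# Illustrative values only: four coefficients, one bit embedded in each.
coeffs = [0.37, -1.24, 2.05, 0.0]
bits = [1, 0, 1, 1]
step = 0.1

marked = [qim_embed(c, b, step) for c, b in zip(coeffs, bits)]
recovered = [qim_extract(m, step) for m in marked]
```

In this scalar form, each coefficient moves by at most half the step size, and a bit survives any perturbation smaller than a quarter of the step. A larger step therefore trades more distortion for more robustness.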

Pages: 1937-1949

Number of pages: 13

50 items in total

[31] Informed Source Separation Using Iterative Reconstruction
Sturmel, Nicolas
Daudet, Laurent
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (01) : 176 - 183
[32] An Informed Source Separation System for Speech Signals
Zhang, Shuhua
Girin, Laurent
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 580 - 583
[33] Informed Source Separation Using Latent Components
Liutkus, Antoine
Badeau, Roland
Richard, Gael
LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION, 2010, 6365 : 498 - 505
[34] NMF-BASED INFORMED SOURCE SEPARATION
Rohlfing, Christian
Becker, Julian M.
Wien, Mathias
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 474 - 478
[35] Coding on demand by an informed source (ISCOD) for efficient broadcast of different supplemental data to caching clients
Birk, Yitzhak
Kol, Tomer
IEEE TRANSACTIONS ON INFORMATION THEORY, 2006, 52 (06) : 2825 - 2830
[36] Effect of channel coding in data embedding in images
Abdulaziz, N
Pang, KK
IMAGE COMPRESSION AND ENCRYPTION TECHNOLOGIES, 2001, 4551 : 27 - 31
[37] Applying informed coding and embedding to design a robust high-capacity watermark
Miller, ML
Doërr, GJ
Cox, IJ
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2004, 13 (06) : 792 - 807
[38] PHONE-INFORMED REFINEMENT OF SYNTHESIZED MEL SPECTROGRAM FOR DATA AUGMENTATION IN SPEECH RECOGNITION
Ueno, Sei
Kawahara, Tatsuya
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2022, 2022-May : 8572 - 8576
[39] LOW BITRATE INFORMED SOURCE SEPARATION OF REALISTIC MIXTURES
Liutkus, Antoine
Badeau, Roland
Richard, Gael
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 66 - 70
[40] Generalized Constraints for NMF with Application to Informed Source Separation
Rohlfing, Christian
Becker, Julian M.
2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 597 - 601