Informed source separation through spectrogram coding and data embedding

Cited by: 41

Authors:

Liutkus, Antoine [1]

Pinel, Jonathan [2]

Badeau, Roland [1]

Girin, Laurent [2]

Richard, Gael [1]

Affiliations:

[1] Telecom ParisTech, CNRS LTCI, Inst Telecom, F-75014 Paris, France

[2] Grenoble Inst Technol, F-38402 Grenoble, France

Source:

SIGNAL PROCESSING | 2012 / Vol. 92 / Issue 08

Keywords:

Audio source separation; Wiener filtering; Data embedding; NTF; NONNEGATIVE MATRIX FACTORIZATION; WATERMARKING-BASED METHOD; AUDIO; ALGORITHMS; MIXTURES;

DOI:

10.1016/j.sigpro.2011.09.016

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Subject classification codes:

0808; 0809

Abstract:

We address underdetermined source separation in a particular informed configuration where both the sources and the mixtures are known during a so-called encoding stage. This knowledge enables the computation of side information that is small enough to be inaudibly embedded into the mixtures. At the decoding stage, the sources are no longer assumed to be known; only the mixtures and the extracted side information are processed for source separation. The proposed system models the sources as independent and locally stationary Gaussian processes (GP) and the mixing process as linear filtering. This model allows reliable estimation of the sources through generalized Wiener filtering, provided their spectrograms are known. As these spectrograms are too large to be embedded in the mixtures, we show how they can be efficiently approximated using either Nonnegative Tensor Factorization (NTF) or image compression. The system uses a high-capacity embedding method to inaudibly embed the separation side information into the mixtures. This method applies the Quantization Index Modulation technique to the time-frequency coefficients of the mixtures and reaches embedding rates of about 250 kbps. Finally, a study of the performance of the full system is presented. (c) 2011 Elsevier B.V. All rights reserved.
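The two building blocks named in the abstract, Wiener filtering from known source spectrograms and Quantization Index Modulation (QIM), can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single-channel instantaneous mixture (the paper considers the more general case of linear filtering) and a plain binary scalar QIM with one bit per coefficient (the paper's high-capacity scheme is not reproduced here). All function names and parameters are hypothetical.

```python
from typing import List, Sequence


def wiener_gains(variances: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Wiener gains v_j / sum_k v_k for a single time-frequency bin.

    `variances` holds the (approximated) spectrogram value of each source
    at that bin. `eps` is an assumed guard against division by zero.
    """
    total = sum(variances) + eps
    return [v / total for v in variances]


def wiener_separate(mix_bin: complex, variances: Sequence[float]) -> List[complex]:
    """Estimate each source's coefficient from one mixture coefficient.

    Assumes the mixture is the plain sum of the sources (a simplification).
    """
    return [g * mix_bin for g in wiener_gains(variances)]


def qim_embed(coeff: float, bit: int, step: float) -> float:
    """Embed one bit by quantizing `coeff` onto one of two shifted lattices.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted
    by step / 2. The distortion is at most step / 2.
    """
    offset = bit * step / 2.0
    return step * round((coeff - offset) / step) + offset


def qim_extract(coeff: float, step: float) -> int:
    """Recover the bit by finding which lattice lies closer to `coeff`."""
    d0 = abs(coeff - qim_embed(coeff, 0, step))
    d1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if d0 <= d1 else 1


if __name__ == "__main__":
    # Two sources with assumed variances 3 and 1 at one bin of the mixture.
    estimates = wiener_separate(4 + 0j, [3.0, 1.0])
    print("source estimates:", estimates)

    # Hide a bit in a coefficient, then read it back.
    marked = qim_embed(0.37, 1, step=0.1)
    print("marked coefficient:", marked, "extracted bit:", qim_extract(marked, 0.1))
```

In this sketch a larger `step` makes the embedded bit more robust to perturbations of the coefficient (it survives any change smaller than `step / 4`) at the cost of more distortion, which is the basic trade-off behind inaudible embedding.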


Pages: 1937-1949

Number of pages: 13
