SHIFTED AND CONVOLUTIVE SOURCE-FILTER NON-NEGATIVE MATRIX FACTORIZATION FOR MONAURAL AUDIO SOURCE SEPARATION

Cited by: 0
Authors
Nakamura, Tomohiko [1]
Kameoka, Hirokazu [1, 2]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Bunkyo Ku, 7-3-1 Hongo, Tokyo 1138656, Japan
[2] NTT Corp, NTT Commun Sci Labs, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa 2430198, Japan
Keywords
Audio source separation; Shifted non-negative matrix factorization; Shift-invariant probabilistic latent component analysis; Source-filter theory; SPARSE;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper proposes an extension of non-negative matrix factorization (NMF) that combines the shifted NMF model with the source-filter model. Shifted NMF was proposed as a powerful approach for monaural source separation and multiple fundamental frequency (F0) estimation. It is distinctive in that it exploits the constant inter-harmonic spacing of a harmonic structure in log-frequency representations and uses shifted copies of a spectrum template to represent the spectra of different F0s. However, for sounds that follow the source-filter model, this assumption does not hold in practice, since the filter spectra are usually invariant under F0 changes. A more reasonable way to represent the spectrum of a different F0 is to use a shifted copy of a harmonic structure template as the excitation spectrum and keep the filter spectrum fixed. Thus, we can describe the spectrogram of a mixture signal as the sum of the products of the shifted copies of excitation spectrum templates and filter spectrum templates. Furthermore, the time course of the filter spectra represents the dynamics of the timbre, which is important for characterizing an instrument sound. We therefore further incorporate the non-negative matrix factor deconvolution (NMFD) model into the above model to describe the filter spectrogram. We derive a computationally efficient and convergence-guaranteed algorithm, based on the auxiliary function approach, for estimating the unknown parameters of the constructed model. Experimental results showed that the proposed method outperformed shifted NMF in terms of source separation accuracy.
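The core idea of the abstract, a shifted excitation template multiplied by a filter spectrum that stays fixed, can be illustrated with a minimal NumPy sketch. This is not the authors' formulation or algorithm: the function names, array shapes, and the zero-filled integer bin shift are all assumptions made for illustration, and the NMFD (time-varying filter) part and the parameter estimation are omitted.

```python
import numpy as np


def shift_up(template, p):
    """Shift a log-frequency template upward by p bins, zero-filling the bottom."""
    n_bins = template.shape[0]
    out = np.zeros_like(template)
    out[p:] = template[: n_bins - p]
    return out


def shifted_source_filter_model(excitation, filt, activation):
    """Build a model spectrogram (hypothetical simplification of the abstract).

    excitation: (K, F) non-negative excitation templates, one per source
    filt:       (K, F) non-negative filter spectra, fixed under F0 shifts
    activation: (K, P, T) non-negative gains for shift p at frame t
    returns:    (F, T) model spectrogram
    """
    n_src, n_bins = excitation.shape
    _, n_shifts, n_frames = activation.shape
    model = np.zeros((n_bins, n_frames))
    for k in range(n_src):
        for p in range(n_shifts):
            # Only the excitation moves with F0; the filter is applied unshifted.
            spectrum = filt[k] * shift_up(excitation[k], p)
            model += np.outer(spectrum, activation[k, p])
    return model


# Toy example: one source, 8 log-frequency bins, 3 possible shifts, 4 frames.
excitation = np.array([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]])
filt = np.linspace(1.0, 0.3, 8)[None, :]
activation = np.zeros((1, 3, 4))
activation[0, 0, 0] = 1.0  # frame 0: template at its original position
activation[0, 2, 1] = 1.0  # frame 1: template shifted up by 2 bins
X_model = shifted_source_filter_model(excitation, filt, activation)
```

Setting `filt` to all ones removes the filter's effect, so each frame is simply a weighted sum of shifted copies of the excitation template, which matches the shifted NMF description in the abstract.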
Pages: 489-493
Number of pages: 5