SPARSENESS-BASED MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION

Cited by: 0
Authors
Higuchi, Takuya [1]
Yoshioka, Takuya [1]
Nakatani, Tomohiro [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan
Keywords
audio source separation; sparseness; nonnegative matrix factorization; mixtures
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
This paper addresses audio source separation from multichannel observations. Exploiting the sparseness of sound signals in the time-frequency domain is a successful approach to source separation, because it enables separation based on spatial features obtained from a microphone array. Nonnegative matrix factorization (NMF) is another promising approach to audio source separation, one that performs separation based on spectral features. This paper incorporates the idea of NMF into sparseness-based source separation and proposes a novel approach to multichannel source separation based on both spatial and spectral features. Experimental results show that the proposed method improves the signal-to-distortion ratio (SDR) by 0.26 dB and the signal-to-interference ratio (SIR) by 1.96 dB compared with a conventional sparseness-based approach. In addition, thanks to the sparseness assumption, the proposed model eliminates the need for a number of matrix inversions and therefore requires a much lower computational cost than a previously proposed multichannel NMF approach, which also uses spectral and spatial features.
Pages: 5
Related papers
50 in total
  • [41] Supervised and Constrained Nonnegative Matrix Factorization with Sparseness for Image Representation
    Cai, Xibiao
    Sun, Fuming
    WIRELESS PERSONAL COMMUNICATIONS, 2018, 102 (04) : 3055 - 3066
  • [42] Blind image separation using Nonnegative Matrix Factorization with Gibbs smoothing
    Zdunek, Rafal
    Cichocki, Andrzej
    NEURAL INFORMATION PROCESSING, PART II, 2008, 4985 : 519 - +
  • [43] Supervised Audio Source Separation Based on Nonnegative Matrix Factorization with Cosine Similarity Penalty
    Iwase, Yuta
    Kitamura, Daichi
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2022, E105A (06) : 906 - 913
  • [44] Illumination Estimation Based on Estimation of Dominant Chromaticity in Nonnegative Matrix Factorization with Sparseness Constraint
    Lee, Ji-Heon
    Yoo, Ji-Hoon
    Sung, Jung-Min
    Ha, Yeong-Ho
    COLOR IMAGING XX: DISPLAYING, PROCESSING, HARDCOPY, AND APPLICATIONS, 2015, 9395
  • [45] Initialization of Nonnegative Matrix Factorization Dictionaries for Single Channel Source Separation
    Grais, Emad M.
    Erdogan, Hakan
    2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,
  • [46] Online Blind Separation of Dependent Sources Using Nonnegative Matrix Factorization Based on KL Divergence
    Li, Hui
    Shen, Yue-hong
    Wang, Jian-gong
    PRZEGLAD ELEKTROTECHNICZNY, 2012, 88 (1B) : 278 - 281
  • [47] NONNEGATIVE MATRIX PARTIAL CO-FACTORIZATION FOR DRUM SOURCE SEPARATION
    Yoo, Jiho
    Kim, Minje
    Kang, Kyeongok
    Choi, Seungjin
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 1942 - 1945
  • [48] Nonnegative Matrix Factorization Using Projected Gradient Algorithms with Sparseness Constraints
    Mohammadiha, Nasser
    Leijon, Arne
    2009 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2009), 2009, : 418 - 423
  • [49] Distant Sound Source Suppression Based on Multichannel Nonnegative Matrix Factorization with Bases Distance Maximization Penalty
    Takiguchi, Kazuma
    Kawamura, Arata
    Iiguni, Youji
    2019 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ISPACS), 2019,
  • [50] Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration
    Kitamura, Daichi
    Saruwatari, Hiroshi
    Kameoka, Hirokazu
    Takahashi, Yu
    Kondo, Kazunobu
    Nakamura, Satoshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (04) : 654 - 669