Multichannel Singing Voice Separation by Deep Neural Network Informed DOA Constrained CMNMF

Cited by: 0
Authors
Munoz-Montoro, Antonio J. [1]
Politis, Archontis [2]
Drossos, Konstantinos [2]
Carabias-Orti, Julio J. [1]
Affiliations
[1] Univ Jaen, Telecommun Engn Dept, Jaen, Spain
[2] Tampere Univ, Audio Res Grp, Tampere, Finland
Funding
European Research Council
Keywords
Multichannel Source Separation; Singing Voice; Deep Learning; CMNMF; Spatial Audio; Spatial Covariance Model; Audio Source Separation; Nonnegative Matrix
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
This work addresses the problem of multichannel source separation by combining two powerful approaches: multichannel spectral factorization and recent monophonic deep learning (DL) based spectrum inference. Individual source spectra at different channels are estimated with a Masker-Denoiser twin network, which is able to model long-term temporal patterns of a musical piece. The monophonic source spectrograms are then used within a spatial covariance mixing model based on complex-valued multichannel non-negative matrix factorization (CMNMF), which predicts the spatial characteristics of each source. The proposed framework is evaluated on the task of singing voice separation with a large multichannel dataset. Experimental results show that our joint DL+CMNMF method outperforms both the individual monophonic DL-based separation and the multichannel CMNMF baseline methods.
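As a rough illustration of how per-source spectrograms and spatial covariance matrices can be combined in a spatial covariance mixing model, the sketch below applies a generic multichannel Wiener filter in NumPy. It is not taken from the paper: the function name `multichannel_wiener`, the array layouts, the regularizer `eps`, and the random test data are all assumptions made for illustration. In the paper's setting the power spectrograms would come from the DL network and the spatial covariances from CMNMF.

```python
import numpy as np

def multichannel_wiener(x, v, R, eps=1e-10):
    """Generic multichannel Wiener filter (illustrative, not the paper's code).

    x: (F, T, M) complex mixture STFT, M channels
    v: (S, F, T) nonnegative power spectrograms, one per source
    R: (S, F, M, M) Hermitian PSD spatial covariance matrices
    Returns (S, F, T, M) estimated multichannel source images.
    """
    M = x.shape[-1]
    # Per-source covariance at each time-frequency bin: v[s,f,t] * R[s,f]
    cov_s = v[..., None, None] * R[:, :, None, :, :]      # (S, F, T, M, M)
    # Mixture covariance is the sum over sources; eps keeps it invertible
    cov_x = cov_s.sum(axis=0) + eps * np.eye(M)           # (F, T, M, M)
    # Solve cov_x z = x for every bin instead of forming an explicit inverse
    z = np.linalg.solve(cov_x, x[..., None])              # (F, T, M, 1)
    # Source image estimate: cov_s @ cov_x^{-1} @ x
    return (cov_s @ z[None])[..., 0]                      # (S, F, T, M)

# Small synthetic example (random data, assumed shapes)
rng = np.random.default_rng(0)
S, F, T, M = 2, 4, 5, 2
x = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
v = rng.random((S, F, T)) + 0.1
A = rng.standard_normal((S, F, M, M)) + 1j * rng.standard_normal((S, F, M, M))
R = A @ np.conj(np.swapaxes(A, -1, -2)) + np.eye(M)      # Hermitian, positive definite
est = multichannel_wiener(x, v, R)
```

Because the per-source covariances sum to the mixture covariance, the estimated source images add back up to the mixture (up to the small `eps` regularization), which gives a quick sanity check for a filter of this kind.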
Pages: 6
Related papers
50 in total
  • [11] Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System
    Hono, Yukiya
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2803 - 2815
  • [12] High-Resolution Representation Learning and Recurrent Neural Network for Singing Voice Separation
    Bhuwan Bhattarai
    Yagya Raj Pandeya
    You Jie
    Arjun Kumar Lamichhane
    Joonwhoan Lee
    Circuits, Systems, and Signal Processing, 2023, 42 : 1083 - 1104
  • [13] Monophonic Singing Voice Separation Based on Deep Learning
    Wang, Yutian
    Zhang, Zhao
    Wang, Zheng
    Cai, JuanJuan
    Wang, Hui
    2019 2ND IEEE CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2019), 2019, : 491 - 495
  • [14] High-Resolution Representation Learning and Recurrent Neural Network for Singing Voice Separation
    Bhattarai, Bhuwan
    Pandeya, Yagya Raj
    Jie, You
    Lamichhane, Arjun Kumar
    Lee, Joonwhoan
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (02) : 1083 - 1104
  • [15] 3 directional Inception-ResUNet: Deep spatial feature learning for multichannel singing voice separation with distortion
    Wang, Dadong
    Wang, Jie
    Sun, Mingchen
    PLOS ONE, 2024, 19 (01):
  • [16] Enhanced feature network for monaural singing voice separation
    Yuan, Weitao
    He, Boxin
    Wang, Shengbei
    Wang, Jianming
    Unoki, Masashi
    SPEECH COMMUNICATION, 2019, 106 : 1 - 6
  • [17] A separation method of singing and accompaniment combining discriminative training deep neural network
    ZHANG Tianqi
    XIONG Mei
    ZHANG Ting
    YANG Qiang
    Chinese Journal of Acoustics, 2019, 38 (02) : 227 - 239
  • [18] A separation method of singing and accompaniment combining discriminative training deep neural network
    Zhang, Tianqi
    Xiong, Mei
    Zhang, Ting
    Yang, Qiang
    Shengxue Xuebao/Acta Acustica, 2019, 44 (03) : 393 - 400
  • [19] SINGING VOICE DETECTION WITH DEEP RECURRENT NEURAL NETWORKS
    Leglaive, Simon
    Hennequin, Romain
    Badeau, Roland
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 121 - 125
  • [20] Singing voice synthesis based on deep neural networks
    Nishimura, Masanari
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2478 - 2482