Gain Adapted Optimum Mixture Estimation Scheme for Single Channel Speech Separation

Cited by: 2
Authors
Kapoor, Divneet Singh [1]
Kohli, Amit Kumar [2]
Affiliations
[1] Chandigarh Grp Coll, Dept Elect & Commun Engn, Gharuan, Mohali, India
[2] Thapar Univ, Dept Elect & Commun Engn, Patiala 147004, Punjab, India
Keywords
Single channel speech separation (SCSS); Optimum mixture estimator; Mixture-maximization (MixMax); Quadratic estimator; Gain adaptation; BLIND SOURCE SEPARATION; SEGREGATION; RECOGNITION; DRIVEN; SOUND;
DOI
10.1007/s00034-013-9566-7
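Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code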
0808 ; 0809 ;
Abstract
This paper presents the proof of an Optimum mixture estimator for the single channel speech separation problem, that is, the separation of two speech signals from a single recording of their mixture. The work addresses a fundamental limitation of current single channel speech separation techniques, which assume that the data used in the training and test phases of the separation model have the same energy levels. To overcome this limitation, a gain adapted Optimum mixture estimator is derived, which estimates the mixture of speech signals under different signal-to-signal ratios (SSRs). Specifically, the speakers' gains are incorporated as unknown parameters into the separation model, and the estimator is then derived in terms of the source distributions and the SSR. It is demonstrated that the Optimum mixture estimator yields a lower estimation error than the Mixture-Maximization (MixMax) or Quadratic estimators, which rely on non-linear mapping (log and inverse-log operations). Experimental results on real speech data also show that the proposed estimator significantly improves mixture estimation performance compared with the MixMax or Quadratic estimators with gain adaptation.
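To make the terms in the abstract concrete, the following is a minimal sketch of how speaker gains could be tied to an SSR and how a MixMax-style log-max approximation of the mixture could be computed. It is not the estimator derived in the paper. The function names `gains_from_ssr` and `mixmax_log_mixture` are hypothetical, and the normalization g1² + g2² = 1 with SSR = 10·log10(g1²/g2²) is an assumption made for illustration only.

```python
import math


def gains_from_ssr(ssr_db):
    """Return (g1, g2) for a given SSR in dB.

    Assumption (not taken from the record): g1**2 + g2**2 == 1 and
    ssr_db == 10 * log10(g1**2 / g2**2).
    """
    ratio = 10.0 ** (ssr_db / 10.0)  # g1^2 / g2^2
    g1 = math.sqrt(ratio / (1.0 + ratio))
    g2 = math.sqrt(1.0 / (1.0 + ratio))
    return g1, g2


def mixmax_log_mixture(log_x1, log_x2, g1, g2):
    """MixMax-style approximation of the mixture log magnitude.

    For each frequency bin, take the larger of the two gain-scaled
    source log magnitudes: max(log g1 + log|X1|, log g2 + log|X2|).
    """
    lg1, lg2 = math.log(g1), math.log(g2)
    return [max(lg1 + a, lg2 + b) for a, b in zip(log_x1, log_x2)]


if __name__ == "__main__":
    g1, g2 = gains_from_ssr(6.0)  # speaker 1 is 6 dB stronger
    print(mixmax_log_mixture([0.0, 2.0], [1.0, 0.0], g1, g2))
```

The log-max step is the non-linear mapping the abstract refers to; the paper argues that its Optimum mixture estimator achieves a lower estimation error than this kind of approximation.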
Pages: 2335-2351
Number of pages: 17
Related papers
50 in total
  • [1] Gain Adapted Optimum Mixture Estimation Scheme for Single Channel Speech Separation
    Divneet Singh Kapoor
    Amit Kumar Kohli
    Circuits, Systems, and Signal Processing, 2013, 32 : 2335 - 2351
  • [2] Optimum Mixture Estimator for single-channel Speech Separation
    Mowlaee, Pejman
    Sayadiyan, Abolghassem
    Sheikhan, Mansour
    2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008, : 543 - +
  • [3] GAIN ESTIMATION IN MODEL-BASED SINGLE CHANNEL SPEECH SEPARATION
    Radfar, M. H.
    Wong, W.
    Chan, W-Y.
    Dansereau, R. M.
    2009 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2009, : 423 - +
  • [4] Long-term gain estimation in model-based single channel speech separation
    Radfar, M. H.
    Dansereau, R. M.
    2007 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, 2007, : 185 - 188
  • [5] Monaural Speech Separation Based on Gain Adapted Minimum Mean Square Error Estimation
    M. H. Radfar
    R. M. Dansereau
    W.-Y. Chan
    Journal of Signal Processing Systems, 2010, 61 : 21 - 37
  • [6] Monaural Speech Separation Based on Gain Adapted Minimum Mean Square Error Estimation
    Radfar, M. H.
    Dansereau, R. M.
    Chan, W-Y.
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2010, 61 (01): : 21 - 37
  • [7] Single Channel Speech Separation Using Maximum a Posteriori Estimation
    Radfar, M. H.
    Dansereau, R. M.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 841 - 844
  • [8] Phase estimation for signal reconstruction in single-channel speech separation
    Mowlaee, Pejman
    Saeidi, Rahim
    Martin, Rainer
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1546 - 1549
  • [9] Single-Channel Speech-Music Separation for Robust ASR With Mixture Models
    Demir, Cemil
    Saraclar, Murat
    Cemgil, Ali Taylan
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (04): : 725 - 736
  • [10] SINUSOIDAL MASKS FOR SINGLE CHANNEL SPEECH SEPARATION
    Mowlaee, Pejman
    Christensen, Mads Graesboll
    Jensen, Soren Holdt
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4262 - 4265