Cochannel Speech Separation Using Multi-pitch Estimation and Model Based Voiced Sequential Grouping

Cited by: 0
Authors
Li, Ming [1]
Cao, Chuan [1]
Wang, Di [1]
Lu, Ping [1]
Fu, Qiang [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, ThinkIT Speech Lab, Inst Acoust, Beijing 100190, Peoples R China
Keywords
Auditory scene analysis; cochannel speech; multi-pitch estimation; sequential grouping
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new cochannel speech separation algorithm that uses multi-pitch extraction and speaker-model-based sequential grouping. After auditory segmentation based on onset and offset analysis, a robust multi-pitch estimation algorithm is applied to each segment and the corresponding voiced portions are segregated. A speaker pair model based on a support vector machine (SVM) is then employed to determine the optimal sequential grouping alignments and to group the speaker-homogeneous segments into pure speaker streams. Systematic evaluation on the speech separation challenge database shows significant improvement over the baseline performance.
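The sequential grouping step described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the paper uses an SVM speaker pair model to score alignments, whereas the `toy_score` function below is a hypothetical stand-in that simply prefers streams whose segments have similar mean pitch. The function names, the per-segment mean-pitch representation, and the exhaustive search are all assumptions made for illustration.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def toy_score(stream0: Sequence[float], stream1: Sequence[float]) -> float:
    """Hypothetical stand-in for a speaker pair model score.

    Returns the negative within-stream squared spread of mean pitch,
    so tighter (more speaker-homogeneous) streams score higher.
    """
    def spread(values: Sequence[float]) -> float:
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    return -(spread(stream0) + spread(stream1))


def best_grouping(
    segments: Sequence[float],
    score: Callable[[Sequence[float], Sequence[float]], float],
) -> Tuple[Tuple[int, ...], float]:
    """Exhaustively search binary labelings of segments for the best score."""
    best_labels: Tuple[int, ...] = ()
    best_value = float("-inf")
    n = len(segments)
    for labels in product((0, 1), repeat=n):
        # Pin the first segment to stream 0; this only skips mirrored
        # labelings and assumes the score is symmetric in its two streams.
        if n and labels[0] == 1:
            continue
        stream0: List[float] = [s for s, l in zip(segments, labels) if l == 0]
        stream1: List[float] = [s for s, l in zip(segments, labels) if l == 1]
        value = score(stream0, stream1)
        if value > best_value:
            best_labels, best_value = labels, value
    return best_labels, best_value


# Four voiced segments summarized by mean pitch in Hz (made-up values).
labels, value = best_grouping([110.0, 215.0, 118.0, 208.0], toy_score)
print(labels)  # (0, 1, 0, 1)
```

Exhaustive search grows as 2^N in the number of segments, so it is only practical for short utterances; the abstract does not say how the paper searches for the optimal alignment.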
Pages: 151-154
Page count: 4
Related papers
50 in total
  • [41] MULTI-PITCH TRACKING USING GAUSSIAN MIXTURE MODEL WITH TIME VARYING PARAMETERS AND GRATING COMPRESSION TRANSFORM
    Abhijith, M. N.
    Ghosh, Prasanta K.
    Rajgopal, K.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [42] LEARNING MULTI-PITCH ESTIMATION FROM WEAKLY ALIGNED SCORE-AUDIO PAIRS USING A MULTI-LABEL CTC LOSS
    Weiss, Christof
    Peeters, Geoffroy
    2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2021, : 121 - 125
  • [43] Speech Separation and Recognition Using CASA Segmentation and Language-Based Grouping
    Karpukhin, Ivan
    Konushin, Anton
    ADVANCED SCIENCE LETTERS, 2018, 24 (10) : 7650 - 7654
  • [44] A Sequential Processing Model for Speech Separation Based on Auditory Scene Analysis
    Nakanishi, Isao
    Hanada, Junichi
    2015 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ISPACS), 2015, : 124 - 128
  • [45] Speech Separation Using a Composite Model for Complex Mask Estimation
    Hasannezhad, Mojtaba
    Ouyang, Zhiheng
    Zhu, Wei-Ping
    Champagne, Benoit
    2020 IEEE 63RD INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2020, : 578 - 581
  • [46] Investigation of the spectral envelope estimation vocoder and improved pitch estimation based on the sinusoidal speech model
    Zhang, WH
    Kim, HS
    Holmes, WH
    ICICS - PROCEEDINGS OF 1997 INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS AND SIGNAL PROCESSING, VOLS 1-3: THEME: TRENDS IN INFORMATION SYSTEMS ENGINEERING AND WIRELESS MULTIMEDIA COMMUNICATIONS, 1997, : 513 - 516
  • [47] Pitch Estimation Based on the Cepstrum Analysis by the Multi Scale Product of Clean and Noisy Speech
    Jlassi, Wided
    Bouzid, Aicha
    Ellouze, Noureddine
    RECENT ADVANCES IN NONLINEAR SPEECH PROCESSING, 2016, 48 : 219 - 225
  • [48] Audio stream segregation of multi-pitch music signal based on time-space clustering using Gaussian kernel 2-dimensional model
    Kameoka, H
    Nishimoto, T
    Sagayama, S
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 5 - 8
  • [49] GAIN ESTIMATION IN MODEL-BASED SINGLE CHANNEL SPEECH SEPARATION
    Radfar, M. H.
    Wong, W.
    Chan, W-Y.
    Dansereau, R. M.
    2009 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2009, : 423 - +
  • [50] Network Attack Traffic Detection using Seed based Sequential Grouping Model
    Park, Jee-Tae
    Lee, Sung-Ho
    Goo, Young-Hoon
    Baek, Ui-Jun
    Kim, Myung-Sup
    NOMS 2018 - 2018 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, 2018,