Cochannel Speech Separation Using Multi-pitch Estimation and Model Based Voiced Sequential Grouping

Cited by: 0

Authors:

Li, Ming [1]

Cao, Chuan [1]

Wang, Di [1]

Lu, Ping [1]

Fu, Qiang [1]

Yan, Yonghong [1]

Affiliation:

[1] Chinese Acad Sci, ThinkIT Speech Lab, Inst Acoust, Beijing 100190, Peoples R China

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

Auditory scene analysis; cochannel speech; multi-pitch estimation; sequential grouping

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, a new cochannel speech separation algorithm using multi-pitch extraction and speaker-model-based sequential grouping is proposed. After auditory segmentation based on onset and offset analysis, a robust multi-pitch estimation algorithm is applied to each segment and the corresponding voiced portions are segregated. A speaker-pair model based on a support vector machine (SVM) is then employed to determine the optimal sequential grouping alignments and to group the speaker-homogeneous segments into pure speaker streams. Systematic evaluation on the speech separation challenge database shows significant improvement over the baseline performance.


Pages: 151-154

Page count: 4
