Rank-Based Frame Classification for Usable Speech Detection in Speaker Identification Systems

Cited by: 0

Authors:

Ethridge, James [1]

Ramachandran, Ravi P. [1]

Affiliations:

[1] Rowan Univ, Dept Elect & Comp Engn, Glassboro, NJ 08028 USA

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP) | 2015

Keywords:

speaker identification; usable frames; Gaussian mixture model; Mahalanobis distance; decision tree; boosting; additive noise

DOI:

Not available

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

The performance of a speaker identification (SID) system degrades substantially when there is a mismatch between the training and testing conditions. Discriminating between temporal sections of speech signals that are speech-like (SID usable) and noise-like (SID unusable), and retaining only the frames labeled SID usable, can improve SID performance substantially. In this paper, a novel labeling system for SID usable and SID unusable frames is presented for a GMM-based SID system. This is motivated by a control experiment demonstrating that very high SID accuracies are theoretically achievable by removing frames that contribute more to the scores of competing speakers than to the score of the true speaker. To blindly identify these SID usable and unusable frames, the Mahalanobis distance and an ensemble of decision tree classifiers (with boosting) were trained on a dataset different from the enrollment database for the SID system. The classifier-based techniques yielded improvements over the base speaker identification system (all frames used) in all cases when the speech signal was corrupted with additive white or additive pink noise.
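The control experiment mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that per-frame log-likelihoods under each enrolled speaker's GMM are already available as a matrix, and it assumes that a frame counts as SID usable when the true speaker has the highest score for that frame. The function names `oracle_usable_mask` and `identify` and the toy scores are hypothetical.

```python
import numpy as np

def oracle_usable_mask(frame_ll, true_speaker):
    """Label frames as usable when the true speaker ranks first.

    frame_ll: array of shape (n_frames, n_speakers) holding per-frame
    log-likelihoods (assumed precomputed from the speaker models).
    The rank-1 labeling rule is an assumption made for illustration.
    """
    frame_ll = np.asarray(frame_ll, dtype=float)
    return np.argmax(frame_ll, axis=1) == true_speaker

def identify(frame_ll, mask=None):
    """Pick the speaker with the largest summed log-likelihood.

    If a mask is given, only frames where it is True are scored.
    When the mask keeps no frames, all frames are used as a fallback.
    """
    frame_ll = np.asarray(frame_ll, dtype=float)
    if mask is not None and np.any(mask):
        frame_ll = frame_ll[mask]
    return int(np.argmax(frame_ll.sum(axis=0)))

# Made-up scores: 4 frames, 2 speakers, true speaker is index 0.
# The last two frames strongly favor the competing speaker.
scores = np.array([[-1.0, -2.0],
                   [-1.5, -3.0],
                   [-9.0, -1.0],
                   [-8.0, -2.0]])
mask = oracle_usable_mask(scores, true_speaker=0)
all_frames_decision = identify(scores)
usable_frames_decision = identify(scores, mask)
```

With these toy scores, scoring all frames selects the competing speaker, while scoring only the frames marked usable selects the true speaker. The mask here needs the true identity, so it serves only as an upper bound; the blind labeling described in the abstract would replace it with a classifier trained on separate data.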


Pages: 292-296

Page count: 5
