Rank-Based Frame Classification for Usable Speech Detection in Speaker Identification Systems

Cited by: 0
Authors
Ethridge, James [1]
Ramachandran, Ravi P. [1]
Affiliations
[1] Rowan Univ, Dept Elect & Comp Engn, Glassboro, NJ 08028 USA
Keywords
speaker identification; usable frames; Gaussian mixture model; Mahalanobis distance; decision tree; boosting; additive noise
DOI
Not available
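Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809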
Abstract
The performance of a speaker identification (SID) system degrades substantially when there is a mismatch between the training and testing conditions. Discriminating between temporal sections of a speech signal that are speech-like (SID usable) and noise-like (SID unusable), and retaining only the frames labeled SID usable, can improve SID performance substantially. In this paper, a novel labeling system for SID usable and SID unusable frames is presented for a GMM-based SID system. This is motivated by a control experiment demonstrating that very high SID accuracies are theoretically achievable by removing frames that contribute more to the scores of competing speakers than to the score of the true speaker. To blindly identify these SID usable and unusable frames, the Mahalanobis distance and an ensemble of decision tree classifiers (with boosting) were trained on a dataset different from the enrollment database of the SID system. The classifier-based techniques yielded improvements over the base speaker identification system (all frames used) in all cases when the speech signal was corrupted with additive white or additive pink noise.
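The control experiment described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact labeling rule, so everything here is an assumption made for illustration: the function names `label_usable_frames` and `identify_speaker`, the rule that a frame is usable when the true speaker's per-frame log-likelihood is at least as high as that of every competing speaker, and the toy score matrix. In a real GMM-based system the matrix would hold per-frame log-likelihoods from each enrolled speaker's model.

```python
import numpy as np


def label_usable_frames(frame_loglik, true_speaker):
    """Oracle labeling (assumed rule, not taken from the paper).

    frame_loglik: array of shape (n_frames, n_speakers) holding the
    per-frame log-likelihood under each speaker model.
    Returns a boolean mask that is True where the true speaker scores
    at least as high as the best competing speaker.
    """
    frame_loglik = np.asarray(frame_loglik, dtype=float)
    # Best score among all speakers except the true one, per frame.
    competitors = np.delete(frame_loglik, true_speaker, axis=1)
    best_competitor = competitors.max(axis=1)
    return frame_loglik[:, true_speaker] >= best_competitor


def identify_speaker(frame_loglik, mask=None):
    """Pick the speaker with the largest summed log-likelihood,
    optionally using only the frames selected by `mask`."""
    frame_loglik = np.asarray(frame_loglik, dtype=float)
    if mask is not None:
        frame_loglik = frame_loglik[mask]
    if frame_loglik.shape[0] == 0:
        raise ValueError("no frames left to score")
    return int(np.argmax(frame_loglik.sum(axis=0)))


# Hypothetical scores: 4 frames, 3 speakers, true speaker is index 0.
scores = np.array([
    [-1.0, -2.0, -3.0],
    [-5.0, -1.0, -4.0],  # noise-like frame favoring speaker 1
    [-6.0, -1.5, -2.0],  # noise-like frame favoring speaker 1
    [-1.2, -1.9, -2.5],
])

mask = label_usable_frames(scores, true_speaker=0)
all_frames_decision = identify_speaker(scores)
usable_only_decision = identify_speaker(scores, mask)
```

In this toy case the two middle frames pull the all-frames decision toward a competing speaker, while scoring only the frames marked usable recovers the true speaker. The oracle mask needs the true identity, which is why the paper turns to blind classifiers trained on separate data.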
Pages: 292-296
Number of pages: 5