Exploring methods of improving speaker accuracy for speaker diarization

被引:0
|
作者
Knox, Mary Tai [1 ,2 ]
Mirghafori, Nikki [1 ]
Friedland, Gerald [1 ]
机构
[1] Int Comp Sci Inst, Berkeley, CA 94704 USA
[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA USA
来源
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013年
关键词
speaker diarization; cluster purification; temporal; smoothing;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The focus of this work is to improve the speaker diarization error rate, and more specifically the speaker error rate. We investigate two methods of improving the speaker error rate: modifying the minimum duration constraint and incorporating novel purification techniques. First, in the final step of the speaker diarization algorithm we replace the minimum duration constraint with a simple smoothing algorithm, which averages the log -likelihoods for each of the hypothesized speakers. This method improves the speaker error rate by 12% relative for the MDM condition. Second, we utilize the difference between the largest and second largest log -likelihoods to identify frames which are believed to be correct (or "pure"). The difference value is shown be more effective at separating correct frames from incorrect frames than the previously used maximum log-likelihood value. Using only the "pure" frames, the cluster models are retrained and segmentation is performed using the above mentioned smoothing technique. The proposed purification and smoothing reduces the speaker error rate over the baseline; however, it is worse than performing the smoothing step alone.
引用
收藏
页码:2782 / 2786
页数:5
相关论文
共 50 条
  • [1] Speaker Diarization: A Review of Objectives and Methods
    O'Shaughnessy, Douglas
    APPLIED SCIENCES-BASEL, 2025, 15 (04):
  • [2] SPEAKER DIARIZATION THROUGH SPEAKER EMBEDDINGS
    Rouvier, Mickael
    Bousquet, Pierre-Michel
    Favre, Benoit
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2082 - 2086
  • [3] Improving speaker diarization by cross EM refinement
    Ning, Huazhong
    Xu, Wei
    Gong, Yihong
    Huang, Thomas
    2006 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO - ICME 2006, VOLS 1-5, PROCEEDINGS, 2006, : 1901 - 1904
  • [4] Improving Speaker Diarization for CHIL Lecture Meetings
    Huang, Jing
    Marcheret, Etienne
    Visweswariah, Karthik
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2628 - 2631
  • [5] On the Use of Spectral and Iterative Methods for Speaker Diarization
    Shum, Stephen
    Dehak, Najim
    Glass, Jim
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 482 - 485
  • [6] A SPEAKER REDIARIZATION SCHEME FOR IMPROVING DIARIZATION IN LARGE TWO-SPEAKER TELEPHONE DATASETS
    Ghaemmaghami, Houman
    Dean, David
    Sridharan, Sridha
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1272 - 1276
  • [7] Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization
    Cheng, Luyao
    Zheng, Siqi
    Zhang Qinglin
    Wang, Hui
    Chen, Yafeng
    Chen, Qian
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 14068 - 14077
  • [8] Multimodal Speaker Diarization
    Noulas, Athanasios
    Englebienne, Gwenn
    Krose, Ben J. A.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (01) : 79 - 93
  • [9] SPEAKER DIARIZATION WITH LSTM
    Wang, Quan
    Downey, Carlton
    Wan, Li
    Mansfield, Philip Andrew
    Moreno, Ignacio Lopez
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5239 - 5243
  • [10] Trainable Speaker Diarization
    Aronowitz, Hagai
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2021 - 2024