Speaker diarization: Towards a more robust and portable system

Authors
El Khoury, Elie [1]
Senac, Christine [1]
Andre-Obrecht, Regine [1]
Affiliations
[1] CNRS, UMR 5505, IRIT, SAMoVA Team, Toulouse, France
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we describe a new method for speaker segmentation and clustering of an audio document. For the segmentation phase, we combine the Generalized Likelihood Ratio (GLR) and the Bayesian Information Criterion (BIC) in a way that avoids most parameter tuning. For the clustering phase, we use an existing approach based on the Eigen Vector Space Model (EVSM) with bottom-up hierarchical grouping, which we improve by introducing prosodic information. Evaluation is done on the audio database of the ESTER evaluation campaign for the rich transcription of French broadcast news. Results show that our method, which operates without any a priori knowledge about the speakers, is suitable for speaker diarization: it outperforms traditional methods, with an overall diarization error rate (DER) of 16.72%.
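The abstract does not say how GLR and BIC are combined, so the following is only a generic sketch of the textbook single-Gaussian change test that both criteria share, not the authors' algorithm. It assumes feature frames (for example MFCC vectors) are already available as a NumPy array; the names `glr`, `delta_bic`, `best_change_point`, `lam` and `margin` are illustrative and do not come from the paper.

```python
import numpy as np


def _logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of frames x (n x d)."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    cov = cov + 1e-6 * np.eye(x.shape[1])  # small ridge for numerical stability
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def glr(x, t):
    """Log generalized likelihood ratio: two Gaussians split at frame t vs. one Gaussian."""
    n = len(x)
    return 0.5 * (
        n * _logdet_cov(x)
        - t * _logdet_cov(x[:t])
        - (n - t) * _logdet_cov(x[t:])
    )


def delta_bic(x, t, lam=1.0):
    """GLR minus the BIC complexity penalty; a positive value favors a change at t."""
    n, d = x.shape
    # Extra free parameters of the two-model hypothesis: one mean and one covariance.
    penalty = 0.5 * (d + d * (d + 1) / 2) * np.log(n)
    return glr(x, t) - lam * penalty


def best_change_point(x, margin=20, lam=1.0):
    """Return (frame index, score) of the best split, or (None, score) if none is accepted."""
    candidates = range(margin, len(x) - margin)
    scores = [delta_bic(x, t, lam) for t in candidates]
    i = int(np.argmax(scores))
    if scores[i] > 0:
        return margin + i, scores[i]
    return None, scores[i]
```

In this formulation the only tunable quantity is the penalty weight `lam`; the GLR term alone gives a distance curve whose peaks are candidate change points, and the BIC penalty decides whether a peak is accepted.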
Pages: 489 / +
Page count: 2