Speaker diarization: Towards a more robust and portable system

Authors
El Khoury, Elie [1]
Senac, Christine [1]
Andre-Obrecht, Regine [1]
Affiliation
[1] CNRS, UMR 5505, IRIT, SAMoVA Team, Toulouse, France
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we describe a new method for speaker segmentation and clustering of an audio document. For the segmentation phase, we combine the Generalized Likelihood Ratio (GLR) and the Bayesian Information Criterion (BIC) in a way that avoids most of the parameter tuning. For the clustering phase, we use an existing approach based on the Eigen Vector Space Model (EVSM) with bottom-up hierarchical grouping, which we improve by introducing prosodic information. The evaluation is carried out on the audio database of the ESTER evaluation campaign for the rich transcription of French broadcast news. The results show that our method, which operates without any a priori knowledge about the speakers, is suitable for speaker diarization: it outperforms the traditional methods, with an overall Diarization Error Rate (DER) of 16.72%.
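The abstract does not say how the GLR and BIC are combined, so the following is only a minimal sketch of the two textbook quantities for a candidate speaker change point, not the authors' procedure. It assumes each segment is modeled by a single full-covariance Gaussian over feature frames (for example MFCC vectors); the function names, the `penalty_weight` parameter, and the synthetic data are illustrative assumptions.

```python
import numpy as np


def log_det_cov(frames: np.ndarray) -> float:
    """Log-determinant of the maximum-likelihood covariance of the frames."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return float(logdet)


def glr_distance(x1: np.ndarray, x2: np.ndarray) -> float:
    """Log generalized likelihood ratio: two Gaussians versus one pooled Gaussian."""
    n1, n2 = len(x1), len(x2)
    pooled = np.vstack([x1, x2])
    return 0.5 * (
        (n1 + n2) * log_det_cov(pooled)
        - n1 * log_det_cov(x1)
        - n2 * log_det_cov(x2)
    )


def delta_bic(x1: np.ndarray, x2: np.ndarray, penalty_weight: float = 1.0) -> float:
    """GLR minus the BIC complexity penalty; a positive value suggests a change."""
    n = len(x1) + len(x2)
    d = x1.shape[1]
    # Extra free parameters of the two-model hypothesis: one mean and one covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * n_params * np.log(n)
    return glr_distance(x1, x2) - penalty_weight * penalty


# Synthetic stand-ins for feature frames (assumption: 4-dimensional features).
rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(500, 4))
same_b = rng.normal(0.0, 1.0, size=(500, 4))
other = rng.normal(3.0, 1.0, size=(500, 4))

bic_same = delta_bic(same_a, same_b)  # both windows drawn from one distribution
bic_diff = delta_bic(same_a, other)   # windows drawn from different distributions
```

In this formulation `penalty_weight` is the threshold-like parameter that usually has to be tuned per corpus, which is the kind of tuning the abstract says the combined method largely avoids.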
Pages: 489 / +
Page count: 2