Speaker diarization: Towards a more robust and portable system

Cited by: 0
Authors
El Khoury, Elie [1]
Senac, Christine [1]
Andre-Obrecht, Regine [1]
Affiliations
[1] CNRS, UMR 5505, IRIT, SAMoVA Team, Toulouse, France
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403
Abstract
In this paper, we describe a new method for speaker segmentation and clustering of an audio document. For the segmentation phase, we combine the Generalized Likelihood Ratio (GLR) and the Bayesian Information Criterion (BIC) in a way that avoids most parameter tuning. For the clustering phase, we use an existing approach that applies the Eigen Vector Space Model (EVSM) with bottom-up hierarchical grouping, and we improve it by introducing prosodic information. Evaluation is done on the audio database of the ESTER evaluation campaign for the rich transcription of French broadcast news. Results show that our method, which operates without any a priori knowledge about the speakers, is suitable for speaker diarization: it outperforms traditional methods, with an overall diarization error rate (DER) of 16.72%.
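The abstract does not say how GLR and BIC are combined, so the following is only a minimal sketch of the two criteria in their textbook form, not the authors' method. It assumes each segment is modeled by a single full-covariance Gaussian; the function names `glr_distance` and `delta_bic` and the penalty weight `lam` are hypothetical names chosen for illustration.

```python
import numpy as np


def _half_n_logdet(seg):
    """Return (N/2) * log|Sigma| for the ML covariance of a segment.

    `seg` has shape (frames, features), e.g. a block of MFCC vectors.
    """
    cov = np.atleast_2d(np.cov(seg, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * len(seg) * logdet


def glr_distance(x, y):
    """Log GLR between 'one Gaussian for x+y' and 'one Gaussian each'."""
    z = np.vstack([x, y])
    return _half_n_logdet(z) - _half_n_logdet(x) - _half_n_logdet(y)


def delta_bic(x, y, lam=1.0):
    """GLR distance minus the BIC model-complexity penalty.

    A positive value favors the two-model hypothesis, i.e. a speaker
    change between x and y. `lam` is the usual tunable penalty weight
    (an assumption here; the abstract gives no value).
    """
    n, d = len(x) + len(y), x.shape[1]
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return glr_distance(x, y) - lam * penalty
```

In this form, a candidate boundary between two adjacent windows would be kept when `delta_bic` is positive. The GLR term alone is never negative, which is why a penalty or threshold is needed to reject spurious boundaries.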
Pages: 489 / +
Number of pages: 2
Related papers
50 in total
  • [1] Automatic cluster complexity and quantity selection: Towards robust speaker diarization
    Anguera, Xavier
    Wooters, Chuck
    Hernando, Javier
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 248 - +
  • [2] Robust Speaker Diarization for News Broadcast
    Karthik, M. L. N. S.
    Ganesh, Mirishkar Sai
    Patnaik, Bijayananda
    2018 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2018,
  • [3] Towards a complete Binary Key System for the Speaker Diarization Task
    Delgado, Hector
    Fredouille, Corinne
    Serrano, Javier
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 572 - 576
  • [4] A Robust Stopping Criterion for Agglomerative Hierarchical Clustering in a Speaker Diarization System
    Han, Kyu J.
    Narayanan, Shrikanth S.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1005 - 1008
  • [5] An Improved Speaker Diarization System
    Fu, Rong
    Benest, Ian D.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1253 - 1256
  • [6] Harmonic Structure Features for Robust Speaker Diarization
    Zhou, Yu
    Suo, Hongbin
    Li, Junfeng
    Yan, Yonghong
    ETRI JOURNAL, 2012, 34 (04) : 583 - 590
  • [7] Robust Speaker Diarization for Short Speech Recordings
    Imseng, David
    Friedland, Gerald
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 432 - +
  • [8] Towards lifelong human assisted speaker diarization
    Shamsi, Meysam
    Larcher, Anthony
    Barrault, Loic
    Meignier, Sylvain
    Prokopalo, Yevheni
    Tahon, Marie
    Mehrish, Ambuj
    Petitrenaud, Simon
    Galibert, Olivier
    Gaist, Samuel
    Anjos, Andre
    Marcel, Sebastien
    Costa-jussa, Marta R.
    COMPUTER SPEECH AND LANGUAGE, 2022, 77
  • [9] TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
    Pang, Bowen
    Zhao, Huan
    Zhang, Gaosheng
    Yang, Xiaoyue
    Sun, Yang
    Zhang, Li
    Wang, Qing
    Xie, Lei
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 502 - 506
  • [10] IMPROVED SPEAKER DIARIZATION SYSTEM FOR MEETINGS
    El-Khoury, Elie
    Senac, Christine
    Pinquier, Julien
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 4097 - 4100