System output combination for improved speaker diarization

Cited by: 0
Authors
Bozonnet, Simon [1]
Evans, Nicholas [1]
Anguera, Xavier [2]
Vinyals, Oriol
Friedland, Gerald [3]
Fredouille, Corinne [4]
Affiliations
[1] EURECOM, Sophia Antipolis, France
[2] Telefon Res, Barcelona, Spain
[3] Univ Calif, ICSI, Berkeley, CA USA
[4] Univ Avignon, LIA, Avignon, France
Keywords
speaker diarization; system combination; fusion; FEATURES;
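DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract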
System combination or fusion is a popular, successful and sometimes straightforward means of improving performance in many fields of statistical pattern classification, including speech and speaker recognition. Whilst there is significant work in the literature which aims to improve speaker diarization performance by combining multiple feature streams, there is little work which aims to combine the outputs of multiple systems. This paper reports our first attempts to combine the outputs of two state-of-the-art speaker diarization systems, namely ICSI's bottom-up and LIA-EURECOM's top-down systems. We show that a cluster matching procedure reliably identifies corresponding speaker clusters in the two system outputs and that, when they are used in a new realignment and resegmentation stage, the combination leads to relative improvements of 13% and 7% DER on independent development and evaluation sets.
Pages: 2650 / +
Page count: 2