System output combination for improved speaker diarization

Cited by: 0
Authors
Bozonnet, Simon [1]
Evans, Nicholas [1]
Anguera, Xavier [2]
Vinyals, Oriol
Friedland, Gerald [3]
Fredouille, Corinne [4]
Affiliations
[1] EURECOM, Sophia Antipolis, France
[2] Telefon Res, Barcelona, Spain
[3] Univ Calif, ICSI, Berkeley, CA USA
[4] Univ Avignon, LIA, Avignon, France
Keywords
speaker diarization; system combination; fusion; features
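DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract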
System combination or fusion is a popular, successful and sometimes straightforward means of improving performance in many fields of statistical pattern classification, including speech and speaker recognition. Whilst there is significant work in the literature which aims to improve speaker diarization performance by combining multiple feature streams, there is little work which aims to combine the outputs of multiple systems. This paper reports our first attempts to combine the outputs of two state-of-the-art speaker diarization systems, namely ICSI's bottom-up and LIA-EURECOM's top-down systems. We show that a cluster matching procedure reliably identifies corresponding speaker clusters in the two system outputs and that, when they are used in a new realignment and resegmentation stage, the combination leads to relative improvements of 13% and 7% DER on independent development and evaluation sets.
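The abstract does not say how the cluster matching procedure works. As an illustration only, one common way to pair speaker clusters from two diarization outputs is to match the clusters that share the most speech time. The sketch below assumes each output is a list of (start, end, label) segments and uses a greedy one-to-one pairing. The function names, segment values, and labels are hypothetical, and this is not necessarily the procedure used in the paper.

```python
from collections import defaultdict

# Each diarization output is a list of (start_sec, end_sec, cluster_label) segments.
# The segments and labels below are hypothetical, not taken from the paper.

def overlap_table(segs_a, segs_b):
    """Total time (seconds) shared by each pair of clusters from the two outputs."""
    table = defaultdict(float)
    for start_a, end_a, label_a in segs_a:
        for start_b, end_b, label_b in segs_b:
            shared = min(end_a, end_b) - max(start_a, start_b)
            if shared > 0:
                table[(label_a, label_b)] += shared
    return table

def match_clusters(segs_a, segs_b):
    """Greedy one-to-one matching: pair clusters in order of decreasing overlap."""
    table = overlap_table(segs_a, segs_b)
    pairs = {}
    used_b = set()
    for (label_a, label_b), _ in sorted(table.items(), key=lambda kv: -kv[1]):
        if label_a not in pairs and label_b not in used_b:
            pairs[label_a] = label_b
            used_b.add(label_b)
    return pairs

system_a = [(0.0, 4.0, "A1"), (4.0, 9.0, "A2"), (9.0, 12.0, "A1")]
system_b = [(0.0, 4.5, "B2"), (4.5, 9.0, "B1"), (9.0, 12.0, "B2")]
matches = match_clusters(system_a, system_b)
print(matches)
```

In this toy input, cluster A1 shares 7 seconds with B2 and A2 shares 4.5 seconds with B1, so those are the pairs the greedy pass selects. An optimal assignment method could replace the greedy pass when many clusters compete for the same partner.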
Pages: 2650 / +
Page count: 2
Related papers
50 in total
  • [31] Overlapped speech detection for improved speaker diarization in multiparty meetings
    Boakye, Kofi
    Trueba-Hornero, Beatriz
    Vinyals, Oriol
    Friedland, Gerald
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4353 - 4356
  • [32] SPEAKER DIARIZATION THROUGH SPEAKER EMBEDDINGS
    Rouvier, Mickael
    Bousquet, Pierre-Michel
    Favre, Benoit
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2082 - 2086
  • [33] Multimodal Speaker Diarization
    Noulas, Athanasios
    Englebienne, Gwenn
    Krose, Ben J. A.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (01) : 79 - 93
  • [34] SPEAKER DIARIZATION WITH LSTM
    Wang, Quan
    Downey, Carlton
    Wan, Li
    Mansfield, Philip Andrew
    Moreno, Ignacio Lopez
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5239 - 5243
  • [35] The ICSI RT-09 Speaker Diarization System
    Friedland, Gerald
    Janin, Adam
    Imseng, David
    Anguera Miro, Xavier
    Gottlieb, Luke
    Huijbregts, Marijn
    Knox, Mary Tai
    Vinyals, Oriol
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02) : 371 - 381
  • [36] The SAIL Speaker Diarization System for Analysis of Spontaneous Meetings
    Han, Kyu J.
    Georgiou, Panayiotis G.
    Narayanan, Shrikanth S.
    2008 IEEE 10TH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, VOLS 1 AND 2, 2008, : 970 - 975
  • [37] Trainable Speaker Diarization
    Aronowitz, Hagai
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2021 - 2024
  • [38] Automatic weighting for the combination of TDOA and acoustic features in speaker diarization for meetings
    Anguera, Xavier
    Wooters, Chuck
    Pardo, Jose M.
    Hernando, Javier
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 241 - +
  • [39] ViVoLAB Speaker Diarization System for the DIHARD 2019 Challenge
    Vinals, Ignacio
    Gimeno, Pablo
    Ortega, Alfonso
    Miguel, Antonio
    Lleida, Eduardo
    INTERSPEECH 2019, 2019, : 988 - 992
  • [40] A DOA based speaker diarization system for real meetings
    Araki, Shoko
    Fujimoto, Masakiyo
    Ishizuka, Kentaro
    Sawada, Hiroshi
    Makino, Shoji
    2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS, 2008, : 30 - 33