System output combination for improved speaker diarization

Cited by: 0

Authors:

Bozonnet, Simon [1]

Evans, Nicholas [1]

Anguera, Xavier [2]

Vinyals, Oriol

Friedland, Gerald [3]

Fredouille, Corinne [4]

Affiliations:

[1] EURECOM, Sophia Antipolis, France

[2] Telefon Res, Barcelona, Spain

[3] Univ Calif, ICSI, Berkeley, CA USA

[4] Univ Avignon, LIA, Avignon, France

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

speaker diarization; system combination; fusion; FEATURES;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

System combination or fusion is a popular, successful and sometimes straightforward means of improving performance in many fields of statistical pattern classification, including speech and speaker recognition. Whilst there is significant work in the literature which aims to improve speaker diarization performance by combining multiple feature streams, there is little work which aims to combine the outputs of multiple systems. This paper reports our first attempts to combine the outputs of two state-of-the-art speaker diarization systems, namely ICSI's bottom-up and LIA-EURECOM's top-down systems. We show that a cluster matching procedure reliably identifies corresponding speaker clusters in the two system outputs and that, when they are used in a new realignment and resegmentation stage, the combination leads to relative improvements of 13% and 7% DER on independent development and evaluation sets.
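
The abstract does not say how the cluster matching is done. As a rough illustration only, the sketch below pairs speaker clusters from two system outputs by their total temporal overlap, taking the largest overlaps first. The greedy overlap criterion, the (start, end, label) segment format, the function names and the toy segments are all assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


def overlap_table(segs_a, segs_b):
    """Sum the shared time (seconds) for every (label_a, label_b) pair."""
    table = defaultdict(float)
    for start_a, end_a, label_a in segs_a:
        for start_b, end_b, label_b in segs_b:
            shared = min(end_a, end_b) - max(start_a, start_b)
            if shared > 0:
                table[(label_a, label_b)] += shared
    return table


def match_clusters(segs_a, segs_b):
    """Greedy one-to-one matching: largest overlap first (assumed criterion)."""
    table = overlap_table(segs_a, segs_b)
    pairs = {}
    used_b = set()
    for (label_a, label_b), _ in sorted(table.items(), key=lambda kv: -kv[1]):
        if label_a not in pairs and label_b not in used_b:
            pairs[label_a] = label_b
            used_b.add(label_b)
    return pairs


# Hypothetical outputs of two diarization systems: (start, end, cluster label).
system_a = [(0.0, 4.0, "A1"), (4.0, 9.0, "A2"), (9.0, 12.0, "A1")]
system_b = [(0.0, 4.5, "B2"), (4.5, 9.0, "B1"), (9.0, 12.0, "B2"), (12.0, 13.0, "B3")]

matches = match_clusters(system_a, system_b)
print(matches)
```

Clusters with no counterpart in the other output, such as "B3" in this toy input, are simply left unmatched. An optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop if one-to-one matching quality matters.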

Pages: 2650 / +

Number of pages: 2
