System output combination for improved speaker diarization

Cited by: 0

Authors:

Bozonnet, Simon [1]

Evans, Nicholas [1]

Anguera, Xavier [2]

Vinyals, Oriol

Friedland, Gerald [3]

Fredouille, Corinne [4]

Affiliations:

[1] EURECOM, Sophia Antipolis, France

[2] Telefon Res, Barcelona, Spain

[3] Univ Calif, ICSI, Berkeley, CA USA

[4] Univ Avignon, LIA, Avignon, France

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

speaker diarization; system combination; fusion; features

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification code:

0808; 0809

Abstract:

System combination or fusion is a popular, successful and sometimes straightforward means of improving performance in many fields of statistical pattern classification, including speech and speaker recognition. Whilst there is significant work in the literature which aims to improve speaker diarization performance by combining multiple feature streams, there is little work which aims to combine the outputs of multiple systems. This paper reports our first attempts to combine the outputs of two state-of-the-art speaker diarization systems, namely ICSI's bottom-up and LIA-EURECOM's top-down systems. We show that a cluster matching procedure reliably identifies corresponding speaker clusters in the two system outputs and that, when they are used in a new realignment and resegmentation stage, the combination leads to relative improvements of 13% and 7% DER on independent development and evaluation sets.
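The abstract does not say how corresponding clusters are identified. As a minimal sketch only, and not the authors' procedure, the following assumes both systems emit one speaker label per frame and pairs clusters greedily by how many frames they share. The function name `match_clusters`, the per-frame label format, and the use of `None` for non-speech are all hypothetical choices made for illustration.

```python
from collections import Counter


def match_clusters(labels_a, labels_b):
    """Greedily pair cluster labels from two diarization outputs by frame overlap.

    labels_a, labels_b: equal-length sequences of per-frame speaker labels
    (assumed input format; None marks non-speech frames).
    Returns a dict mapping cluster labels of system A to those of system B.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both outputs must cover the same frames")

    # Count the frames shared by each (A-cluster, B-cluster) pair.
    overlap = Counter(
        (a, b)
        for a, b in zip(labels_a, labels_b)
        if a is not None and b is not None
    )

    mapping, used_b = {}, set()
    # Accept pairs from largest to smallest overlap, keeping the match one-to-one.
    for (a, b), _count in overlap.most_common():
        if a not in mapping and b not in used_b:
            mapping[a] = b
            used_b.add(b)
    return mapping


# Toy example with made-up labels, not data from the paper.
system_a = ["s1", "s1", "s1", "s2", "s2", None, "s3", "s3"]
system_b = ["x", "x", "y", "y", "y", None, "z", "z"]
print(match_clusters(system_a, system_b))
```

Clusters left unmatched by such a step would need separate handling before any realignment and resegmentation stage; the abstract gives no detail on that, so it is not sketched here.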

Citation

Pages: 2650 / +

Page count: 2
