Neural Network Speaker Descriptor in Speaker Diarization of Telephone Speech

Cited by: 0

Authors:

Zajic, Zbynek [1]

Zelinka, Jan [1,2]

Mueller, Ludek [1,2]

Affiliations:

[1] Univ West Bohemia, NTIS New Technol Informat Soc, Fac Appl Sci, Univ 8, Plzen 30614, Czech Republic

[2] Univ West Bohemia, Dept Cybernet, Fac Appl Sci, Univ 8, Plzen 30614, Czech Republic

Source:

SPEECH AND COMPUTER, SPECOM 2017 | 2017 / Vol. 10458

Keywords:

Neural network; Speaker diarization; i-Vector;

DOI:

10.1007/978-3-319-66429-3_55

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we investigate an approach to speaker representation for a diarization system that clusters together short telephone conversation segments produced by the same speaker. The proposed approach applies a neural-network-based descriptor in place of the usual i-vector descriptor used in state-of-the-art diarization systems. The two techniques were compared on the English part of the CallHome corpus. The final results indicate that the i-vector approach is superior on its own, although our proposed descriptor contributes additional information. Thus, the combined descriptor represents the speaker in a segment for diarization purposes with a lower diarization error (almost 20% relative improvement compared with using the i-vector alone).


Pages: 555-563

Page count: 9

