Neural Network Speaker Descriptor in Speaker Diarization of Telephone Speech

Authors
Zajic, Zbynek [1]
Zelinka, Jan [1, 2]
Mueller, Ludek [1, 2]
Affiliations
[1] Univ West Bohemia, NTIS New Technol Informat Soc, Fac Appl Sci, Univ 8, Plzen 30614, Czech Republic
[2] Univ West Bohemia, Dept Cybernet, Fac Appl Sci, Univ 8, Plzen 30614, Czech Republic
Source
Keywords
Neural network; Speaker diarization; i-Vector
DOI
10.1007/978-3-319-66429-3_55
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we investigate a speaker representation for a diarization system that clusters short segments of telephone conversation, each produced by a single speaker. The proposed approach uses a neural-network-based descriptor in place of the i-vector descriptor usually found in state-of-the-art diarization systems. The two techniques were compared on the English part of the CallHome corpus. The final results indicate that the i-vector approach is superior, although the proposed descriptor contributes additional information. The combined descriptor therefore represents a speaker in a segment with a lower diarization error (almost 20% relative improvement over the i-vector alone).
Pages: 555-563
Number of pages: 9