ViVoLAB Speaker Diarization System for the DIHARD 2019 Challenge

Cited by: 3
Authors
Vinals, Ignacio [1]
Gimeno, Pablo [1]
Ortega, Alfonso [1]
Miguel, Antonio [1]
Lleida, Eduardo [1]
Affiliations
[1] Univ Zaragoza, Aragon Inst Engn Res I3A, ViVoLab, Zaragoza, Spain
Source
Keywords
diarization; DIHARD Challenge; PLDA; Variational Bayes; Tree search; M-algorithm;
DOI
10.21437/Interspeech.2019-2462
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
This paper presents the latest improvements in speaker diarization obtained by the ViVoLAB research group for the 2019 DIHARD Diarization Challenge. This evaluation seeks to improve diarization in adverse conditions. For this purpose, the audio recordings cover multiple scenarios with no restrictions in terms of speakers, overlapped speech, or audio quality. Our submission follows the traditional segmentation-clustering-resegmentation pipeline: speaker embeddings are extracted from acoustic segments containing a single speaker and are then clustered by means of a PLDA model. Our contribution in this work focuses on the clustering step. We present results with our Variational Bayes PLDA clustering and our tree-based clustering strategy, which sequentially assigns each embedding to its corresponding speaker according to a PLDA model. Both strategies compare multiple diarization hypotheses and select the best candidate according to a generative criterion. We also analyze the impact of the different state-of-the-art embeddings available with both clustering approaches.
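The tree-based strategy mentioned in the abstract (sequential assignment of embeddings, several hypotheses kept alive, selection by a generative score) can be illustrated with a minimal sketch. This is not the authors' implementation: the trained PLDA model is replaced here by an assumed isotropic two-covariance Gaussian, and the names predictive_logpdf and m_algorithm_clustering, the variances, and the beam width are all hypothetical choices made for illustration.

```python
import math


def predictive_logpdf(x, members, within_var=1.0, between_var=9.0):
    """Log density of x given a speaker's current members.

    Toy stand-in for PLDA (assumption): speaker centre ~ N(0, between_var * I),
    embedding ~ N(centre, within_var * I).
    """
    dim = len(x)
    n = len(members)
    precision = 1.0 / between_var + n / within_var
    # Posterior mean of the speaker centre given the members seen so far.
    mean = [sum(m[d] for m in members) / within_var / precision for d in range(dim)]
    var = within_var + 1.0 / precision
    sq_dist = sum((x[d] - mean[d]) ** 2 for d in range(dim))
    return -0.5 * (dim * math.log(2 * math.pi * var) + sq_dist / var)


def m_algorithm_clustering(embeddings, beam_width=3):
    """Assign embeddings to speakers one by one, keeping the best
    `beam_width` partial hypotheses at every step (M-algorithm)."""
    hypotheses = [(0.0, [])]  # (cumulative log-likelihood, labels so far)
    for x in embeddings:
        expanded = []
        for score, labels in hypotheses:
            n_speakers = max(labels) + 1 if labels else 0
            # Branch on every existing speaker plus one new speaker.
            for spk in range(n_speakers + 1):
                members = [embeddings[j] for j, lab in enumerate(labels) if lab == spk]
                expanded.append((score + predictive_logpdf(x, members), labels + [spk]))
        # Prune the tree: only the top-scoring hypotheses survive.
        expanded.sort(key=lambda h: h[0], reverse=True)
        hypotheses = expanded[:beam_width]
    return hypotheses[0][1]


if __name__ == "__main__":
    toy = [[5.0, 5.0], [5.2, 4.9], [-5.0, -5.0], [4.8, 5.1], [-5.1, -4.8]]
    print(m_algorithm_clustering(toy))  # [0, 0, 1, 0, 1]
```

Because each step adds the predictive log-density of the new embedding, the cumulative score of a hypothesis is the joint log-likelihood of the whole partition under the toy model, which is what makes the pruning criterion generative. Setting beam_width=1 reduces the search to a greedy sequential assignment.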
Pages: 988-992
Number of pages: 5