Phone Adaptive Training for Speaker Diarization

Cited by: 0
Authors
Bozonnet, Simon [1]
Vipperla, Ravichander [1]
Evans, Nicholas [1]
Affiliation
[1] EURECOM, F-06904 Sophia Antipolis, France
Keywords
Speaker Diarization; Phone Adaptive Training; Speaker Discrimination;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The linguistic content of a speech signal is a source of unwanted variation which can degrade speaker diarization performance. This paper presents our latest work to reduce its impact. The new approach, referred to as Phone Adaptive Training (PAT), is analogous to speaker adaptive training used in automatic speech recognition. We report an oracle experiment which shows that PAT has the potential to deliver a 33% relative improvement in the diarization error rate over our baseline system. Practical experiments show significant improvements across two standard, independent evaluation datasets.
Pages: 494-497
Page count: 4