SPEAKER EMBEDDINGS INCORPORATING ACOUSTIC CONDITIONS FOR DIARIZATION

Authors
Higuchi, Yosuke [1, 2]
Suzuki, Masayuki [1 ]
Kurata, Gakuto [1 ]
Affiliations
[1] IBM Res AI, Tokyo, Japan
[2] Waseda Univ, Dept Commun & Comp Engn, Tokyo, Japan
Keywords
speaker embedding; speaker diarization; representation learning; neural network
DOI
10.1109/icassp40776.2020.9054273
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present our work on training speaker embeddings that are especially effective for speaker diarization. For various speaker recognition tasks, extracting speaker embeddings with Deep Neural Networks (DNNs) has become a major approach. These embeddings are generally trained to discriminate between speakers and to be robust to differing acoustic conditions. In speaker diarization, however, the acoustic conditions can serve as consistent information for discriminating speakers. Such information can include each speaker's distance to a microphone in a meeting, or each speaker's channel in a telephone conversation recorded in mono. Hence, the proposed speaker-embedding network leverages differences in acoustic conditions to train effective speaker embeddings for speaker diarization. The information on acoustic conditions can be anything that helps distinguish between recording environments; for example, we explore using i-vectors. Experiments conducted on a practical diarization system demonstrated that the proposed embeddings significantly improve performance over embeddings trained without information on acoustic conditions.
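The intuition in the abstract, that acoustic conditions can help separate speakers within a single recording, can be illustrated with a toy sketch. This is not the authors' network or training procedure: the vectors, the `with_condition` helper, and the simple concatenation are all assumptions made for illustration only. It shows that two segments with similar voice vectors become less similar once a differing acoustic-condition vector is appended.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def with_condition(speaker_vec, condition_vec, weight=1.0):
    """Hypothetical helper: append a scaled acoustic-condition vector."""
    return list(speaker_vec) + [weight * c for c in condition_vec]

# Hypothetical toy data: two segments from different speakers
# whose voice characteristics happen to be similar.
seg_a_voice = [1.0, 0.2]
seg_b_voice = [0.9, 0.3]

# Hypothetical acoustic-condition vectors, standing in for something
# like a per-segment channel or microphone-distance descriptor.
seg_a_cond = [1.0, 0.0]
seg_b_cond = [0.0, 1.0]

sim_voice_only = cosine(seg_a_voice, seg_b_voice)
sim_with_cond = cosine(
    with_condition(seg_a_voice, seg_a_cond),
    with_condition(seg_b_voice, seg_b_cond),
)

print(f"voice only:      {sim_voice_only:.3f}")
print(f"with conditions: {sim_with_cond:.3f}")
```

In this toy setup the similarity drops once the condition vectors are appended, so a clustering step would find the two segments easier to keep apart. The `weight` parameter is an assumed knob for how strongly the acoustic condition influences the comparison.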
Pages: 7129-7133
Page count: 5