The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization

Cited by: 0
Authors
Zelenak, Martin [1]
Hernando, Javier [1]
Affiliations
[1] Univ Politecn Cataluna, Barcelona, Spain
Keywords
overlapping speech detection; prosody; feature selection; speaker diarization;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Overlapping speech accounts for a share of the errors produced by standard speaker diarization systems in meeting environments. We investigate a set of prosody-based long-term features as a potential complement to our overlap detection system, which relies on short-term spectral parameters. The most relevant features are selected in a two-step process: they are first evaluated and ranked according to the mRMR (minimum redundancy, maximum relevance) criterion, and the optimal number of features is then determined by an iterative wrapper approach. We show that adding prosodic features decreases overlap detection error. Detected overlap segments are used in speaker diarization to recover missed speech by assigning multiple speaker labels and to increase the purity of speaker clusters.
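The two-step selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how mutual information is estimated, which classifier the wrapper uses, or which prosodic features are involved. The sketch assumes equal-width binning for the mutual information estimate, a nearest-centroid classifier as a stand-in for the overlap detector, and synthetic data in place of real prosodic features. All function and variable names are hypothetical.

```python
import math
import random
from collections import Counter

def discretize(values, bins=4):
    """Equal-width binning of one feature column (an assumed estimator choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def mrmr_rank(columns, labels):
    """Step 1: greedy mRMR ordering (relevance minus mean redundancy)."""
    disc = [discretize(col) for col in columns]
    relevance = [mutual_information(d, labels) for d in disc]
    remaining, order = list(range(len(columns))), []
    while remaining:
        def score(j):
            if not order:
                return relevance[j]
            redundancy = sum(mutual_information(disc[j], disc[k])
                             for k in order) / len(order)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

def centroid_error(columns, labels, feats, train_idx, test_idx):
    """Held-out error of a nearest-centroid classifier (stand-in detector)."""
    centroids = {}
    for c in set(labels):
        rows = [i for i in train_idx if labels[i] == c]
        centroids[c] = [sum(columns[f][i] for i in rows) / len(rows)
                        for f in feats]
    errors = 0
    for i in test_idx:
        x = [columns[f][i] for f in feats]
        pred = min(centroids, key=lambda c: sum(
            (u - v) ** 2 for u, v in zip(x, centroids[c])))
        errors += pred != labels[i]
    return errors / len(test_idx)

def wrapper_select(columns, labels, order, train_idx, test_idx):
    """Step 2: grow the set in mRMR order; keep the smallest best prefix."""
    best_k, best_err = 1, float("inf")
    for k in range(1, len(order) + 1):
        err = centroid_error(columns, labels, order[:k], train_idx, test_idx)
        if err < best_err:
            best_k, best_err = k, err
    return order[:best_k], best_err

# Synthetic data (assumption): label 1 = overlap, label 0 = single speaker.
rng = random.Random(0)
n = 200
labels = [i % 2 for i in range(n)]
informative = [3.0 * y + rng.gauss(0, 0.5) for y in labels]
redundant = [v + rng.gauss(0, 0.1) for v in informative]
noise_a = [rng.gauss(0, 1) for _ in range(n)]
noise_b = [rng.gauss(0, 1) for _ in range(n)]
columns = [informative, redundant, noise_a, noise_b]

order = mrmr_rank(columns, labels)
selected, err = wrapper_select(columns, labels, order,
                               list(range(0, n // 2)), list(range(n // 2, n)))
print("mRMR order:", order, "selected:", selected, "held-out error:", err)
```

The ranking step only orders the candidates; the wrapper step decides how many of them to keep by measuring the error of the downstream detector, which mirrors the division of labour the abstract describes.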
Pages: 1048-1051
Number of pages: 4