Modulation Spectrogram Features for Improved Speaker Diarization

Authors
Vinyals, Oriol [1]
Friedland, Gerald [1]
Affiliation
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Keywords
modulation spectrogram; speaker diarization
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose the use of modulation spectrogram features in speaker diarization. These features carry longer-term characteristics of the acoustic signal than the widely used MFCCs, so using both feature types in combination offers potential improvement. Using the state-of-the-art ICSI speaker diarization system, a 20.77% relative improvement in diarization error rate (DER) is obtained on the NIST Rich Transcription 2007 task with respect to the MFCC-only system.
Pages: 630 / +
Page count: 2