Modulation Spectrogram Features for Improved Speaker Diarization

Cited by: 0
Authors
Vinyals, Oriol [1]
Friedland, Gerald [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
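Keywords
modulation spectrogram; speaker diarization
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract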
We propose the use of modulation spectrogram features in speaker diarization. These features capture longer-term characteristics of the acoustic signal than the widely used MFCCs, so combining the two feature types offers potential for improvement. Using the state-of-the-art ICSI speaker diarization system, we obtain a 20.77% relative improvement in diarization error rate (DER) over the MFCC-only system on the NIST Rich Transcription 2007 task.
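For reference, a relative DER improvement is conventionally measured against the baseline error rate. The expression below assumes that standard reading. The symbols DER_MFCC (MFCC-only baseline) and DER_comb (MFCC plus modulation spectrogram features) are labels introduced here for illustration, not notation taken from the paper.

```latex
\[
\Delta_{\mathrm{rel}}
  = \frac{\mathrm{DER}_{\mathrm{MFCC}} - \mathrm{DER}_{\mathrm{comb}}}
         {\mathrm{DER}_{\mathrm{MFCC}}} \times 100\%
\]
```

Under this reading, the reported 20.77% means the combined system removes roughly one fifth of the baseline's diarization error. It does not mean an absolute drop of 20.77 DER points.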
Pages: 630 / +
Page count: 2