Modulation Spectrogram Features for Improved Speaker Diarization

被引：0

作者：

Vinyals, Oriol ^{[1
]}

Friedland, Gerald ^{[1
]}

机构：

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

来源：

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008年

关键词：

modulation spectrogram; speaker diarization;

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

We propose the use of modulation spectrogram features in speaker diarization. These features carry longer term characteristics of the acoustic signals than the widely used MFCCs, thus providing potential improvement by using both features in combination. Using the state-of-the-art ICSI speaker diarization system, an improvement of 20.77% relative DER is obtained on the MIST Rich Transcription 2007 task with respect to the MFCC only system.

引用

页码：630 / +

页数：2

共 50 条

[21] AN ADAPTIVE INITIALIZATION METHOD FOR SPEAKER DIARIZATION BASED ON PROSODIC FEATURES
Imseng, David
Friedland, Gerald
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4946 - 4949
[22] Prosodic and other Long-Term Features for Speaker Diarization
Friedland, Gerald
Vinyals, Oriol
Huang, Yan
Mueller, Christian
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (05): : 985 - 993
[23] Overlapped speech detection for improved speaker diarization in multiparty meetings
Boakye, Kofi
Trueba-Hornero, Beatriz
Vinyals, Oriol
Friedland, Gerald
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4353 - 4356
[24] Overlap Detection for Speaker Diarization by Fusing Spectral and Spatial Features
Zelenak, Martin
Segura, Carlos
Hernando, Javier
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2302 - 2305
[25] An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization
Vijayasenan, Deepu
Valente, Fabio
Bourlard, Herve
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (02): : 431 - 438
[26] Statistical Speaker Diarization Using Dependent Combination of Extracted Features
Almgotir-Kadhimi, Hasan
Woo, Lok
Dlay, Satnam
2015 THIRD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, MODELLING AND SIMULATION (AIMS 2015), 2015, : 291 - 296
[27] SPEAKER DIARIZATION THROUGH SPEAKER EMBEDDINGS
Rouvier, Mickael
Bousquet, Pierre-Michel
Favre, Benoit
2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2082 - 2086
[28] Multimodal Speaker Diarization
Noulas, Athanasios
Englebienne, Gwenn
Krose, Ben J. A.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (01) : 79 - 93
[29] SPEAKER DIARIZATION WITH LSTM
Wang, Quan
Downey, Carlton
Wan, Li
Mansfield, Philip Andrew
Moreno, Ignacio Lopez
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5239 - 5243
[30] Trainable Speaker Diarization
Aronowitz, Hagai
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2021 - 2024

← 1 2 3 4 5 →