Extending the Task of Diarization to Speaker Attribution

Cited by: 0

Authors:

Ghaemmaghami, Houman [1]

Dean, David [1]

Vogt, Robbie [1]

Sridharan, Sridha [1]

Affiliations:

[1] Queensland University of Technology, Speech and Audio Research Laboratory, Brisbane, Qld 4001, Australia

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

speaker attribution; diarization; clustering; cross likelihood ratio; joint factor analysis;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we extend the concept of speaker annotation within a single recording, or speaker diarization, to a collection-wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering the expectedly homogeneous inter-session clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. An attribution system is proposed that uses mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix, and the complete linkage algorithm is employed to cluster the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
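
The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes the pairwise scores in the attribution matrix are already computed and that a higher score means the two clusters are more likely to share a speaker. The function name `complete_linkage`, the score values, and the stopping threshold are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from itertools import combinations


def complete_linkage(scores, threshold):
    """Agglomerative clustering with complete linkage on a symmetric score matrix.

    scores[i][j] is a pairwise similarity (higher = more likely same speaker).
    Merging stops once the best candidate pair scores below `threshold`.
    """
    clusters = [[i] for i in range(len(scores))]

    def linkage(a, b):
        # Complete linkage: two clusters are only as similar as their
        # least similar pair of members.
        return min(scores[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the highest complete-linkage score.
        (a, b), best = max(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # a < b always holds, so deleting index b leaves index a valid.
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Hypothetical scores for four per-recording clusters (not from the paper).
scores = [
    [0.0, 2.1, -1.5, -0.8],
    [2.1, 0.0, -1.2, -0.9],
    [-1.5, -1.2, 0.0, 1.7],
    [-0.8, -0.9, 1.7, 0.0],
]
speakers = complete_linkage(scores, threshold=0.0)
print(speakers)
```

Complete linkage is conservative: a merge is accepted only if every pair of members across the two clusters scores above the threshold, which limits the chaining of loosely related recordings into one identity.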

Pages: 1056-1059

Page count: 4
