Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion

Cited by: 0

Authors:

Nishida, M [1]

Kawahara, T [1]

Affiliations:

[1] JST, PRESTO, Sakyo Ku, Kyoto 6068501, Japan

Source:

2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and their durations vary widely. This causes significant problems for conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion). We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive with a total duration of 10 hours, the proposed method is demonstrated to achieve higher indexing performance than conventional methods.
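The selection idea in the abstract can be illustrated with a minimal sketch. It assumes the textbook form of the criterion, BIC = -2 log L + k log N, which may differ from the exact formulation in the paper. The function names, the parameter counts, and the log-likelihood values are all hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math


def bic(log_likelihood, n_params, n_frames):
    """Textbook BIC (lower is better): -2 * log L + k * log N.

    Assumption: the paper may use a different sign convention or
    a weighted penalty term; this is only the standard definition.
    """
    return -2.0 * log_likelihood + n_params * math.log(n_frames)


def select_model(candidates, n_frames):
    """Return the name of the candidate with the lowest BIC, plus all scores.

    candidates maps a model name to (log_likelihood, n_params), both
    measured on the same speech segment of n_frames feature frames.
    """
    scores = {
        name: bic(log_lik, k, n_frames)
        for name, (log_lik, k) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical numbers: on a short segment the extra parameters of the
# larger model are penalized more than its likelihood gain is worth.
short_choice, _ = select_model(
    {"vq": (-1200.0, 64), "gmm": (-1150.0, 200)}, n_frames=50
)

# On a long segment the likelihood gain outweighs the penalty.
long_choice, _ = select_model(
    {"vq": (-120000.0, 64), "gmm": (-115000.0, 200)}, n_frames=5000
)

print(short_choice, long_choice)
```

With these illustrative inputs the simpler model wins on the short segment and the richer model wins on the long one, which mirrors the behavior the abstract expects from duration-dependent selection.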

Citation

Pages: 172-175

Page count: 4

50 items in total

[41] Unsupervised intra-speaker variability compensation based on Gestalt and model adaptation in speaker verification with telephone speech
Yoma, Nestor Becerra
Garreton, Claudio
Molina, Carlos
Huenupan, Fernando
SPEECH COMMUNICATION, 2008, 50 (11-12) : 953 - 964
[42] SPEAKER CHANGE DETECTION AND SPEAKER DIARIZATION USING SPATIAL INFORMATION
Hu, Mathieu
Sharma, Dushyant
Doclo, Simon
Brookes, Mike
Naylor, Patrick A.
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5743 - 5747
[43] Speaker Diarization and Detection System using A Priori Speaker Information
Kenai, Ouassila
Asbai, Nassim
Ouamour, Siham
Guerti, Mhania
Djeghiour, Salim
2018 2ND INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE AND SPEECH PROCESSING (ICNLSP), 2018, : 73 - 78
[44] AN UNSUPERVISED AUDIO SEGMENTATION METHOD USING BAYESIAN INFORMATION CRITERION
Ozan, Ezgi Can
Tankiz, Seda
Acar, Banu Oskay
Ciloglu, Tolga
2014 6TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS, CONTROL AND SIGNAL PROCESSING (ISCCSP), 2014, : 640 - 643
[45] Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model
Zhang, Wen-Lin
Zhang, Wei-Qiang
Li, Bi-Cheng
Qu, Dan
Johnson, Michael T.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (07) : 2002 - 2015
[46] Confidence measure based unsupervised target model adaptation for speaker verification
Preti, A.
Bonastre, J. -F
Matrouf, D.
Capman, F.
Ravera, B.
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1729 - +
[47] A Bayesian Information Criterion for Unsupervised Learning Based on an Objective Prior
Baimuratov, Ildar
Shichkina, Yulia
Stankova, Elena
Zhukova, Nataly
Nguyen Than
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2019, PT I: 19TH INTERNATIONAL CONFERENCE, SAINT PETERSBURG, RUSSIA, JULY 1-4, 2019, PROCEEDINGS, PT I, 2019, 11619 : 707 - 716
[48] Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification
Shum, Stephen
Dehak, Najim
Dehak, Reda
Glass, James R.
ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP, 2010, : 76 - 82
[49] Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition
Kosaka, Tetsuo
Takeda, Yuui
Ito, Takashi
Kato, Masaharu
Kohda, Masaki
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09) : 2363 - 2369
[50] Information based speaker verification
Pham, T
Wagner, M
15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, PROCEEDINGS: IMAGE, SPEECH AND SIGNAL PROCESSING, 2000, : 278 - 281
