Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion

Authors
Nishida, M. [1]
Kawahara, T. [1]
Affiliation
[1] JST, PRESTO, Sakyo Ku, Kyoto 6068501, Japan
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and vary widely in duration, which causes significant problems for conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive with a total duration of 10 hours, the proposed method is shown to achieve higher indexing performance than conventional methods.
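
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BIC form used here (log-likelihood minus half the parameter count times log of the frame count), the parameter-count formulas, the 12-dimensional features, the 16-entry codebook and mixture, and the per-frame log-likelihood values are all assumptions chosen only to show how the penalty term can favor the simpler VQ-style model on short segments and the GMM on long ones.

```python
import math


def bic(log_likelihood, n_params, n_frames, penalty_weight=1.0):
    """BIC in the 'higher is better' convention (assumed form, not from the paper)."""
    return log_likelihood - 0.5 * penalty_weight * n_params * math.log(n_frames)


def vq_param_count(codebook_size, dim):
    # Assumption: a VQ codebook is described by its centroids only.
    return codebook_size * dim


def gmm_param_count(n_mixtures, dim):
    # Assumption: diagonal-covariance GMM (means + variances + free weights).
    return n_mixtures * 2 * dim + (n_mixtures - 1)


def select_model(avg_loglik, param_counts, n_frames):
    """Return the candidate name with the highest BIC, plus all scores."""
    scores = {
        name: bic(avg_loglik[name] * n_frames, param_counts[name], n_frames)
        for name in avg_loglik
    }
    return max(scores, key=scores.get), scores


# Hypothetical setup: 12-dim features, 16 codewords / 16 mixtures.
DIM = 12
PARAMS = {"VQ": vq_param_count(16, DIM), "GMM": gmm_param_count(16, DIM)}

# Hypothetical average per-frame log-likelihoods; the GMM fits slightly better.
AVG_LL = {"VQ": -40.5, "GMM": -40.0}

# About 1 s versus 30 s of speech at an assumed 10 ms frame shift.
best_short, _ = select_model(AVG_LL, PARAMS, n_frames=100)
best_long, _ = select_model(AVG_LL, PARAMS, n_frames=3000)
print(best_short, best_long)
```

With these illustrative numbers, the small likelihood gain of the GMM does not offset its larger penalty on the 100-frame segment, so the VQ candidate is selected there, while the GMM is selected for the 3000-frame segment. In practice the log-likelihoods would come from models actually trained on each segment.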
Pages: 172-175
Page count: 4