IMPROVING SPEAKER RECOGNITION PERFORMANCE IN THE DOMAIN ADAPTATION CHALLENGE USING DEEP NEURAL NETWORKS

Cited by: 0
Authors
Garcia-Romero, Daniel [1]
Zhang, Xiaohui
McCree, Alan
Povey, Daniel
Affiliations
[1] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Keywords
Unsupervised adaptation; speaker recognition; i-vectors; deep neural networks;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional i-vector speaker recognition systems use a Gaussian mixture model (GMM) to collect sufficient statistics (SS). Recently, replacing this GMM with a deep neural network (DNN) has shown promising results. In this paper, we explore the use of DNNs to collect SS for the unsupervised domain adaptation task of the Domain Adaptation Challenge (DAC). We show that collecting SS with a DNN trained on out-of-domain data boosts the speaker recognition performance of an out-of-domain system by more than 25%. Moreover, we integrate the DNN into an unsupervised adaptation framework that uses agglomerative hierarchical clustering with a stopping criterion based on unsupervised calibration, and show that the initial gains of the out-of-domain system carry over to the final adapted system. Although the DNN is trained on out-of-domain data, the final adapted system produces a relative improvement of more than 30% with respect to the best published results on this task.
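The abstract does not spell out how sufficient statistics are collected. As a minimal sketch, assuming the standard zeroth- and first-order Baum-Welch statistics used in i-vector systems, the per-class accumulation might look like the following. The only difference between the GMM and DNN variants in this sketch is where the frame posteriors come from (GMM component responsibilities or DNN output posteriors). The function name `collect_sufficient_stats` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def collect_sufficient_stats(
    features: Sequence[Sequence[float]],
    posteriors: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Accumulate zeroth- and first-order statistics over all frames.

    features:   T frames, each a D-dimensional acoustic feature vector.
    posteriors: T frames, each a length-C vector of class posteriors
                (assumed to come from a GMM or from DNN outputs).
    Returns (N, F) where N[c] = sum_t gamma_t(c) and
    F[c] = sum_t gamma_t(c) * x_t.
    """
    if len(features) != len(posteriors):
        raise ValueError("features and posteriors must cover the same frames")
    if not features:
        raise ValueError("at least one frame is required")

    num_classes = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_classes
    first = [[0.0] * dim for _ in range(num_classes)]

    for x, gamma in zip(features, posteriors):
        for c, g in enumerate(gamma):
            zeroth[c] += g  # soft count of frames assigned to class c
            for d, value in enumerate(x):
                first[c][d] += g * value  # posterior-weighted feature sum
    return zeroth, first


# Toy example (hypothetical data): two 2-dimensional frames, two classes.
feats = [[1.0, 2.0], [3.0, 4.0]]
post = [[1.0, 0.0], [0.5, 0.5]]
N, F = collect_sufficient_stats(feats, post)
```

In this sketch the downstream i-vector extractor would consume `N` and `F` unchanged, which is why swapping the posterior source can be done without altering the rest of the pipeline.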
Pages: 378 - 383
Page count: 6
Related papers
50 in total
  • [21] Speaker2Vec: Unsupervised Learning and Adaptation of a Speaker Manifold using Deep Neural Networks with an Evaluation on Speaker Segmentation
    Jati, Arindam
    Georgiou, Panayiotis
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3567 - 3571
  • [22] Fast speaker adaptation of artificial neural networks for automatic speech recognition
    Dupont, S
    Cheboub, L
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1795 - 1798
  • [23] OliVaR: Improving olive variety recognition using deep neural networks
    Miho, Hristofor
    Pagnotta, Giulio
    De Gaspari, Fabio
    Hitaj, Dorjan
    Mancini, Luigi Vincenzo
    Koubouris, Georgios
    Godino, Gianluca
    Hakan, Mehmet
    Diez, Concepcion Munoz
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 216
  • [24] Improving the Generalization Ability of Deep Neural Networks for Cross-Domain Visual Recognition
    Zheng, Jianwei
    Lu, Chao
    Hao, Cong
    Chen, Deming
    Guo, Donghui
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (03) : 607 - 620
  • [25] Approaches for Out-of-Domain Adaptation to Improve Speaker Recognition Performance
    Shulipa, Andrey
    Novoselov, Sergey
    Melnikov, Aleksandr
    SPEECH AND COMPUTER, 2016, 9811 : 124 - 130
  • [26] Speaker Recognition Using Neural Networks and Conventional Classifiers
    Farrell, Kevin R.
    Mammone, Richard J.
    Assaleh, Khaled T.
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (01) : 194 - 205
  • [27] AN APPLICATION OF SPEAKER RECOGNITION USING ARTIFICIAL NEURAL NETWORKS
    Caner, Murat
    Ustun, Seydi Vakkas
    PAMUKKALE UNIVERSITY JOURNAL OF ENGINEERING SCIENCES-PAMUKKALE UNIVERSITESI MUHENDISLIK BILIMLERI DERGISI, 2006, 12 (02) : 279 - 284
  • [28] Speaker recognition using convolutional siamese neural networks
    Jung H.
    Yoon S.
    Park N.
    Transactions of the Korean Institute of Electrical Engineers, 2020, 69 (01) : 164 - 169
  • [29] Speaker recognition using pulse coupled neural networks
    Timoszczuk, Antonio Pedro
    Cabral, Euvaldo F., Jr.
    2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 1965 - +
  • [30] Exploring the performance of automatic speaker recognition using twin speech and deep learning-based artificial neural networks
    Cavalcanti, Julio Cesar
    da Silva, Ronaldo Rodrigues
    Eriksson, Anders
    Barbosa, Plinio A.
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2024, 7