GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models

Authors
Tomashenko, Natalia [1, 2]
Khokhlov, Yuri [3 ]
Affiliations
[1] Speech Technol Ctr, St Petersburg, Russia
[2] ITMO Univ, St Petersburg, Russia
[3] STC Innovat Ltd, St Petersburg, Russia
Keywords
speaker adaptation; deep neural networks (DNN); MAP; fMLLR; CD-DNN-HMM; GMM-derived (GMMD) features; speaker adaptive training (SAT)
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper we investigate GMM-derived features, recently introduced for the adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features to train an auxiliary GMM model. Traditional adaptation algorithms, such as maximum a posteriori (MAP) adaptation and feature-space maximum likelihood linear regression (fMLLR), are applied to the auxiliary GMM model used in the SAT procedure for a DNN. Experimental results on the Wall Street Journal (WSJ0) corpus show that the proposed adaptation technique provides, on average, a 17-28% relative word error rate (WER) reduction on different adaptation sets under an unsupervised adaptation setup, compared to speaker-independent (SI) DNN-HMM systems built on conventional features. We found that fMLLR adaptation for the SAT DNN trained on GMM-derived features outperforms fMLLR adaptation for the SAT DNN trained on conventional features by up to 14% relative WER reduction.
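The relative WER reductions quoted in the abstract presumably follow the usual definition: the drop in WER divided by the baseline WER. A minimal sketch of that arithmetic follows; the function name and the sample WER values are illustrative assumptions, not figures taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in percent (e.g. 10.0 for 10%).
    Hypothetical helper for illustration only.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Made-up example: an SI baseline at 10.0% WER and an adapted system at 8.0% WER.
reduction = relative_wer_reduction(10.0, 8.0)
print(f"relative WER reduction: {reduction:.1f}%")
```

Under this definition, a 17-28% relative reduction means the adapted system makes between roughly one sixth and a little over one quarter fewer word errors than the baseline, regardless of the absolute WER level.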
Pages: 2882-2886
Page count: 5