Multi channel far field speaker verification using teacher student deep neural networks

Authors
Jung, Jee-weon [1]
Heo, Hee-Soo [1]
Shim, Hye-jin [1]
Yu, Ha-Jin [1]
Affiliations
[1] Univ Seoul, Coll Engn, Sch Computer Sci, 163 Siripdae Ro, Seoul 02504, South Korea
Keywords
Teacher student learning; Deep neural networks; Far-distance speaker verification; Multi channel speaker verification;
DOI
10.7776/ASK.2018.37.6.483
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Far-field input utterances are one of the major causes of performance degradation in speaker verification systems. In this study, we use a teacher-student learning framework to compensate for the degradation caused by far-field utterances. Teacher-student learning refers to training a student deep neural network under a condition that may degrade performance, guided by a teacher deep neural network trained without that condition. Here, a teacher network trained on near-distance utterances is used to train a student network on far-distance utterances. Experiments showed, however, that this deteriorates performance on near-distance utterances. To avoid this, we propose two techniques: initializing the student network with the trained teacher network, and training the student network on both near- and far-field utterances. Experiments were conducted using deep neural networks that take as input the raw waveforms of 4-channel utterances recorded at both near and far distances. The equal error rates (near-field / far-field) were 2.55 % / 2.8 % without teacher-student learning, 9.75 % / 1.8 % with conventional teacher-student learning, and 2.5 % / 2.7 % with the proposed techniques.
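The training scheme described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the "networks" are single linear layers with a softmax output, the loss is the KL divergence between teacher and student outputs (one common choice for teacher-student learning; the abstract does not state which loss was used), and the weights and feature vectors are made-up numbers. The sketch only shows the structure of the idea: the teacher sees the near-distance input, the student sees the far-distance input of the same utterance, the student starts from the teacher's weights, and the student is optionally also trained on the near-distance input.

```python
import copy
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, features):
    """Toy one-layer network: softmax(W @ features)."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(logits)


def ts_loss(teacher_w, student_w, near_feat, student_feat, eps=1e-12):
    """KL(teacher output on near input || student output on its own input)."""
    target = forward(teacher_w, near_feat)
    pred = forward(student_w, student_feat)
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, pred))


def student_step(teacher_w, student_w, near_feat, student_feat, lr=0.5):
    """One gradient step on the student only; the teacher stays frozen."""
    target = forward(teacher_w, near_feat)
    pred = forward(student_w, student_feat)
    for k, row in enumerate(student_w):
        grad_logit = pred[k] - target[k]  # d KL / d logit_k for a softmax output
        for j, x in enumerate(student_feat):
            row[j] -= lr * grad_logit * x


def train_student(teacher_w, pairs, steps=500, init_from_teacher=True, use_near=True):
    """pairs: list of (near_features, far_features) for the same utterance."""
    if init_from_teacher:
        student_w = copy.deepcopy(teacher_w)  # student initialized from the teacher
    else:
        student_w = [[0.0] * len(row) for row in teacher_w]
    for _ in range(steps):
        for near_feat, far_feat in pairs:
            # Conventional step: student sees the far-distance input.
            student_step(teacher_w, student_w, near_feat, far_feat)
            if use_near:
                # Additional step: student also sees the near-distance input.
                student_step(teacher_w, student_w, near_feat, near_feat)
    return student_w


# Hypothetical toy data: 2 speakers, 2-dimensional features.
TEACHER_W = [[2.0, -1.0], [-1.0, 1.5]]
PAIRS = [([1.0, 0.2], [0.4, 0.3])]  # (near, far) features of one utterance
```

Calling `train_student(TEACHER_W, PAIRS)` returns student weights whose output on the far-distance input is closer to the teacher's output on the near-distance input than the untrained copy was. Setting `use_near=False` or `init_from_teacher=False` gives the corresponding variants without one of the two techniques, which makes it easy to compare `ts_loss` on near and far inputs across configurations.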
Pages: 483-488
Number of pages: 6