Multi channel far field speaker verification using teacher student deep neural networks

Cited by: 0
Authors
Jung, Jee-weon [1]
Heo, Hee-Soo [1]
Shim, Hye-jin [1]
Yu, Ha-Jin [1]
Affiliations
[1] Univ Seoul, Coll Engn, Sch Computer Sci, 163 Siripdae Ro, Seoul 02504, South Korea
Source
Keywords
Teacher student learning; Deep neural networks; Far-distance speaker verification; Multi channel speaker verification;
DOI
10.7776/ASK.2018.37.6.483
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Far-field input utterances are one of the major causes of performance degradation in speaker verification systems. In this study, we use a teacher-student learning framework to compensate for the degradation caused by far-field utterances. Teacher-student learning refers to training a student deep neural network under a condition that may degrade performance, guided by a teacher deep neural network trained without that condition. Here, a teacher network trained on near-distance utterances is used to train a student network on far-distance utterances. Experiments showed, however, that this degrades performance on near-distance utterances. To avoid this, we propose two techniques: initializing the student network with the trained teacher network, and training the student network on both near- and far-field utterances. Experiments were conducted using deep neural networks that take as input the raw waveforms of 4-channel utterances recorded at both near and far distances. The equal error rates for near- and far-field utterances, respectively, were 2.55 % / 2.8 % without teacher-student learning, 9.75 % / 1.8 % with conventional teacher-student learning, and 2.5 % / 2.7 % with the proposed techniques.
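
The training scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss function or the network architecture, so everything below is an assumption made for illustration: the "network" is a toy single linear layer rather than a raw-waveform 4-channel DNN, the teacher-student distance is taken to be the KL divergence between speaker posteriors (a common choice, not confirmed by the source), and all names and numbers are hypothetical.

```python
import copy
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def linear_logits(weights, features):
    """Toy stand-in for a network: weights[k] is the row for speaker k."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def teacher_student_loss(teacher_w, student_w, near_feat, student_feat):
    """Assumed loss: KL between the teacher posterior on the near-field
    input and the student posterior on the student's own input."""
    target = softmax(linear_logits(teacher_w, near_feat))
    output = softmax(linear_logits(student_w, student_feat))
    return kl_divergence(target, output)


# Hypothetical toy data: 2 speakers, 3-dimensional features.
teacher_w = [[1.0, 0.0, -1.0], [-1.0, 0.5, 1.0]]
near = [0.9, 0.1, -0.3]  # near-distance recording of an utterance
far = [0.5, 0.4, 0.1]    # the same utterance recorded far away

# Proposed technique 1: initialize the student from the trained teacher.
student_w = copy.deepcopy(teacher_w)

# Proposed technique 2: train the student on both near and far inputs,
# using the teacher's near-field posterior as the target in both cases.
loss_near = teacher_student_loss(teacher_w, student_w, near, near)
loss_far = teacher_student_loss(teacher_w, student_w, near, far)
total_loss = loss_near + loss_far
```

In this sketch, conventional teacher-student learning corresponds to minimizing only `loss_far`. Adding `loss_near` penalizes the student for drifting away from the teacher on near-distance input, which is one plausible reading of why the proposed combination avoids the near-field degradation reported for the conventional setup.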
Pages: 483-488
Number of pages: 6