Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition

Authors
Chen, Yu-hsin [1]
Lopez-Moreno, Ignacio [1]
Sainath, Tara N. [1]
Visontai, Mirko [1]
Alvarez, Raziel [1]
Parada, Carolina [1]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies model the local time-frequency correlations of the speech signal better than the fully connected Deep Neural Network (DNN) used in previous work, while using only a fraction of its parameters. We show that both an LCN and a CNN can reduce the total model footprint to 30% of that of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when the number of parameters is matched to the baseline, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size, but with increased computation.
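The footprint argument in the abstract rests on how the three layer types use weights. The sketch below is illustrative only: it counts the parameters of a single fully connected, locally connected, and convolutional layer over the same time-frequency input. The input size (40 frequency bins by 41 frames), the patch size, the stride, and the layer widths are hypothetical values chosen for the example, not the configuration reported in the paper.

```python
def out_size(n, k, stride):
    """Number of patch positions along one axis (no padding)."""
    return (n - k) // stride + 1


def dense_params(in_h, in_w, units):
    """Fully connected: every unit has a weight for every input value, plus a bias."""
    return in_h * in_w * units + units


def conv_params(k_h, k_w, in_ch, filters):
    """Convolutional: one set of filters shared across all patch positions."""
    return k_h * k_w * in_ch * filters + filters


def local_params(in_h, in_w, k_h, k_w, in_ch, filters, stride=1):
    """Locally connected: same patches as a convolution, but each position
    has its own unshared filters and biases."""
    positions = out_size(in_h, k_h, stride) * out_size(in_w, k_w, stride)
    return positions * (k_h * k_w * in_ch * filters + filters)


# Hypothetical first-layer setup (assumed for illustration, not from the paper).
FREQ_BINS, FRAMES = 40, 41
dense = dense_params(FREQ_BINS, FRAMES, units=256)
local = local_params(FREQ_BINS, FRAMES, 10, 10, in_ch=1, filters=16, stride=10)
conv = conv_params(10, 10, in_ch=1, filters=16)

print(f"fully connected:   {dense:>8,d} parameters")
print(f"locally connected: {local:>8,d} parameters")
print(f"convolutional:     {conv:>8,d} parameters")
```

Under these assumed sizes the locally connected layer needs far fewer weights than the fully connected one because each unit sees only a small patch, and the convolutional layer needs fewer still because its filters are shared across positions. That sharing also means the filters are applied at every position, which is consistent with the abstract's note that the CNN matches model size at the cost of more computation.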
Pages: 1136-1140
Page count: 5