Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition

Cited by: 0
Authors
Chen, Yu-hsin [1]
Lopez-Moreno, Ignacio [1]
Sainath, Tara N. [1]
Visontai, Mirko [1]
Alvarez, Raziel [1]
Parada, Carolina [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies better model the local time-frequency correlations of the speech signal while using only a fraction of the parameters of the fully connected Deep Neural Network (DNN) used in previous work. We show that both an LCN and a CNN can reduce the total model footprint to 30% of the size of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when the number of parameters is matched to the baseline, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative, without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size, but with increased computation.
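The footprint argument in the abstract can be illustrated with a rough parameter count for a single layer. The sketch below is not the configuration used in the paper: the 40 x 40 time-frequency input, the 10 x 10 patch, the stride, and the 16 filters are all hypothetical values chosen so that the arithmetic is easy to follow, and the function names are made up for this illustration.

```python
def out_size(n, k, stride):
    """Number of patch positions along one axis (no padding)."""
    return (n - k) // stride + 1


def fc_params(in_h, in_w, hidden):
    """Fully connected: every input value feeds every hidden unit, plus one bias per unit."""
    return in_h * in_w * hidden + hidden


def lcn_params(in_h, in_w, k_h, k_w, stride, filters):
    """Locally connected: each patch position has its own filters (no weight sharing)."""
    positions = out_size(in_h, k_h, stride) * out_size(in_w, k_w, stride)
    return positions * filters * (k_h * k_w + 1)


def cnn_params(k_h, k_w, filters):
    """Convolutional: one set of filters is shared across all patch positions."""
    return filters * (k_h * k_w + 1)


# Hypothetical sizes (assumptions, not taken from the paper).
fc = fc_params(40, 40, 256)               # 409856
lcn = lcn_params(40, 40, 10, 10, 10, 16)  # 25856 (4 x 4 positions, 256 output units)
cnn = cnn_params(10, 10, 16)              # 1616
print(fc, lcn, cnn)
```

With these made-up sizes, the locally connected layer produces the same 256 outputs as the fully connected layer from far fewer weights, and sharing weights across positions shrinks the count further. Matching a given parameter budget with a CNN therefore leaves room for more or larger filters, which is one way computation can grow at a fixed model size.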
Pages: 1136-1140
Page count: 5