Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition

Authors
Chen, Yu-hsin [1]
Lopez-Moreno, Ignacio [1]
Sainath, Tara N. [1]
Visontai, Mirko [1]
Alvarez, Raziel [1]
Parada, Carolina [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies model the local time-frequency correlations of the speech signal better, while using only a fraction of the parameters of the fully connected Deep Neural Network (DNN) used in previous work. We show that both an LCN and a CNN can reduce the total model footprint to 30% of that of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when matching parameters, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative over the baseline without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size, but with increased computation.
Pages: 1136-1140
Page count: 5
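The footprint argument in the abstract comes down to how many weights each kind of first layer needs. The sketch below is only an illustration of that counting, not the authors' configuration: the input size (40 frames by 48 frequency bins), the 8x8 patches, the 16 filters, and the 256 hidden units are all hypothetical values, and the function names are invented for this example. A fully connected layer ties every input to every hidden unit, a locally connected layer learns a separate small filter bank for each patch position, and a convolutional layer shares one filter bank across all positions.

```python
def output_positions(size, kernel, stride):
    """Number of patch positions along one axis (no padding)."""
    return (size - kernel) // stride + 1


def fc_params(in_t, in_f, hidden):
    """Fully connected layer: one weight per (input, hidden unit) pair, plus biases."""
    return in_t * in_f * hidden + hidden


def conv_params(kt, kf, n_filters):
    """Convolutional layer: one filter bank shared across all positions, plus biases."""
    return kt * kf * n_filters + n_filters


def lcn_params(in_t, in_f, kt, kf, stride_t, stride_f, n_filters):
    """Locally connected layer: a separate (unshared) filter bank per patch position."""
    positions = (output_positions(in_t, kt, stride_t)
                 * output_positions(in_f, kf, stride_f))
    return positions * (kt * kf * n_filters + n_filters)


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only; not taken from the paper.
    T, F = 40, 48          # input frames x frequency bins
    KT, KF = 8, 8          # patch size in time and frequency
    N_FILTERS = 16
    HIDDEN = 256

    print("fully connected:  ", fc_params(T, F, HIDDEN))
    print("locally connected:", lcn_params(T, F, KT, KF, KT, KF, N_FILTERS))
    print("convolutional:    ", conv_params(KT, KF, N_FILTERS))
```

With these assumed sizes the locally connected layer needs far fewer weights than the fully connected one, and the convolutional layer fewer still. The convolutional layer reuses its filters at every position, so its computation depends on the number of positions even though its parameter count does not, which is consistent with the abstract's note that the CNN matches model size at increased computation.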