Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition

Cited by: 0

Authors:

Chen, Yu-hsin [1]

Lopez-Moreno, Ignacio [1]

Sainath, Tara N. [1]

Visontai, Mirko [1]

Alvarez, Raziel [1]

Parada, Carolina [1]

Affiliation:

[1] Google Inc, Mountain View, CA 94043 USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies model the local time-frequency correlations of the speech signal better, while using only a fraction of the parameters of the fully connected Deep Neural Network (DNN) used in previous works. We show that both an LCN and a CNN can reduce the total model footprint to 30% of the size of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when the number of parameters is matched to the baseline, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative over the baseline without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size, but with increased computation.
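To make the parameter-count argument in the abstract concrete, here is a minimal Python sketch that counts the weights of a single fully connected, locally connected, and convolutional layer applied to one time-frequency input. The input size (40 x 40), patch size (10 x 10), stride, and number of units and filters are hypothetical values chosen for illustration only; they are not the configurations used in the paper, and the function names are likewise illustrative.

```python
def fully_connected_params(in_h, in_w, units):
    """Every unit connects to every input cell, plus one bias per unit."""
    return in_h * in_w * units + units


def _out_size(size, patch, stride):
    """Number of patch positions along one axis (no padding)."""
    return (size - patch) // stride + 1


def locally_connected_params(in_h, in_w, patch_h, patch_w, stride, filters):
    """Each output position owns its own filter weights and bias (no sharing)."""
    out_h = _out_size(in_h, patch_h, stride)
    out_w = _out_size(in_w, patch_w, stride)
    return out_h * out_w * filters * (patch_h * patch_w + 1)


def conv_params(patch_h, patch_w, filters):
    """Filter weights are shared across all positions, so position count drops out."""
    return filters * (patch_h * patch_w + 1)


# Hypothetical sizes, for illustration only (not taken from the paper).
fc = fully_connected_params(40, 40, units=256)
lc = locally_connected_params(40, 40, 10, 10, stride=10, filters=16)
cv = conv_params(10, 10, filters=16)

print(f"fully connected:   {fc}")
print(f"locally connected: {lc}")
print(f"convolutional:     {cv}")
```

Under these assumed sizes the locally connected layer needs far fewer weights than the fully connected one because each unit only sees a local patch, and the convolutional layer needs fewer still because its filters are shared across positions. Weight sharing does not reduce the number of multiply-adds per output position, which is consistent with the abstract's remark that a CNN matched in size to the baseline costs more computation.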


Pages: 1136-1140

Page count: 5
