Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition

Cited by: 0

Authors:

Chen, Yu-hsin [1]

Lopez-Moreno, Ignacio [1]

Sainath, Tara N. [1]

Visontai, Mirko [1]

Alvarez, Raziel [1]

Parada, Carolina [1]

Affiliations:

[1] Google Inc, Mountain View, CA 94043 USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies model the local time-frequency correlations of the speech signal better, while using only a fraction of the parameters of the fully connected Deep Neural Network (DNN) used in previous work. We show that both an LCN and a CNN can reduce the total model footprint to 30% of the size of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when the number of parameters is matched, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative over the baseline without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size, but with increased computation.
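
The parameter savings described in the abstract can be illustrated with a rough count for a single first layer. The sketch below is not taken from the paper: the input size (48 x 40 time-frequency patch), the 8 x 8 kernel, the stride, and the unit and filter counts are all hypothetical values chosen only to show how the three layer types scale. A fully connected layer has one weight per input per unit, a convolutional layer shares one small kernel across all positions, and a locally connected layer keeps a separate small kernel for each position.

```python
# Illustrative parameter counts for one layer of each type.
# All shapes below are hypothetical and NOT taken from the paper.

def dense_params(in_h: int, in_w: int, units: int) -> int:
    """Fully connected: every input connects to every unit, plus one bias per unit."""
    return in_h * in_w * units + units


def conv_params(k_h: int, k_w: int, filters: int, in_ch: int = 1) -> int:
    """Convolutional: one shared kernel per filter, plus one bias per filter."""
    return k_h * k_w * in_ch * filters + filters


def local_params(in_h: int, in_w: int, k_h: int, k_w: int, filters: int,
                 stride_h: int, stride_w: int, in_ch: int = 1) -> int:
    """Locally connected: an unshared kernel (and bias) at every output position."""
    out_h = (in_h - k_h) // stride_h + 1  # output positions along the first axis
    out_w = (in_w - k_w) // stride_w + 1  # output positions along the second axis
    per_position = k_h * k_w * in_ch * filters + filters
    return out_h * out_w * per_position


if __name__ == "__main__":
    dense = dense_params(48, 40, units=256)
    conv = conv_params(8, 8, filters=32)
    local = local_params(48, 40, 8, 8, filters=4, stride_h=8, stride_w=8)
    for name, count in [("dense", dense), ("conv", conv), ("local", local)]:
        print(f"{name:>5}: {count} parameters ({count / dense:.2%} of dense)")
```

Under these assumed shapes, both the convolutional and the locally connected layer need far fewer parameters than the dense layer. The convolutional layer, however, applies its kernel at every position, which is consistent with the abstract's remark that the CNN increases computation at matched model size.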

Pages: 1136-1140

Number of pages: 5
