Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition

Cited by: 0

Authors:

Chen, Yu-hsin [1]

Lopez-Moreno, Ignacio [1]

Sainath, Tara N. [1]

Visontai, Mirko [1]

Alvarez, Raziel [1]

Parada, Carolina [1]

Affiliations:

[1] Google Inc, Mountain View, CA 94043 USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This work compares the performance of deep Locally Connected Networks (LCN) and Convolutional Neural Networks (CNN) for text-dependent speaker recognition. These topologies model the local time-frequency correlations of the speech signal better while using only a fraction of the parameters of the fully connected Deep Neural Network (DNN) used in previous work. We show that both an LCN and a CNN can reduce the total model footprint to 30% of the size of a baseline fully connected DNN, with minimal impact on performance or latency. In addition, when the number of parameters is matched, the LCN improves speaker verification performance, as measured by equal error rate (EER), by 8% relative over the baseline without increasing model size or computation. Similarly, a CNN improves EER by 10% relative over the baseline for the same model size, but with increased computation.
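As a rough illustration of why locally connected and convolutional layers need fewer parameters than a fully connected layer, the sketch below counts weights and biases for one layer of each kind. All sizes (a 40 x 40 single-channel input, 10 x 10 non-overlapping patches, 16 filters, 256 hidden units) and all function names are assumptions chosen for illustration; they are not taken from the paper and do not reproduce its architectures or its 30% figure.

```python
# Illustrative parameter counts for three layer types.
# Every size below is a hypothetical example, not a value from the paper.

def fully_connected_params(n_in, n_out):
    """Dense layer: every input connects to every output, plus one bias per output."""
    return n_in * n_out + n_out

def conv_params(patch_h, patch_w, in_channels, n_filters):
    """Convolutional layer: one filter bank shared across all patch positions."""
    return patch_h * patch_w * in_channels * n_filters + n_filters

def locally_connected_params(out_h, out_w, patch_h, patch_w, in_channels, n_filters):
    """Locally connected layer: a separate, unshared filter bank at each output position."""
    per_position = patch_h * patch_w * in_channels * n_filters + n_filters
    return out_h * out_w * per_position

def output_size(in_size, patch, stride):
    """Number of patch positions along one axis, without padding."""
    return (in_size - patch) // stride + 1

# Assumed input: 40 frequency bins x 40 frames, one channel.
in_h, in_w, in_channels = 40, 40, 1
patch, stride, n_filters = 10, 10, 16

out_h = output_size(in_h, patch, stride)  # 4 positions along frequency
out_w = output_size(in_w, patch, stride)  # 4 positions along time

# 4 * 4 * 16 = 256 output units, so the dense layer is given 256 units to match.
n_hidden = out_h * out_w * n_filters

fc_count = fully_connected_params(in_h * in_w * in_channels, n_hidden)
lc_count = locally_connected_params(out_h, out_w, patch, patch, in_channels, n_filters)
conv_count = conv_params(patch, patch, in_channels, n_filters)

print("fully connected:  ", fc_count)
print("locally connected:", lc_count)
print("convolutional:    ", conv_count)
```

With these assumed sizes, all three layers produce 256 outputs, but the locally connected layer restricts each output to one patch and the convolutional layer additionally shares weights across patches, which is where the parameter savings come from.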


Pages: 1136-1140

Number of pages: 5

