A focus module-based lightweight end-to-end CNN framework for voiceprint recognition

Cited by: 9

Authors:

Velayuthapandian, Karthikeyan [1]

Subramoniam, Suja Priyadharsini [2]

Affiliations:

[1] Mepco Schlenk Engn Coll, Dept Elect & Commun Engn, Sivakasi, Tamil Nadu, India

[2] Anna Univ Reg Campus, Dept Elect & Commun Engn, Tirunelveli, Tamil Nadu, India

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2023 / Vol. 17 / Issue 06

Keywords:

Speaker recognition; Deep neural network; Spectrogram; 1-D CNN; Focus module; SUPPORT VECTOR MACHINES; SPEAKER; SYSTEM;

DOI:

10.1007/s11760-023-02500-7

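Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808; 0809;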

Abstract:

The process of identifying a speaker from a sequence of time-series data is referred to as speaker identification. Convolutional neural networks (CNNs) and deep neural networks are the two types of neural networks that are used in the majority of modern experimental approaches. This work presents a CNN model for speaker identification using a jump-connected one-dimensional convolutional neural network (1-D CNN) with a focus module (FM). The 1-D convolutional layer integrated with FM is employed in the presented model for speaker characteristic extraction and lessens heterogeneity in the temporal and spatial domains, allowing for quicker layer processing. Furthermore, the layered CNN hopping interconnection is employed to overcome the connectivity glitches, and a solution based on softmax loss and smooth L1-norm combined regulation is presented to increase efficiency. The recommended network model was evaluated using the ELSDSR, TIMIT, NIST, 16,000 PCM, and experimental audio datasets. According to experimental data, the equal error rate (EER) of end-to-end CNN for voiceprint identification is 9.02% higher than baseline approaches. In experiments, our proposed speaker recognition (SR) model, which we refer to as the deep FM-1D CNN, had a high recognition accuracy of 99.21%. Moreover, the observations demonstrate that the proposed network model is more robust than other models.
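
The abstract reports results in terms of the equal error rate (EER). As a reading aid, the following is a minimal sketch of how an EER is commonly computed from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the score lists, and the simple threshold sweep are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from two lists of similarity scores.

    genuine:  scores for same-speaker trials (higher = more similar)
    impostor: scores for different-speaker trials
    Hypothetical helper for illustration; not taken from the paper.
    """
    # Candidate thresholds: every score that actually occurs.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, best_eer = None, None
    for t in thresholds:
        # False acceptance rate: impostor trials accepted at threshold t.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection rate: genuine trials rejected at threshold t.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores (made up): one genuine trial scores below one impostor trial.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))
```

A lower EER indicates better verification performance, since it is the error rate at the threshold where false acceptances and false rejections are balanced.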


Pages: 2817 - 2825

Page count: 9

