A focus module-based lightweight end-to-end CNN framework for voiceprint recognition

Cited by: 9
Authors
Velayuthapandian, Karthikeyan [1]
Subramoniam, Suja Priyadharsini [2]
Affiliations
[1] Mepco Schlenk Engn Coll, Dept Elect & Commun Engn, Sivakasi, Tamil Nadu, India
[2] Anna Univ Reg Campus, Dept Elect & Commun Engn, Tirunelveli, Tamil Nadu, India
Keywords
Speaker recognition; Deep neural network; Spectrogram; 1-D CNN; Focus module; SUPPORT VECTOR MACHINES; SPEAKER; SYSTEM;
DOI
10.1007/s11760-023-02500-7
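Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes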
0808 ; 0809 ;
Abstract
The process of identifying a spokesperson from a collection of subsequent time series data is referred to as speaker identification. Convolutional neural networks (CNNs) and deep neural networks are the two types of neural networks that are used in the majority of modern experimental approaches. This work presents a CNN model for speaker identification using a jump-connected one-dimensional convolutional neural network (1-D CNN) with a focus module (FM). The 1-D convolutional layer integrated with FM is employed in the presented model for speaker characteristic extraction and lessens heterogeneity in the temporal and spatial domains, allowing for quicker layer processing. Furthermore, the layered CNN hopping interconnection is employed to overcome the connectivity glitches, and a solution based on softmax loss and smooth L1-norm combined regulation is presented to increase efficiency. The recommended network model was evaluated using the ELSDSR, TIMIT, NIST, 16,000 PCM, and experimental audio datasets. According to experimental data, the equal error rate (EER) of end-to-end CNN for voiceprint identification is 9.02% higher than baseline approaches. In experiments, our proposed speaker recognition (SR) model, which we refer to as the deep FM-1D CNN, had a high recognition accuracy of 99.21%. Moreover, the observations demonstrate that the proposed network model is more robust than other models.
Pages: 2817-2825
Number of pages: 9
Related papers
50 in total
  • [31] An end-to-end RNS CNN Accelerator
    Sakellariou, Vasilis
    Paliouras, Vassilis
    Kouretas, Ioannis
    Saleh, Hani
    Stouraitis, Thanos
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 75 - 79
  • [32] An End-to-End Deep Learning Framework for Wideband Signal Recognition
    Vagollari, Adela
    Hirschbeck, Martin
    Gerstacker, Wolfgang
    IEEE ACCESS, 2023, 11 : 52899 - 52922
  • [33] CNN-based End-to-End Learning for Lane Centering
    Ebu, Iffat Ara
    Islam, Fahmida
    Ball, John E.
    Goodin, Christopher T.
    AUTONOMOUS SYSTEMS:SENSORS, PROCESSING, AND SECURITY FOR GROUND, AIR, SEA, AND SPACE VEHICLES AND INFRASTRUCTURE 2024, 2024, 13052
  • [34] Use AF-CNN for End-to-End Fiber Vibration Signal Recognition
    Ruan, Saisai
    Mo, Jiaqing
    Xu, Liang
    Zhou, Gang
    Liu, Yajun
    Zhang, Xin
    IEEE ACCESS, 2021, 9 : 6713 - 6720
  • [35] Lightweight Transformer based end-to-end speech recognition with patch adaptive local dense synthesizer attention
    Tang, Peiyuan
    Li, Penghua
    Liu, Shengwei
    Xu, HaoXiang
    PROCEEDINGS OF THE 36TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC 2024, 2024, : 4978 - 4983
  • [36] An End-to-End Depression Recognition Method Based on EEGNet
    Liu, Bo
    Chang, Hongli
    Peng, Kang
    Wang, Xuenan
    FRONTIERS IN PSYCHIATRY, 2022, 13
  • [37] Exploring end-to-end framework towards Khasi speech recognition system
    Bronson Syiem
    L. Joyprakash Singh
    International Journal of Speech Technology, 2021, 24 : 419 - 424
  • [38] Exploring end-to-end framework towards Khasi speech recognition system
    Syiem, Bronson
    Singh, L. Joyprakash
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (02) : 419 - 424
  • [39] End-to-end Gesture Recognition Framework for the Identification of Allergic Rhinitis Symptoms
    Tzamalis, Pantelis
    Bardoutsos, Andreas
    Markantonatos, Dimitris
    Raptopoulos, Christoforos
    Nikoletseas, Sotiris
    Aggelides, Xenophon
    Papadopoulos, Nikos
    18TH ANNUAL INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING IN SENSOR SYSTEMS (DCOSS 2022), 2022, : 25 - 34
  • [40] Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding
    Zhen, Kai
    Sung, Jongmo
    Lee, Mi Suk
    Beack, Seungkwon
    Kim, Minje
    INTERSPEECH 2019, 2019, : 3396 - 3400