A focus module-based lightweight end-to-end CNN framework for voiceprint recognition

Cited by: 9
Authors
Velayuthapandian, Karthikeyan [1]
Subramoniam, Suja Priyadharsini [2]
Affiliations
[1] Mepco Schlenk Engn Coll, Dept Elect & Commun Engn, Sivakasi, Tamil Nadu, India
[2] Anna Univ Reg Campus, Dept Elect & Commun Engn, Tirunelveli, Tamil Nadu, India
Keywords
Speaker recognition; Deep neural network; Spectrogram; 1-D CNN; Focus module; SUPPORT VECTOR MACHINES; SPEAKER; SYSTEM;
DOI
10.1007/s11760-023-02500-7
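Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code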
0808 ; 0809 ;
Abstract
Speaker identification is the process of identifying a speaker from a collection of sequential time-series data. Convolutional neural networks (CNNs) and deep neural networks are the two types of neural network used in most modern experimental approaches. This work presents a CNN model for speaker identification using a jump-connected one-dimensional convolutional neural network (1-D CNN) with a focus module (FM). The 1-D convolutional layer integrated with the FM is used for speaker characteristic extraction and lessens heterogeneity in the temporal and spatial domains, allowing for quicker layer processing. Furthermore, a layered CNN hopping interconnection is employed to overcome connectivity glitches, and a solution based on combined softmax loss and smooth L1-norm regulation is presented to increase efficiency. The proposed network model was evaluated on the ELSDSR, TIMIT, NIST, 16,000 PCM, and experimental audio datasets. According to the experimental data, the equal error rate (EER) of the end-to-end CNN for voiceprint identification is 9.02% higher than baseline approaches. In experiments, the proposed speaker recognition (SR) model, referred to as the deep FM-1D CNN, achieved a recognition accuracy of 99.21%. Moreover, the observations demonstrate that the proposed network model is more robust than other models.
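Note on the metric: the abstract reports an equal error rate (EER), the operating point at which the false-acceptance rate and the false-rejection rate of a verification system coincide. The following is a minimal sketch of how an EER could be estimated from trial scores. It is not the authors' evaluation code; the function name `equal_error_rate` and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER from similarity scores (higher = more likely same speaker).

    Hypothetical helper for illustration; returns (eer, threshold).
    """
    genuine = list(genuine_scores)
    impostor = list(impostor_scores)
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best = None
    # Try every observed score as a decision threshold.
    for t in sorted(set(genuine + impostor)):
        # False-acceptance rate: impostor trials scoring at or above t.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False-rejection rate: genuine trials scoring below t.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Made-up scores, not data from the paper.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, threshold)
```

With a finite set of trials the two error rates rarely match exactly, so the sketch reports the average of the two rates at the threshold where they are closest. Interpolating between neighbouring thresholds is a common refinement.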
Pages: 2817-2825
Number of pages: 9