Convolutional neural network vectors for speaker recognition

Authors
Soufiane Hourri
Nikola S. Nikolov
Jamal Kharroubi
Affiliations
[1] Laboratoire des Systèmes Intelligents et Applications
[2] Faculté des Sciences et Techniques
[3] Université Sidi Mohamed Ben Abdellah
[4] University of Limerick
Keywords
Speaker recognition; MFCC; Convolutional neural network; Restricted Boltzmann machine; Deep learning
DOI
Not available
Abstract
Deep learning models are now considered state of the art in many areas of pattern recognition. In speaker recognition, several architectures have been studied, including deep neural networks (DNNs), deep belief networks (DBNs), and restricted Boltzmann machines (RBMs), while convolutional neural networks (CNNs) remain the most widely used models in computer vision. CNNs have largely been confined to computer vision because their structure is designed for two-dimensional data. To overcome this limitation, we develop a customized CNN for speaker recognition. The goal of this paper is to propose a new approach for extracting speaker characteristics by constructing CNN filters linked to the speaker. We also propose new vectors for identifying speakers, which we call convVectors. Experiments were performed on a gender-dependent corpus (THUYG-20 SRE) under three noise conditions: clean, 9 dB, and 0 dB. We compared the proposed method with our baseline system and with state-of-the-art methods. The convVectors method was the most robust, improving on the baseline system by an average of 43% and achieving an equal error rate (EER) of 1.05%. This finding helps clarify how deep learning models can be adapted to the problem of speaker recognition.
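The abstract reports its headline result as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. As a minimal sketch of how such a figure is commonly computed, and not the authors' evaluation code, the function below sweeps a decision threshold over trial scores. The function name `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely the same speaker" are assumptions made for illustration.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumes higher scores indicate a more likely target (same-speaker) trial.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection: a target trial scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Hypothetical trial scores, not data from the paper.
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, impostors))
```

With only a handful of trials the two error curves may not cross exactly, so the sketch returns the midpoint at the closest threshold; evaluation toolkits typically interpolate between thresholds instead.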
Pages: 389-400
Page count: 11