Efficient speaker identification using spectral entropy

被引:0
|
作者
Fernando Luque-Suárez
Antonio Camarena-Ibarrola
Edgar Chávez
机构
[1] CICESE,
[2] Universidad Michoacana,undefined
来源
关键词
Speaker recognition; Speaker identification; Entropygrams;
D O I
暂无
中图分类号
学科分类号
摘要
In voice recognition, the two main problems are speech recognition (what was said), and speaker recognition (who was speaking). The usual method for speaker recognition is to postulate a model where the speaker identity corresponds to the parameters of the model, which estimation could be time-consuming when the number of candidate speakers is large. In this paper, we model the speaker as a high dimensional point cloud of entropy-based features, extracted from the speech signal. The method allows indexing, and hence it can manage large databases. We experimentally assessed the quality of the identification with a publicly available database formed by extracting audio from a collection of YouTube videos of 1,000 different speakers. With 20 second audio excerpts, we were able to identify a speaker with 97% accuracy when the recording environment is not controlled, and with 99% accuracy for controlled recording environments.
引用
收藏
页码:16803 / 16815
页数:12
相关论文
共 50 条
  • [31] Speaker identification using cepstral analysis
    Nazar, MN
    ISCON 2002: IEEE STUDENTS CONFERENCE ON EMERGING TECHNOLOGIES, PROCEEDINGS, 2002, : 139 - 143
  • [32] Speaker identification using instantaneous frequencies
    Grimaldi, Marco
    Cummins, Fred
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (06): : 1097 - 1111
  • [33] Using cohorts to improve speaker identification
    Mashao, DJ
    8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XIII, PROCEEDINGS: INDUSTRIAL SYSTEMS, 2004, : 261 - 266
  • [34] Wavelet entropy and neural network for text-independent speaker identification
    Daqrouq, Khaled
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2011, 24 (05) : 796 - 802
  • [35] Speaker Identification using Neural Networks
    Pawar, R. V.
    Kajave, P. P.
    Mali, S. N.
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 7, 2005, 7 : 429 - 433
  • [36] Speaker Identification Using Bagging Techniques
    Indumathi, A.
    Chandra, E.
    2015 INTERNATIONAL CONFERENCE ON COMPUTERS, COMMUNICATIONS, AND SYSTEMS (ICCCS), 2015, : 223 - 229
  • [37] Speaker Identification using Whispered Speech
    Jawarkar, Naresh P.
    Holambe, Raghunath S.
    Basu, Tapan Kumar
    2013 INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT 2013), 2013, : 778 - 781
  • [38] Bayesian Spectral Decomposition for Efficient Modal Identification Using Ambient Vibration
    Feng, Zhouquan
    Zhang, Jiren
    Katafygiotis, Lambros
    Hua, Xugang
    Chen, Zhengqing
    STRUCTURAL CONTROL & HEALTH MONITORING, 2024, 2024
  • [39] Speaker Modeling Using Emotional Speech for More Robust Speaker Identification
    M. Milošević
    Ž. Nedeljković
    U. Glavitsch
    Ž. Đurović
    Journal of Communications Technology and Electronics, 2019, 64 : 1256 - 1265
  • [40] A Novel Speech Enhancement Method Using Fourier Series Decomposition and Spectral Subtraction for Robust Speaker Identification
    Siam, Ali, I
    El-khobby, Heba A.
    Abd Elnaby, Mustafa M.
    Abdelkader, Hatem S.
    Abd El-Samie, Fathi E.
    WIRELESS PERSONAL COMMUNICATIONS, 2019, 108 (02) : 1055 - 1068