Speaker identification in emotional talking environments based on CSPHMM2s

Cited by: 24
Authors
Shahin, Ismail [1]
Affiliation
[1] Univ Sharjah, Dept Elect & Comp Engn, Sharjah, U Arab Emirates
Keywords
Emotional talking environments; Hidden Markov models; Second-order circular suprasegmental hidden Markov models; Speaker identification; Suprasegmental hidden Markov models; RECOGNITION; SPEECH
DOI
10.1016/j.engappai.2013.03.013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Speaker recognition systems perform almost ideally in neutral talking environments; however, these systems perform poorly in emotional talking environments. This research is devoted to enhancing the low performance of text-independent and emotion-dependent speaker identification in emotional talking environments by employing Second-Order Circular Suprasegmental Hidden Markov Models (CSPHMM2s) as classifiers. This work has been tested on our speech database, which is composed of 50 speakers talking in six different emotional states: neutral, angry, sad, happy, disgust, and fear. Our results show that the average speaker identification performance in these talking environments based on CSPHMM2s is 81.50%, with improvement rates of 5.61%, 3.39%, and 3.06% compared, respectively, to First-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM1s), Second-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM2s), and First-Order Circular Suprasegmental Hidden Markov Models (CSPHMM1s). Our results based on subjective evaluation by human judges fall within 2.26% of those obtained based on CSPHMM2s. © 2013 Elsevier Ltd. All rights reserved.
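The improvement rates quoted in the abstract can be related back to baseline accuracies with a few lines of arithmetic. The sketch below is illustrative only and not part of the paper: it assumes that "improvement rate" means a relative gain, (new - old) / old, and it reads the second rate as 3.39%. The names `implied_baseline`, `CSPHMM2_ACCURACY`, and `IMPROVEMENT_OVER` are hypothetical.

```python
# Assumption: "improvement rate" is a relative gain, (new - old) / old * 100.
# The abstract does not define the term, so treat the results as indicative.

CSPHMM2_ACCURACY = 81.50  # percent, average identification performance (abstract)

# Improvement rates of CSPHMM2s over each baseline model, in percent (abstract).
# The second value is read as 3.39 (assumption).
IMPROVEMENT_OVER = {
    "LTRSPHMM1s": 5.61,
    "LTRSPHMM2s": 3.39,
    "CSPHMM1s": 3.06,
}


def implied_baseline(new_accuracy: float, improvement_pct: float) -> float:
    """Return the baseline accuracy implied by a relative improvement rate."""
    return new_accuracy / (1.0 + improvement_pct / 100.0)


# Baseline accuracy implied for each model under the relative-gain assumption.
implied = {
    name: implied_baseline(CSPHMM2_ACCURACY, rate)
    for name, rate in IMPROVEMENT_OVER.items()
}

for name, accuracy in implied.items():
    print(f"{name}: about {accuracy:.2f}%")
```

Under this reading, the three baselines sit within a few percentage points of CSPHMM2s. If the rates were instead absolute differences in percentage points, the baselines would be obtained by simple subtraction.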
Pages: 1652-1659
Number of pages: 8
Related papers
50 in total
  • [21] Speaker Identification in Shouted Talking Environments Based on Novel Third-Order Hidden Markov Models
    Shahin, Ismail
    2014 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), VOLS 1-2, 2014, : 352 - 357
  • [22] Emirati-accented speaker identification in each of neutral and shouted talking environments
    Shahin I.
    Nassif A.B.
    Bahutair M.
    International Journal of Speech Technology, 2018, 21 (2) : 265 - 278
  • [23] A Novel RBFNN-CNN Model for Speaker Identification in Stressful Talking Environments
    Nassif, Ali Bou
    Alnazzawi, Noha
    Shahin, Ismail
    Salloum, Said A.
    Hindawi, Noor
    Lataifeh, Mohammed
    Elnagar, Ashraf
    APPLIED SCIENCES-BASEL, 2022, 12 (10):
  • [24] Text-Independent Emirati-Accented Speaker Identification in Emotional Talking Environment
    Shahin, Ismail
    2018 FIFTH HCT INFORMATION TECHNOLOGY TRENDS (ITT): EMERGING TECHNOLOGIES FOR ARTIFICIAL INTELLIGENCE, 2018, : 257 - 262
  • [25] CASA-based speaker identification using cascaded GMM-CNN classifier in noisy and emotional talking conditions
    Nassif, Ali Bou
    Shahin, Ismail
    Hamsa, Shibani
    Nemmour, Nawel
    Hirose, Keikichi
    APPLIED SOFT COMPUTING, 2021, 103 (103)
  • [26] Text-Independent Speaker Identification in Emotional Environments: A Classifier Fusion Approach
    Jawarkar, N. P.
    Holambe, R. S.
    Basu, T. K.
    FRONTIERS IN COMPUTER EDUCATION, 2012, 133 : 569 - +
  • [27] Novel third-order hidden Markov models for speaker identification in shouted talking environments
    Shahin, Ismail
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2014, 35 : 316 - 323
  • [28] Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation
    Li, Lincheng
    Wang, Suzhen
    Zhang, Zhimeng
    Ding, Yu
    Zheng, Yixing
    Yu, Xin
    Fan, Changjie
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 1911 - 1920
  • [29] Employing Second-Order Circular Suprasegmental Hidden Markov Models to Enhance Speaker Identification Performance in Shouted Talking Environments
    Shahin, Ismail
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2010,