Prominence features: Effective emotional features for speech emotion recognition

Cited by: 43
Authors
Jing, Shaoling [1]
Mao, Xia [1]
Chen, Lijiang [1]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Prominence features; Speech annotation; Consistency assessment; Speech emotion recognition; FUNDAMENTAL-FREQUENCY; PERCEIVED PROMINENCE; AGREEMENT;
DOI
10.1016/j.dsp.2017.10.016
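Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes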
0808 ; 0809 ;
Abstract
Emotion-related feature extraction is a challenging task in speech emotion recognition. Owing to the lack of discriminative acoustic features, classical approaches based on traditional acoustic features have not provided satisfactory performance. This research proposes a novel type of feature related to prominence, which, together with traditional acoustic features, is used to classify seven typical emotional states. To this end, the author group produced a Chinese Dual-mode Emotional Speech Database (CDESD), which contains additional prominence and paralinguistic annotation information. A consistency assessment algorithm is then presented to validate the reliability of the annotation information in this database. The results show that the annotation consistency on prominence exceeds 60% on average. Subsequently, this research analyzes the correlation between the prominence features and emotional states using a curve-fitting method. Prominence is found to be closely related to emotional states, to retain emotional information at the word level to the greatest possible extent, and to play an important role in emotional expression. Finally, the proposed prominence features are validated on CDESD through speaker-dependent and speaker-independent experiments with four commonly used classifiers. The results show that, compared with using only acoustic features, the combined features improve the average recognition rate by 6% in speaker-dependent experiments and by 6.2% in speaker-independent experiments. (C) 2017 Elsevier Inc. All rights reserved.
Pages: 216-231
Page count: 16
Related papers
50 in total
  • [21] Emotional speech recognition: Resources, features, and methods
    Ververidis, Dimitrios
    Kotropoulos, Constantine
    SPEECH COMMUNICATION, 2006, 48 (09) : 1162 - 1181
  • [22] Speech Databases, Speech Features, and Classifiers in Speech Emotion Recognition: A Review
    Dar, G. H. Mohmad
    Delhibabu, Radhakrishnan
    IEEE ACCESS, 2024, 12 : 151122 - 151152
  • [23] Effective Geometric Features for Human Emotion Recognition
    Saeed, Anwar
    Al-Hamadi, Ayoub
    Niese, Robert
    Elzobi, Moftah
    PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 623 - 627
  • [24] On the Correlation and Transferability of Features between Automatic Speech Recognition and Speech Emotion Recognition
    Fayek, Haytham M.
    Lech, Margaret
    Cavedon, Lawrence
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3618 - 3622
  • [25] Emotion Recognition in Speech Using MFCC and Wavelet Features
    Kishore, K. V. Krishna
    Satish, P. Krishna
    PROCEEDINGS OF THE 2013 3RD IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2013, : 842 - 847
  • [26] Speech Emotion Recognition Considering Local Dynamic Features
    Guan, Haotian
    Liu, Zhilei
    Wang, Longbiao
    Dang, Jianwu
    Yu, Ruiguo
    STUDIES ON SPEECH PRODUCTION, 2018, 10733 : 14 - 23
  • [27] Speech emotion recognition using nonlinear dynamics features
    Shahzadi, Ali
    Ahmadyfard, Alireza
    Harimi, Ali
    Yaghmaie, Khashayar
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2015, 23 : 2056 - 2073
  • [28] Speech Emotion Recognition Using Minimum Extracted Features
    Abdulsalam, Wisal Hashim
    Alhamdani, Rafah Shihab
    Abdullah, Mohammed Najm
    2018 1ST ANNUAL INTERNATIONAL CONFERENCE ON INFORMATION AND SCIENCES (AICIS 2018), 2018, : 58 - 61
  • [29] Amplitude Modulation Features for Emotion Recognition from Speech
    Alam, Md Jahangir
    Attabi, Yazid
    Dumouchel, Pierre
    Kenny, Patrick
    O'Shaughnessy, D.
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2419 - 2423
  • [30] Speech Emotion Recognition Using Magnitude and Phase Features
    Shankar D.R.
    Manjula R.B.
    Biradar R.C.
    SN Computer Science, 5 (5)