Prominence features: Effective emotional features for speech emotion recognition

Cited by: 43

Authors:

Jing, Shaoling [1]

Mao, Xia [1]

Chen, Lijiang [1]

Affiliations:

[1] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China

Source:

DIGITAL SIGNAL PROCESSING | 2018 / Vol. 72

Funding:

National Natural Science Foundation of China;

Keywords:

Prominence features; Speech annotation; Consistency assessment; Speech emotion recognition; FUNDAMENTAL-FREQUENCY; PERCEIVED PROMINENCE; AGREEMENT;

DOI:

10.1016/j.dsp.2017.10.016

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Emotion-related feature extraction is a challenging task in speech emotion recognition. Because discriminative acoustic features are lacking, classical approaches based on traditional acoustic features have not provided satisfactory performance. This research proposes a novel type of feature related to prominence, which, together with traditional acoustic features, is used to classify seven typical emotional states. To this end, the authors produce a Chinese Dual-mode Emotional Speech Database (CDESD), which contains additional prominence and paralinguistic annotation information. A consistency assessment algorithm is then presented to validate the reliability of the annotation information in this database. The results show that the annotation consistency on prominence exceeds 60% on average. Subsequently, this research analyzes the correlation of the prominence features with emotional states using a curve fitting method. Prominence is found to be closely related to emotional states, to retain emotional information at the word level to the greatest possible extent, and to play an important role in emotional expression. Finally, the proposed prominence features are validated on CDESD through speaker-dependent and speaker-independent experiments with four commonly used classifiers. The results show that, compared with using only acoustic features, the combined features improve the average recognition rate by 6% in speaker-dependent experiments and by 6.2% in speaker-independent experiments. © 2017 Elsevier Inc. All rights reserved.
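
The abstract reports an average annotation consistency on prominence but does not describe the consistency assessment algorithm itself. As a rough illustration of what such a figure can mean, the sketch below computes mean pairwise percent agreement between annotators. This is an assumed stand-in, not the paper's algorithm; the function name `pairwise_agreement` and the binary word-level labels are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(annotations):
    """Mean fraction of items on which each pair of annotators agrees.

    `annotations` is a list with one label sequence per annotator;
    all sequences must cover the same items in the same order.
    """
    if len(annotations) < 2:
        raise ValueError("need at least two annotators")
    n_items = len(annotations[0])
    if n_items == 0 or any(len(a) != n_items for a in annotations):
        raise ValueError("all annotators must label the same non-empty item set")

    scores = []
    for a, b in combinations(annotations, 2):
        # Count items where this pair of annotators chose the same label.
        matches = sum(x == y for x, y in zip(a, b))
        scores.append(matches / n_items)
    return sum(scores) / len(scores)


# Hypothetical data: 3 annotators mark 5 words as prominent (1) or not (0).
labels = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
]
agreement = pairwise_agreement(labels)
print(f"mean pairwise agreement: {agreement:.3f}")
```

Raw percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's or Fleiss' kappa would be a common alternative.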


Pages: 216-231

Page count: 16

