Speech Databases, Speech Features, and Classifiers in Speech Emotion Recognition: A Review

Cited by: 0
Authors
Dar, G. H. Mohmad [1]
Delhibabu, Radhakrishnan [2]
Affiliations
[1] Vellore Inst Technol, Sch Adv Sci, Vellore 632014, Tamil Nadu, India
[2] Vellore Inst Technol, Sch Comp Sci & Engn, Vellore 632014, Tamil Nadu, India
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Speech emotion recognition; machine learning; deep learning; affective computing; support vector machine; random forest; Gaussian mixture model; audio features; databases; classifiers; COMMUNICATING EMOTION; INFORMATION FUSION; REPRESENTATIONS; IMPLEMENTATION; AUTOENCODER; GENERATION; DEPRESSION; EXPRESSION; NETWORKS; VALENCE;
DOI
10.1109/ACCESS.2024.3476960
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Emotion recognition from speech signals plays a crucial role in Human-Machine Interaction (HMI), particularly in the development of applications such as affective computing and interactive systems. This review provides an in-depth examination of current methodologies in speech emotion recognition (SER), with a focus on databases, feature extraction techniques, and classification models. Traditionally, SER has relied on low-level descriptors (LLDs) such as Mel-Frequency Cepstral Coefficients (MFCCs), linear predictive coding (LPC), and pitch-based features, paired with classifiers such as Support Vector Machines (SVM), Random Forests (RF), and Gaussian Mixture Models (GMM). The development of deep learning techniques has since changed the field substantially: models such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have been shown to capture the complex temporal and spectral features of speech more effectively. This paper reviews prominent speech emotion datasets, exploring their linguistic diversity, annotation processes, and emotional labels. It also analyzes the efficacy of different speech features and classifiers in handling challenges such as data imbalance, limited data availability, and cross-lingual variations. The review highlights the need for future work to address real-time processing, context-sensitive emotion detection, and the integration of multi-modal data to enhance the performance of SER systems. By consolidating recent advancements and identifying areas for further research, this paper aims to provide a clearer path for optimizing feature extraction and classification techniques in the field of emotion recognition.
Pages: 151122 - 151152
Page count: 31
Related papers
50 items in total
  • [1] Databases, features and classifiers for speech emotion recognition: a review
    Swain, Monorama
    Routray, Aurobinda
    Kabisatpathy, P.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (01) : 93 - 120
  • [2] Emotion Recognition from Speech by Combining Databases and Fusion of Classifiers
    Lefter, Iulia
    Rothkrantz, Leon J. M.
    Wiggers, Pascal
    van Leeuwen, David A.
    TEXT, SPEECH AND DIALOGUE, 2010, 6231 : 353 - +
  • [3] Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers
    Akcay, Mehmet Berkehan
    Oguz, Kaya
    SPEECH COMMUNICATION, 2020, 116 (116) : 56 - 76
  • [4] Analyzing the influence of different speech data corpora and speech features on speech emotion recognition: A review
    Rathi, Tarun
    Tripathy, Manoj
    SPEECH COMMUNICATION, 2024, 162
  • [5] Survey on speech emotion recognition: Features, classification schemes, and databases
    El Ayadi, Moataz
    Kamel, Mohamed S.
    Karray, Fakhri
    PATTERN RECOGNITION, 2011, 44 (03) : 572 - 587
  • [6] Emotion Recognition in Speech Using MFCC and Classifiers
    Ajitha, G.
    Prashanth, Addagatla
    Radhika, Chelle
    Chaitanya, Kancharapu
    COMPUTATIONAL VISION AND BIO-INSPIRED COMPUTING ( ICCVBIC 2021), 2022, 1420 : 197 - 207
  • [7] Speech Emotion Recognition Using Multiple Classifiers
    Wang, Kunxia
    Chu, Zongcheng
    Wang, Kai
    Yu, Tongqing
    Liu, Li
    WEB AND BIG DATA, 2017, 10612 : 84 - 93
  • [8] Review on speech emotion recognition
    Han, W.-J. (hanwenjing07@gmail.com), 1600, Chinese Academy of Sciences (25):
  • [9] A REVIEW ON SPEECH EMOTION FEATURES
    Zaidan, Noor Aina
    Salam, Md Sah Hj.
    JURNAL TEKNOLOGI, 2015, 75 (02) : 89 - 95
  • [10] A Perspective Study on Speech Emotion Recognition: Databases, Features and Classification Models
    Raghu, Kogila
    Sadanandam, Manchala
    TRAITEMENT DU SIGNAL, 2021, 38 (06) : 1861 - 1873