Children's Emotion Recognition from Spontaneous Speech Using a Reduced Set of Acoustic and Linguistic Features

Cited by: 9
Authors
Planet, Santiago [1 ]
Iriondo, Ignasi [1 ]
Affiliation
[1] Univ Ramon Llull, Barcelona 08022, Spain
Keywords
Emotion recognition; Spontaneous speech; Acoustic and linguistic features; Feature selection; Feature-level fusion; Speaker-independent
DOI
10.1007/s12559-012-9174-z
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The aim of this article is to classify children's affective states in a real-life, non-prototypical emotion recognition scenario. The framework is the same as that proposed in the Interspeech 2009 Emotion Challenge. We used a large set of acoustic features and five linguistic parameters based on the concept of emotional salience. Features were extracted from the spontaneous speech recordings of the FAU Aibo Corpus and their transcriptions. We used a wrapper method to reduce the acoustic feature set from 384 to 28 elements and feature-level fusion to merge it with the set of linguistic parameters. We studied three classification approaches: a Naive Bayes classifier, a support vector machine, and a logistic model tree. Results show that the linguistic features improve the performance of classifiers that use only acoustic features. Additionally, merging the linguistic features with the reduced acoustic set is more effective than working with the full dataset. The best performance is achieved with the logistic model tree and the reduced set of acoustic and linguistic features, which improves on the full dataset by 4.15% absolute (10.14% relative) and on the Naive Bayes classifier by 9.91% absolute (28.18% relative). Under the same conditions proposed in the Emotion Challenge, this simple scheme slightly outperforms a much more complex structure involving seven classifiers and a larger number of features.
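The pipeline summarized in the abstract (wrapper-based acoustic feature selection, then feature-level fusion with linguistic parameters, then classification) can be sketched as follows. This is a minimal illustration on synthetic stand-in data, not the FAU Aibo Corpus; the feature counts, the scikit-learn `SequentialFeatureSelector`, and the Gaussian Naive Bayes wrapper estimator are assumptions for the sketch (the paper's logistic model tree is a Weka classifier with no direct scikit-learn equivalent).

```python
# Sketch of the paper's scheme on synthetic stand-in data.
# All sizes are illustrative: the paper reduces 384 acoustic
# features to 28; here we select 8 to keep the demo fast.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Stand-ins: 384 "acoustic" features and 5 "linguistic" parameters.
X_acoustic, y = make_classification(
    n_samples=200, n_features=384, n_informative=20,
    n_classes=2, random_state=0)
X_linguistic = rng.normal(size=(200, 5))

# Wrapper feature selection: greedily keep the acoustic subset that
# maximizes cross-validated performance of the wrapped classifier.
selector = SequentialFeatureSelector(
    GaussianNB(), n_features_to_select=8, direction="forward", cv=3)
X_reduced = selector.fit_transform(X_acoustic, y)

# Feature-level fusion: concatenate the reduced acoustic subset
# with the linguistic parameters into a single feature vector.
X_fused = np.hstack([X_reduced, X_linguistic])

# Compare the three configurations the abstract discusses.
for name, X in [("full acoustic", X_acoustic),
                ("reduced acoustic", X_reduced),
                ("reduced + linguistic", X_fused)]:
    score = cross_val_score(GaussianNB(), X, y, cv=3).mean()
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

On real data the selection would be run speaker-independently (as in the Emotion Challenge protocol) and evaluated with unweighted average recall rather than plain accuracy.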
Pages: 526-532
Page count: 7
Related papers (50 total)
  • [31] Emotion Recognition from Spontaneous Slavic Speech
    Atassi, Hicham
    Smekal, Zdenek
    Esposito, Anna
    3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012), 2012, : 389 - 394
  • [32] Exploring the benefits of discretization of acoustic features for speech emotion recognition
    Vogt, Thurid
    Andre, Elisabeth
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 348 - 351
  • [33] Ensemble Learning of Hybrid Acoustic Features for Speech Emotion Recognition
    Zvarevashe, Kudakwashe
    Olugbara, Oludayo
    ALGORITHMS, 2020, 13 (03)
  • [34] Combining acoustic features for improved emotion recognition in Mandarin speech
    Pao, T. L.
    Chen, Y. T.
    Yeh, J. H.
    Liao, W. Y.
    AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION, PROCEEDINGS, 2005, 3784 : 279 - 285
  • [35] Speech Emotion Recognition using Combination of Features
    Zhang, Qingli
    An, Ning
    Wang, Kunxia
    Ren, Fuji
    Li, Lian
    PROCEEDINGS OF THE 2013 FOURTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2013, : 523 - 528
  • [36] Detecting Alzheimer's Disease using Interactional and Acoustic features from spontaneous speech
    Nasreen, Shamila
    Hough, Julian
    Purver, Matthew
    INTERSPEECH 2021, 2021, : 1962 - 1966
  • [37] Emotion Recognition using Acoustic and Lexical Features
    Rozgic, Viktor
    Ananthakrishnan, Sankaranarayanan
    Saleem, Shirin
    Kumar, Rohit
    Vembu, Aravind Namandi
    Prasad, Rohit
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 366 - 369
  • [38] An optimal two stage feature selection for speech emotion recognition using acoustic features
    Kuchibhotla, S.
    Vankayalapati, H. D.
    Anne, K. R.
    International Journal of Speech Technology, 2016, 19 (4) : 657 - 667
  • [39] Automatic Emotion Recognition in Compressed Speech Using Acoustic and Non-Linear Features
    Garcia, N.
    Vasquez-Correa, J. C.
    Arias-Londono, J. D.
    Vargas-Bonilla, J. F.
    Orozco-Arroyave, J. R.
    2015 20TH SYMPOSIUM ON SIGNAL PROCESSING, IMAGES AND COMPUTER VISION (STSIVA), 2015,
  • [40] ON THE USE OF SELF-SUPERVISED PRE-TRAINED ACOUSTIC AND LINGUISTIC FEATURES FOR CONTINUOUS SPEECH EMOTION RECOGNITION
    Macary, Manon
    Tahon, Marie
    Esteve, Yannick
    Rousseau, Anthony
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 373 - 380