A DCRNN-based ensemble classifier for speech emotion recognition in Odia language

Authors
Monorama Swain
Bubai Maji
P. Kabisatpathy
Aurobinda Routray
Affiliations
[1] Silicon Institute of Technology, Department of Electronics and Communication Engineering
[2] CV Raman College of Engineering, Department of Electronics and Instrumentation
[3] Indian Institute of Technology, Department of Electrical Engineering
Source
Complex & Intelligent Systems | 2022 / Vol. 8
Keywords
Speech emotion recognition; Deep convolutional neural network; Bi-directional gated recurrent unit; Ensemble classifier
Abstract
The Odia language is an old Eastern Indo-Aryan language, spoken by 46.8 million people across India. We have designed an ensemble classifier based on a Deep Convolutional Recurrent Neural Network (DCRNN) for Speech Emotion Recognition (SER). This study presents a new approach to SER tasks, motivated by recent research on speech emotion recognition. First, we extract utterance-level log Mel-spectrograms and their first and second derivatives (static, delta, and delta-delta), represented together as 3-D log Mel-spectrograms. We use deep convolutional neural networks to extract deep features from the 3-D log Mel-spectrograms. A bi-directional gated recurrent unit network is then applied to capture long-term temporal dependencies across these features and produce an utterance-level emotion. Finally, we combine Softmax and Support Vector Machine classifiers in an ensemble to improve the final recognition rate. The proposed framework is trained and tested on the Odia (seven emotional states) and RAVDESS (eight emotional states) datasets. The experimental results show that the ensemble classifier performs better than a single classifier. The accuracies reached are 85.31% and 77.54% on the Odia and RAVDESS datasets, respectively, outperforming some state-of-the-art frameworks.
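The feature-stacking step described in the abstract (static, delta, and delta-delta channels) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `delta` and `stack_3d`, the regression window `width=2`, and the edge padding are assumptions, and the input is assumed to be an already computed log Mel-spectrogram of shape `(n_mels, n_frames)`.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based delta along the time axis (axis 1).

    Assumption: a standard delta-regression formula with edge padding;
    the paper's abstract does not specify how derivatives are computed.
    """
    n_frames = features.shape[1]
    # Repeat the edge frames so the output has the same number of frames.
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n : width + n + n_frames]
        behind = padded[:, width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_3d(log_mel: np.ndarray) -> np.ndarray:
    """Stack static, delta, and delta-delta into a (3, n_mels, n_frames) array."""
    d1 = delta(log_mel)   # first derivative over time
    d2 = delta(d1)        # second derivative over time
    return np.stack([log_mel, d1, d2], axis=0)


if __name__ == "__main__":
    # Synthetic stand-in for a log Mel-spectrogram: 40 Mel bands, 100 frames.
    rng = np.random.default_rng(0)
    fake_log_mel = rng.normal(size=(40, 100))
    print(stack_3d(fake_log_mel).shape)
```

The resulting three-channel array plays the role an RGB image would for a convolutional network, which is presumably why the abstract calls it a 3-D log Mel-spectrogram.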
Pages: 4237 - 4249
Number of pages: 12