TOWARD ROBUST SPEECH EMOTION RECOGNITION AND CLASSIFICATION USING NATURAL LANGUAGE PROCESSING WITH DEEP LEARNING MODEL

Cited by: 0

Authors:

Alahmari, Saad [1]

Al-shathry, Najla i. [2]

Eltahir, Majdy m. [3]

Alzaidi, Muhammad swaileh a. [4]

Alghamdi, Ayman ahmad [5]

Mahmud, Ahmed [6]

Affiliations:

[1] Northern Border Univ, Appl Coll, Dept Comp Sci, Ar Ar, Saudi Arabia

[2] Princess Nourah Bint Abdulrahman Univ, Arab Language Teaching Inst, Dept Language Preparat, POB 84428, Riyadh 11671, Saudi Arabia

[3] King Khalid Univ, Appl Coll Mahayil, Dept Informat Syst, Abha, Saudi Arabia

[4] King Saud Univ, Coll Language Sci, Dept English Language, POB 145111, Riyadh, Saudi Arabia

[5] Umm Al qura Univ, Arab Language Inst, Dept Arab Teaching, Mecca, Saudi Arabia

[6] Future Univ Egypt, Res Ctr, New Cairo 11835, Egypt

Source:

FRACTALS-COMPLEX GEOMETRY PATTERNS AND SCALING IN NATURE AND SOCIETY | 2025

Keywords:

Speech Emotion Recognition; Deep Learning; Fractal Seagull Optimization Algorithm; Feature Extraction

DOI:

10.1142/S0218348X25400225

Chinese Library Classification number:

O1 [Mathematics]

Discipline classification code:

0701; 070101

Abstract:

Speech Emotion Recognition (SER) plays a significant role in human-machine interaction applications. Over the last decade, many SER systems have been proposed. However, SER performance remains a challenge owing to noise, high system complexity and ineffective feature discrimination, which makes feature extraction critical. Deep Learning (DL)-based techniques have emerged as proficient solutions for SER due to their ability to learn from unlabeled data, their superior feature representation, and their capacity to handle larger datasets and complex features. Different DL techniques, such as Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) and Deep Neural Network (DNN), have been successfully applied to automated SER. This study proposes a Robust SER and Classification using Natural Language Processing with DL (RSERC-NLPDL) model, which aims to identify the emotions in speech signals. In the RSERC-NLPDL technique, pre-processing is first performed to transform the input speech signal into a valid format. The technique then extracts a set of features comprising Mel-Frequency Cepstral Coefficients (MFCCs), Zero-Crossing Rate (ZCR), Harmonics-to-Noise Ratio (HNR) and Teager Energy Operator (TEO). Next, feature selection is carried out using the Fractal Seagull Optimization Algorithm (FSOA). The Temporal Convolutional Autoencoder (TCAE) model is applied to identify speech emotions, and its hyperparameters are selected using the fractal Sand Cat Swarm Optimization (SCSO) algorithm. The RSERC-NLPDL method is evaluated by simulation on speech databases. The experimental analysis showed superior accuracy of 94.32% and 95.25% on the EMODB and RAVDESS datasets compared with other models across distinct measures.
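
Two of the features named in the abstract, ZCR and TEO, have short textbook definitions, so a minimal sketch may help make them concrete. This is not the authors' implementation: the function names, the per-frame interface, and the synthetic test tone are all assumptions made for illustration, and the paper's pre-processing and framing choices are not given in this record.

```python
import math
from typing import List, Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def teager_energy(frame: Sequence[float]) -> List[float]:
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The first and last samples have no two-sided neighbours, so the
    result is two elements shorter than the input frame.
    """
    return [
        frame[n] ** 2 - frame[n - 1] * frame[n + 1]
        for n in range(1, len(frame) - 1)
    ]


# Hypothetical test frame: a pure tone, not data from the paper.
amplitude, omega = 0.5, 0.3
tone = [amplitude * math.sin(omega * n) for n in range(200)]

zcr = zero_crossing_rate(tone)
teo = teager_energy(tone)
```

For a pure tone A·sin(ωn), the operator returns the constant A²·sin²(ω), which is why it is often read as a joint measure of amplitude and frequency; the ZCR of the same tone is close to ω/π.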

Pages: 15

50 records in total

[1] A Robust Deep Transfer Learning Model for Accurate Speech Emotion Classification
Akinpelu, Samson
Viriri, Serestina
ADVANCES IN VISUAL COMPUTING, ISVC 2022, PT II, 2022, 13599 : 419 - 430
[2] MULTI-CLASS AUTOMATED SPEECH LANGUAGE RECOGNITION USING NATURAL LANGUAGE PROCESSING WITH OPTIMAL DEEP LEARNING MODEL
Al-anazi, Reema g.
Alqahtani, Hamed
Alzaidi, Muhammad swaileh a.
Alanazi, Meshari h.
AL Sultan, Hanan
Alrowaily, Amal f.
Aljabri, Jawhara
Alqudah, Assal
FRACTALS-COMPLEX GEOMETRY PATTERNS AND SCALING IN NATURE AND SOCIETY, 2025,
[3] Speech Emotion Recognition Using Deep Learning LSTM for Tamil Language
Fernandes, Bennilo
Mannepalli, Kasiprasad
PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY, 2021, 29 (03) : 1915 - 1936
[4] Speech Emotion Recognition Using Deep Learning
Alagusundari, N.
Anuradha, R.
ARTIFICIAL INTELLIGENCE: THEORY AND APPLICATIONS, VOL 1, AITA 2023, 2024, 843 : 313 - 325
[5] Speech Emotion Recognition Using Deep Learning
Ahmed, Waqar
Riaz, Sana
Iftikhar, Khunsa
Konur, Savas
ARTIFICIAL INTELLIGENCE XL, AI 2023, 2023, 14381 : 191 - 197
[6] Speech Emotion Classification Using Deep Learning
Mishra, Siba Prasad
Warule, Pankaj
Deb, Suman
PROCEEDINGS OF 27TH INTERNATIONAL SYMPOSIUM ON FRONTIERS OF RESEARCH IN SPEECH AND MUSIC, FRSM 2023, 2024, 1455 : 19 - 31
[7] Speech Based Multiple Emotion Classification Model Using Deep Learning
Patneedi, Shakti Swaroop
Kumari, Nandini
ADVANCES IN COMPUTING AND DATA SCIENCES, PT I, 2021, 1440 : 648 - 659
[8] Speech emotion recognition and classification using hybrid deep CNN and BiLSTM model
Swami Mishra
Nehal Bhatnagar
Prakasam P
Sureshkumar T. R
Multimedia Tools and Applications, 2024, 83 : 37603 - 37620
[9] Speech emotion recognition and classification using hybrid deep CNN and BiLSTM model
Mishra, Swami
Bhatnagar, Nehal
Prakasam, P.
Sureshkumar, T. R.
MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (13) : 37603 - 37620
[10] Modeling a Novel Approach for Emotion Recognition Using Learning and Natural Language Processing
Lalitha, Lakshmi, V
Anguraj, Dinesh Kumar
ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (03)
