A Novel Optimized Recurrent Network-Based Automatic System for Speech Emotion Identification

Cited by: 5

Authors:

Koppula, Neeraja [1]

Rao, Koppula Srinivas [2]

Nabi, Shaik Abdul [3]

Balaram, Allam [4]

Affiliations:

[1] Geetanjali Coll Engn & Technol, Dept Comp Sci & Engn, Hyderabad 501301, Telangana, India

[2] MLR Inst Technol, Dept Comp Sci & Engn, Hyderabad 500043, Telangana, India

[3] Sreyas Inst Engn & Technol, Dept Comp Sci & Engn, Hyderabad 500068, Telangana, India

[4] MLR Inst Technol, Dept Informat Technol, Hyderabad 500043, Telangana, India

Source:

WIRELESS PERSONAL COMMUNICATIONS | 2023 / Volume 128 / Issue 03

Keywords:

Firefly algorithm; Recurrent neural network; Speech recognition; Speech emotion identification; Speech signal; RECOGNITION;

DOI:

10.1007/s11277-022-10040-5

Chinese Library Classification (CLC) number:

TN [Electronic technology, communication technology];

Discipline classification code:

0809;

Abstract:

Speech is a unique characteristic of humans that expresses one's emotional viewpoint to others. Speech emotion recognition (SER) identifies the speaker's emotion from the speech signal. Nowadays, SER plays a vital role in real-time applications such as human-machine interfaces, lie detection, virtual reality, security, and audio mining. However, filtering the noise content and extracting the emotional features in SER is complex. Moreover, incorporating digital filters increases the cost and complexity of the system. Thus, a novel hybrid firefly-based recurrent neural speech recognition (FbRNSR) model was developed with preprocessing and feature analysis modules to classify human emotions based on the speech input. The features produced by the feature extraction module are used to train the model to classify the emotions as happy, sad, or average. Moreover, the incorporation of firefly fitness improves the classification rate. The presented model is implemented in Python, and the results are evaluated. The performance of the presented approach is analyzed using the confusion matrix. The designed model achieved a high true positive rate of 99.34%, true negative rate of 99.12%, false positive rate of 99.21%, and false negative rate of 99.07%. The designed model achieved 99.2% accuracy, 98.9% recall, and precision value for the speech signal dataset. Finally, the effectiveness and robustness of the proposed approach are demonstrated by comparing it with existing techniques. Hence, this method is applicable in various sectors, such as medicine and security, to identify people's emotional states.
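For readers unfamiliar with the confusion-matrix metrics named in the abstract, the following is a minimal Python sketch of how such rates are conventionally derived for a three-class problem using a one-vs-rest reduction. It is not the authors' code: the function names, the `LABELS` list, and the counts in `CONFUSION` are made up for illustration and do not reproduce the paper's dataset or results.

```python
# Illustrative only: hypothetical counts, not data from the paper.
LABELS = ["happy", "sad", "average"]

# Rows are true classes, columns are predicted classes (assumed layout).
CONFUSION = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 2, 48],
]


def one_vs_rest_counts(matrix, k):
    """Collapse a multiclass confusion matrix into TP, FP, FN, TN for class k."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                    # true class k, predicted as another class
    fp = sum(row[k] for row in matrix) - tp     # another class, predicted as k
    tn = sum(sum(row) for row in matrix) - tp - fn - fp
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """Standard rates computed from binary confusion counts."""
    return {
        "tpr": tp / (tp + fn),                  # true positive rate (recall)
        "tnr": tn / (tn + fp),                  # true negative rate (specificity)
        "fpr": fp / (fp + tn),                  # false positive rate
        "fnr": fn / (fn + tp),                  # false negative rate
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


per_class = {
    label: rates(*one_vs_rest_counts(CONFUSION, k))
    for k, label in enumerate(LABELS)
}

for label, values in per_class.items():
    print(label, {name: round(value, 4) for name, value in values.items()})
```

Under these standard definitions, the true positive and false negative rates of a class sum to 1, as do its true negative and false positive rates.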

Pages: 2217 - 2243

Page count: 27
