Children's Sentiment Analysis From Texts by Using Weight Updated Tuned With Random Forest Classification

Cited by: 3
Authors
Ahmed Bilal, Azhar [1 ,2 ]
Ayhan Erdem, O. [3 ]
Toklu, Sinan [3 ]
Affiliations
[1] Gazi Univ, Grad Sch Nat & Appl Sci, Dept Comp Engn, TR-06560 Ankara, Turkiye
[2] Kirkuk Univ, Coll Comp Sci & Informat Technol, Kirkuk 36001, Iraq
[3] Gazi Univ, Fac Technol, Dept Comp Engn, TR-06560 Ankara, Turkiye
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Long short term memory (deep LSTM); natural language processing (NLP); principal component analysis (PCA); sentiment analysis (SA); singular value decomposition (SVD); MODEL;
DOI
10.1109/ACCESS.2024.3400992
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Sentiment analysis is a computational approach to identifying and assessing people's emotions from text documents. Various tools and methods based on machine learning and deep learning have been adopted to detect positive and negative emotions in text data. Experiments have shown that the accuracy of existing text classification models such as Bi-LSTM, Decision Tree, and Ensemble Classifiers is limited by poor-quality data, inappropriate hyperparameter tuning, and model-specific bias. These models are also prone to overfitting, high computational overhead, and long training times. To overcome these limitations, we propose a hybrid binary classification framework that combines deep sequential features with the Random Forest (RF) technique. The approach was implemented in four phases. First, data preprocessing was performed using the VADER sentiment package. Second, a deep Long Short-Term Memory (LSTM) model was employed to extract deep sequential features corresponding to sad and happy emotions. Third, a bi-orthogonalization algorithm with Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) was employed to minimize the redundancy and maximize the relevance of the extracted features. Finally, the Random Forest (RF) algorithm was used with five-fold cross-validation to discriminate sad and happy emotions. A grid search was used for hyperparameter tuning, and the results were compared with five baseline algorithms: Vanilla LSTM (VLSTM), Support Vector Machine (SVM), Gradient Boosting Machine (GBM), Naive Bayes (NB), and the AdaBoost Algorithm (ABA).
The experimental results showed that the proposed model achieved an accuracy of 99.631% on a dataset of 4000 stories, exceeding the five baseline methods by margins of 4.63%, 10.7%, 19.44%, 21%, and 56.5%, respectively. The proposed model also improved on other conventional performance metrics, such as precision, recall, specificity, and time complexity. Overall, the proposed model has potential applications in educational institutions, child psychology research, and child-friendly content moderation, helping to understand the emotions and experiences of children in the digital realm.
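The last two phases of the abstract (feature reduction, then a grid-searched Random Forest evaluated with five-fold cross-validation) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. The synthetic features stand in for the LSTM-derived features, chaining PCA and truncated SVD is only an assumed reading of the "bi-orthogonalization" step, and the component counts and parameter grid are hypothetical values chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Assumption: synthetic binary-labelled features stand in for the deep
# sequential features that the abstract says are extracted by an LSTM.
X, y = make_classification(
    n_samples=400, n_features=64, n_informative=10, random_state=0
)

# Assumption: PCA followed by truncated SVD as a simple stand-in for the
# feature-reduction phase; the component counts are illustrative only.
pipeline = Pipeline([
    ("pca", PCA(n_components=16, random_state=0)),
    ("svd", TruncatedSVD(n_components=8, random_state=0)),
    ("rf", RandomForestClassifier(random_state=0)),
])

# Hypothetical hyperparameter grid for the Random Forest step.
param_grid = {
    "rf__n_estimators": [50, 100],
    "rf__max_depth": [None, 5],
}

# Five-fold stratified cross-validation, used to score each grid point.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_  # mean cross-validated accuracy of best setting
print(best_params, round(best_score, 3))
```

Placing the reduction steps inside the pipeline means they are refit on each training fold, so the held-out fold does not leak into the PCA or SVD projections.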
Pages: 70089 - 70104
Page count: 16
Related papers
50 in total
  • [21] Wine Quality Optimization Using Decision Tree Analysis and Random Forest Classification Techniques
    He, Zhiyu
    Yu, Yang
    Trela, Brent
    AMERICAN JOURNAL OF ENOLOGY AND VITICULTURE, 2010, 61 (03): : 434A - 434A
  • [22] China's Public Firms' Attitudes towards Environmental Protection Based on Sentiment Analysis and Random Forest Models
    Li, Cai
    Li, Luyu
    Zheng, Jiaqi
    Wang, Jizhi
    Yuan, Yi
    Lv, Zezhong
    Wei, Yinghao
    Han, Qihang
    Gao, Jiatong
    Liu, Wenhao
    SUSTAINABILITY, 2022, 14 (09)
  • [23] Predicting USCS soil classification from soil property variables using Random Forest
    Gambill, Daniel R.
    Wall, Wade A.
    Fulton, Andrew J.
    Howard, Heidi R.
    JOURNAL OF TERRAMECHANICS, 2016, 65 : 85 - 92
  • [24] Weighted Random Forest using Gaze Distributions Measured from Observers for Gender Classification
    Yamaguchi, Sayaka
    Nishiyama, Masashi
    Iwai, Yoshio
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019, : 273 - 280
  • [25] RETRACTION: Sentiment classification using harmony random forest and harmony gradient boosting machine (Retraction of Vol 24, Pg 7451, 2020)
    Sridharan, K.
    Komarasamy, G.
    SOFT COMPUTING, 2023, 27 (02) : 1217 - 1217
  • [26] Breast Cancer Classification with Random Forest Classifier with Feature Decomposition Using Principal Component Analysis
    Chudhey, Arshdeep Singh
    Goel, Mohak
    Singh, Mrityunjay
    ADVANCES IN DATA AND INFORMATION SCIENCES, 2022, 318 : 111 - 120
  • [27] Myocardial Scar Segmentation in LGE-MRI using Fractal Analysis and Random Forest Classification
    Kurzendorfer, Tanja
    Breininger, Katharina
    Steidl, Stefan
    Brost, Alexander
    Forman, Christoph
    Maier, Andreas
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 3168 - 3173
  • [28] Breast Cancer Classification with Random Forest Classifier with Feature Decomposition Using Principal Component Analysis
    Abd Manan, Nur Anis Syarafinaz
    Ahmad, Wan Amiza Amneera Wan
    Sulaiman, Nik Meriam Nik
    Mahmood, Noor Zalina
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON GREEN ENVIRONMENTAL ENGINEERING AND TECHNOLOGY (ICONGEET 2021), 2022, 214 : 385 - 389
  • [29] Potential risk genes for primary Sjogren's syndrome from a meta-analysis by linear regression and random forest classification
    Cerdo, Tomas
    Moral, Teresa Torres
    GENES & DISEASES, 2024, 11 (03)
  • [30] Detecting Forest Fires in Southwest China From Remote Sensing Nighttime Lights Using the Random Forest Classification Model
    Yu, Yuehan
    Liu, Lili
    Chang, Zhijian
    Li, Yuanqing
    Shi, Kaifang
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 10759 - 10769