A Hybrid Attention-Based Transformer Model for Arabic News Classification Using Text Embedding and Deep Learning

Authors
Hossain, Md. Mithun [1]
Hossain, Md. Shakil [1]
Safran, Mejdl [2]
Alfarhood, Sultan [3]
Alfarhood, Meshal [3]
Mridha, M. F. [4]
Affiliations
[1] Bangladesh Univ Business & Technol, Dept Comp Sci & Engn, Dhaka 1216, Bangladesh
[2] King Saud Univ, Coll Comp & Informat Sci, Res Chair Online Dialogue & Cultural Commun, Dept Comp Sci, Riyadh 11543, Saudi Arabia
[3] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 11543, Saudi Arabia
[4] Amer Int Univ Bangladesh, Dept Comp Sci, Dhaka 1229, Bangladesh
Source
IEEE Access | 2024 / Vol. 12
Keywords
Deep learning; Accuracy; Analytical models; Text categorization; Transformers; Sentiment analysis; Data models; Tokenization; Predictive models; Syntactics; Hybrid transformer; Arabic text classifications; Arabic news classifications
DOI
10.1109/ACCESS.2024.3522061
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The rapid growth of online news has made accurate classification of Arabic news items increasingly important for information management and analysis. This paper proposes a hybrid Attention-Based Transformer Model (ABTM) for Arabic news categorization that combines deep learning with classical text representations to improve classification accuracy and interpretability. To handle the complexities of the Arabic language and enrich the dataset, we use a thorough preprocessing pipeline that includes text cleaning, tokenization, lemmatization, and data augmentation. We combine a bespoke attention embedder with classic TF-IDF and Bag-of-Words features to produce a comprehensive feature set that captures both the contextual and the statistical aspects of the text. We benchmark our technique against state-of-the-art Arabic language models, such as AraBERTv1-base and asafaya/bert-base-arabic. We use the LIME (Local Interpretable Model-agnostic Explanations) text explainer to offer insights into model predictions, improving the interpretability of our findings. Our results show that the ABTM approach considerably improves classification performance, with high accuracy and reasonable explanations for model decisions. The classification covers a wide range of news categories, including politics, sports, culture, the economy, and a variety of other themes, reflecting the diversity of Arabic news. This study contributes to Arabic natural language processing by offering a novel method that combines deep learning with traditional techniques, thereby advancing the state of Arabic news classification. Improved classification accuracy and interpretability support better management and understanding of the rich and growing body of Arabic news content, as well as informed decision-making and knowledge discovery.
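As a rough illustration of the classical half of the feature set described in the abstract, the sketch below concatenates Bag-of-Words counts with TF-IDF weights into one vector per document. It is not the authors' implementation: the function names, the whitespace tokenizer, and the smoothed IDF formula are assumptions made for this example, and the attention embedder, Arabic-specific preprocessing, and LIME explanations are not reproduced.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Map each distinct whitespace token to a column index (sorted for determinism)."""
    tokens = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(doc, vocab):
    """Bag-of-Words: raw term counts in vocabulary order."""
    counts = Counter(doc.split())
    return [float(counts.get(tok, 0)) for tok in vocab]


def idf_weights(docs, vocab):
    """Smoothed inverse document frequency (assumed form): ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc.split()))
    return [math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab]


def tfidf_vector(doc, vocab, idf):
    """TF-IDF: term counts scaled by the IDF weight of each term."""
    return [count * weight for count, weight in zip(bow_vector(doc, vocab), idf)]


def hybrid_features(docs):
    """Return (features, vocab); each feature row is BoW counts followed by TF-IDF weights."""
    vocab = build_vocab(docs)
    idf = idf_weights(docs, vocab)
    features = [bow_vector(d, vocab) + tfidf_vector(d, vocab, idf) for d in docs]
    return features, vocab


if __name__ == "__main__":
    # Hypothetical toy corpus; real input would be cleaned, lemmatized Arabic text.
    sample_docs = ["الفريق فاز بالمباراة", "البنك رفع سعر الفائدة"]
    rows, vocabulary = hybrid_features(sample_docs)
    print(len(vocabulary), len(rows[0]))
```

In a hybrid model of the kind the abstract describes, a vector like this would be joined with a learned contextual embedding before the classifier; how the paper fuses the two is not stated in this record.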
Pages: 198046-198066
Page count: 21