Exploring deep learning approaches for Urdu text classification in product manufacturing

Cited by: 34
Authors
Akhter, Muhammad Pervez [1]
Jiangbin, Zheng [1]
Naqvi, Irfan Raza [1]
Abdelmajeed, Mohammed [2]
Fayyaz, Muhammad [3]
Affiliations
[1] Northwestern Polytech Univ, Sch Software & Microelect, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[3] COMSATS Univ Islamabad, Dept Comp Sci, Wah Campus, Wah Cantt, Pakistan
Funding
National Natural Science Foundation of China
Keywords
Text classification; deep learning; convolutional neural network; long short-term memory; text mining; machine learning; LSTM;
DOI
10.1080/17517575.2020.1755455
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Over the last decade, machine learning (ML) techniques have been used for Urdu text processing. Owing to a lack of language resources, the potential of deep learning (DL) models has not yet been exploited for Urdu text document classification. A text document contains more noise, more redundant information, and a larger vocabulary than short texts such as tweets. This study presents a systematic comparison of four well-known DL models, compares them with four ML models, and explores various text preprocessing techniques. Experimental results show that the CNN outperforms the other models. Further, single-layer LSTM and BiLSTM architectures perform better than their multi-layer counterparts.
Pages: 223-248
Page count: 26