Context-aware text classification system to improve the quality of text: A detailed investigation and techniques

Cited by: 4
Authors
Saleem, Zeeshan [1]
Alhudhaif, Adi [2]
Qureshi, Kashif Naseer [1]
Jeon, Gwanggil [3]
Affiliations
[1] Bahria Univ, Dept Comp Sci, Islamabad, Pakistan
[2] Prince Sattam bin Abdulaziz Univ, Coll Comp Engn & Sci Al Kharj, Dept Comp Sci, Al Kharj, Saudi Arabia
[3] Incheon Natl Univ, Dept Embedded Syst Engn, Incheon, South Korea
Source
Keywords
accuracy; algorithm; classification; context-aware; data mining; dataset; methods; computer;
DOI
10.1002/cpe.6489
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Text classification is one of the most important tasks for extracting information from the Internet, and it depends on identifying the best text representation settings. As the volume of data on the World Wide Web grows, so does the significance of text classification, because understanding and classifying the digital data available on the Internet by hand requires enormous human effort. Text classification assigns a collection of text files to different classes. The text available on the Internet is unstructured, which makes it harder to understand and classify for useful purposes. This paper proposes a context-aware text classification system to improve text quality. We use a content-aware recommendation system to extract data from well-known news databases. Text preprocessing techniques such as tokenization, stemming, and stop-word removal are studied in detail. Unigram, bigram, and trigram attributes are also tested, and attribute selection methods are examined for their impact on the classification results. To carry out a detailed investigation, 11 versions of each dataset are created, which saves time in the experimentation process, and different preprocessing techniques are applied to them to understand the impact of each technique on the classification results. The proposed system is compared with the existing approach in terms of accuracy, and the proposed system achieves better performance.
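The preprocessing steps named in the abstract (tokenization, stop-word removal, stemming, and unigram/bigram/trigram attributes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tiny stop-word list, and the crude suffix-stripping stemmer are all assumptions made for illustration, and a real system would more likely use an established stemmer such as Porter's.

```python
import re
from typing import List

# Illustrative stop-word list (an assumption; real lists are far longer).
STOP_WORDS = {"the", "is", "are", "of", "a", "an", "and", "to", "in", "on"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token: str) -> str:
    """Strip one common suffix; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are not destroyed.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return contiguous n-grams (n=1 unigram, n=2 bigram, n=3 trigram)."""
    return [" ".join(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(text: str) -> List[str]:
    """Tokenize, remove stop words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    terms = preprocess("Classifying the news articles")
    print(terms)
    print(ngrams(terms, 2))
```

Switching individual steps on or off in `preprocess` is one way to produce several preprocessed versions of the same dataset and compare their effect on a classifier, which is the kind of comparison the abstract describes.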
Pages: 20
Related papers
50 in total
  • [41] Using IR techniques to improve automated text classification
    Gonçalves, T
    Quaresma, P
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, 2004, 3136 : 374 - 379
  • [42] Context-Aware Sentiment Classification
    Kasthuriarachchy, Buddhika H.
    de Zoysa, Kasun
    Premarathne, H. L.
    2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer), 2015, : 276 - 276
  • [43] Context-Aware Query Classification
    Cao, Huanhuan
    Hu, Derek Hao
    Shen, Dou
    Jiang, Daxin
    Sun, Jian-Tao
    Chen, Enhong
    Yang, Qiang
    PROCEEDINGS 32ND ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2009, : 3 - 10
  • [44] CONTEXT-AWARE WEARABLE SYSTEM FOR ANIMALS - AN EXPLORATION AND CLASSIFICATION
    Irshad, Sana
    Ahsan, Kamran
    Khan, Muhammad Abid
    Iqbal, Sarwat
    Hussain, Muhammad Azhar
    Shafiq, Farhan
    Emad, Shah Muhammad
    INTERNATIONAL JOURNAL ON INFORMATION TECHNOLOGIES AND SECURITY, 2021, 13 (03): : 3 - 14
  • [45] Context-aware relation enhancement and similarity reasoning for image-text retrieval
    Cui, Zheng
    Hu, Yongli
    Sun, Yanfeng
    Yin, Baocai
    IET COMPUTER VISION, 2024, 18 (05) : 652 - 665
  • [46] CBR Assisted Context-Aware Surface Realisation for Data-to-Text Generation
    Upadhyay, Ashish
    Massie, Stewart
    CASE-BASED REASONING RESEARCH AND DEVELOPMENT, ICCBR 2023, 2023, 14141 : 34 - 49
  • [47] A new context-aware measure for semantic distance using a taxonomy and a text corpus
    El Sayed, Ahmad
    Hacid, Hakim
    Zighed, Djamel
    IRI 2007: PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2007, : 279 - +
  • [48] TextSafety: Visual Text Vanishing via Hierarchical Context-Aware Interaction Reconstruction
    Dai, Pengwen
    Li, Jingyu
    Wu, Dayan
    Zheng, Peijia
    Cao, Xiaochun
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 1421 - 1433
  • [49] Data Scarcity: Methods to Improve the Quality of Text Classification
    Glaser, Ingo
    Sadegharmaki, Shabnam
    Komboz, Basil
    Matthes, Florian
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM), 2021, : 556 - 564
  • [50] Investigation of Context-aware System Using Activity Recognition
    Watanabe, Yuki
    Suzumura, Reiji
    Matsuno, Shogo
    Ohyama, Minoru
    2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (ICAIIC 2019), 2019, : 287 - 291