Context-aware text classification system to improve the quality of text: A detailed investigation and techniques

Cited by: 4
Authors
Saleem, Zeeshan [1]
Alhudhaif, Adi [2]
Qureshi, Kashif Naseer [1]
Jeon, Gwanggil [3]
Affiliations
[1] Bahria Univ, Dept Comp Sci, Islamabad, Pakistan
[2] Prince Sattam bin Abdulaziz Univ, Coll Comp Engn & Sci Al Kharj, Dept Comp Sci, Al Kharj, Saudi Arabia
[3] Incheon Natl Univ, Dept Embedded Syst Engn, Incheon, South Korea
Keywords
accuracy; algorithm; classification; context-aware; data mining; dataset; methods; computer;
DOI
10.1002/cpe.6489
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Text classification is one of the most important tasks for extracting information from the Internet and for identifying the best text representation settings. As the volume of data on the World Wide Web grows, so does the significance of text classification, because understanding and classifying the digital data available on the Internet by hand requires enormous human effort. Text classification is the task of assigning a set of text files to different classes. The text available on the Internet is unstructured, which makes it harder to understand and classify for useful purposes. This paper proposes a context-aware text classification system to improve text quality. We use a content-aware recommendation system to extract data from well-known news databases. Text preprocessing techniques such as tokenization, stemming, and stop-word removal are studied in detail. Unigram, bigram, and trigram attributes are also tested, and attribute selection methods are examined together with their impact on the classification results. To carry out a detailed investigation and to save time during experimentation, 11 versions of each dataset are created, and different preprocessing techniques are applied to them to understand the impact of each technique on the classification results. The proposed system is compared with an existing approach in terms of accuracy, and the proposed system achieves better performance.
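The preprocessing steps named in the abstract (tokenization, stop-word removal, stemming, and unigram/bigram/trigram attributes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the crude suffix-stripping `naive_stem` function, and all function names are assumptions made for illustration only, and a real study would more likely use an established stemmer.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"the", "a", "an", "of", "is", "are", "and", "to", "in", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=3):
    """Count unigram, bigram, and trigram attributes of the preprocessed text."""
    tokens = [naive_stem(t) for t in remove_stop_words(tokenize(text))]
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


if __name__ == "__main__":
    print(extract_features("The markets are rising in the morning"))
```

Switching individual steps on or off in a pipeline like this is one way to produce several preprocessed versions of the same dataset and compare their effect on a classifier.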
Pages: 20