Relational Turkish Text Classification Using Distant Supervised Entities and Relations

Cited by: 1
Authors
Okur, Halil Ibrahim [1, 2]
Tohma, Kadir [1]
Sertbas, Ahmet [2]
Affiliations
[1] Iskenderun Tech Univ, Fac Engn & Nat Sci, Dept Comp Engn, TR-31200 Hatay, Turkiye
[2] Istanbul Univ Cerrahpasa, Fac Engn, Dept Comp Engn, TR-34310 Istanbul, Turkiye
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024 / Vol. 79 / Issue 02
Keywords
Text classification; relation extraction; NER; distant supervision; deep learning; machine learning; MODEL;
DOI
10.32604/cmc.2024.050585
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Text classification, the automatic categorization of texts, is one of the foundational elements of natural language processing (NLP) applications. This study investigates how text classification performance can be improved by integrating entity-relation information obtained from the Wikidata knowledge base with BERT-based pre-trained Named Entity Recognition (NER) models. Addressing a significant challenge in NLP, the research evaluates the potential of entity and relational information for extracting deeper meaning from texts. The methodology comprises text preprocessing, entity detection, and the integration of relational information. Experiments on Turkish and English text datasets assess the performance of several classification algorithms, including Support Vector Machine, Logistic Regression, Deep Neural Network, and Convolutional Neural Network. The results indicate that integrating entity-relation information can significantly enhance algorithm performance in text classification tasks and offers new perspectives for information extraction and semantic analysis in NLP applications. The contributions of this work are the use of distant supervised entity-relation information in Turkish text classification, the development of a Turkish relational text classification approach, and the creation of a relational database. By demonstrating potential performance improvements from integrating distant supervised entity-relation information into Turkish text classification, this research aims to support the effectiveness of text-based artificial intelligence (AI) tools. It also contributes to the development of multilingual text classification systems by adding deeper meaning to text content, providing a valuable addition to current NLP studies and a reference point for future research.
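The entity-detection and relation-integration steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory dictionary `TOY_KB` is a hypothetical stand-in for Wikidata lookups, the dictionary match in `detect_entities` is a stand-in for a BERT-based NER model, and the function names and the `entity__relation__object` token format are assumptions made for illustration only.

```python
import re

# Hypothetical toy knowledge base standing in for Wikidata lookups.
# Keys are lowercased entity surface forms; values are (relation, object) pairs.
TOY_KB = {
    "ankara": [("capital_of", "Turkey")],
    "galatasaray": [("sport", "football"), ("country", "Turkey")],
}


def detect_entities(text, kb):
    """Stand-in for an NER model: return the tokens that match a KB entry."""
    tokens = re.findall(r"\w+", text.lower())
    return [tok for tok in tokens if tok in kb]


def augment_with_relations(text, kb):
    """Append one feature token per (entity, relation, object) triple."""
    extra = []
    for entity in detect_entities(text, kb):
        for relation, obj in kb[entity]:
            extra.append(f"{entity}__{relation}__{obj.lower()}")
    return " ".join([text] + extra)


augmented = augment_with_relations("Galatasaray won the match in Ankara", TOY_KB)
print(augmented)
```

The augmented string could then be passed to any bag-of-words or neural classifier in place of the raw text, so that the relation tokens act as additional features.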
Pages: 2209-2228
Page count: 20
Related papers
50 in total
  • [31] Automatic Bug Triage using Semi-Supervised Text Classification
    Xuan, Jifeng
    Jiang, He
    Ren, Zhilei
    Yan, Jun
    Luo, Zhongxuan
    22ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING & KNOWLEDGE ENGINEERING (SEKE 2010), 2010, : 209 - 214
  • [32] Semi-supervised text classification using positive and unlabeled data
    Yu, Shuang
    Zhou, Xueyuan
    Li, Chunping
    ADVANCES IN INTELLIGENT IT: ACTIVE MEDIA TECHNOLOGY 2006, 2006, 138 : 249 - 254
  • [33] TESC: An approach to TExt classification using Semi-supervised Clustering
    Zhang, Wen
    Tang, Xijin
    Yoshida, Taketoshi
    KNOWLEDGE-BASED SYSTEMS, 2015, 75 : 152 - 160
  • [34] Medical Entities Tagging Using Distant Learning
    Vivaldi, Jorge
    Rodriguez, Horacio
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 631 - 642
  • [35] Text Classification using Semi-supervised Approach for Multi Domain
    Deshmukh, Jyoti S.
    Tripathy, Amiya Kumar
    2017 INTERNATIONAL CONFERENCE ON NASCENT TECHNOLOGIES IN ENGINEERING (ICNTE-2017), 2017,
  • [36] A Framework for Extraction of Relations from Text using Relational Learning and Similarity Measures
    Vargas-Vera, Maria
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2015, 21 (11) : 1482 - 1495
  • [37] Distant relations: limits to relational contracting in domestic violence programmes
    Carson, Ed
    Chung, Donna
    Day, Andrew
    INTERNATIONAL JOURNAL OF PUBLIC SECTOR MANAGEMENT, 2012, 25 (02) : 103 - 117
  • [38] Inducing Implicit Relations from Text Using Distantly Supervised Deep Nets
    Glass, Michael
    Gliozzo, Alfio
    Hassanzadeh, Oktie
    Mihindukulasooriya, Nandana
    Rossiello, Gaetano
    SEMANTIC WEB - ISWC 2018, PT I, 2018, 11136 : 38 - 55
  • [39] Mining relational data from text: From strictly supervised to weakly supervised learning
    Zhang, Zhu
    INFORMATION SYSTEMS, 2008, 33 (03) : 300 - 314
  • [40] Multiple relations extraction among multiple entities in unstructured text
    Jin Liu
    Haoliang Ren
    Menglong Wu
    Jin Wang
    Hye-jin Kim
    Soft Computing, 2018, 22 : 4295 - 4305