Inductive Model Generation for Text Classification Using a Bipartite Heterogeneous Network

Cited by: 17
Authors
Rossi, Rafael Geraldeli [1]
Lopes, Alneu de Andrade [1]
Faleiros, Thiago de Paulo [1]
Rezende, Solange Oliveira [1]
Affiliations
[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation, Brazil
Keywords
heterogeneous network; text classification; inductive model generation; classifiers
DOI
10.1007/s11390-014-1436-7
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Algorithms for numeric data classification have been applied to text classification. Text collections are usually represented with the vector space model, but the sparsity and high dimensionality of this representation can impair the quality of general-purpose classifiers. Networks can also represent text collections; they avoid the high sparsity and make it possible to model relationships among the different objects that compose a text collection. Such network-based representations can improve classification results. One of the simplest ways to represent a text collection as a network is a bipartite heterogeneous network, in which objects representing documents are connected to objects representing terms. Bipartite heterogeneous networks do not require computing similarities or relations among objects and can model any type of text collection. Given these advantages, this article presents a text classifier that builds its classification model from the structure of a bipartite heterogeneous network. The algorithm, referred to as IMBHN (Inductive Model Based on Bipartite Heterogeneous Network), induces a classification model by assigning weights to the objects that represent terms, for each class of the text collection. An empirical evaluation on a large number of text collections from different domains shows that IMBHN produces significantly better results than the k-NN, C4.5, SVM, and Naive Bayes algorithms.
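The idea in the abstract can be illustrated with a minimal Python sketch. It builds a document-term bipartite network from a toy collection and then induces one weight per term and class. The abstract does not state how the weights are learned, so the error-correction update used here is an assumption chosen for illustration, not the published IMBHN procedure. The function names (`build_bipartite_network`, `induce_term_weights`, `classify`), the learning rate, the epoch count, and the toy documents are all hypothetical.

```python
from collections import defaultdict


def build_bipartite_network(docs):
    """Return {doc_id: {term: frequency}}, i.e. weighted document-term edges."""
    edges = {}
    for doc_id, text in docs.items():
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        edges[doc_id] = dict(counts)
    return edges


def induce_term_weights(edges, labels, classes, rate=0.1, epochs=50):
    """Assign a weight to every (term, class) pair.

    ASSUMPTION: a simple error-correction (least-mean-squares style) update.
    The abstract only says that weights are assigned to term objects per class.
    """
    weights = defaultdict(lambda: {c: 0.0 for c in classes})
    for _ in range(epochs):
        for doc_id, terms in edges.items():
            for c in classes:
                # Score of the document for class c: weighted sum over its terms.
                score = sum(freq * weights[t][c] for t, freq in terms.items())
                target = 1.0 if labels[doc_id] == c else 0.0
                error = target - score
                # Push the weights of the connected terms toward the target.
                for t, freq in terms.items():
                    weights[t][c] += rate * freq * error
    return weights


def classify(text, weights, classes):
    """Label an unseen document using only the induced term weights."""
    counts = defaultdict(int)
    for term in text.lower().split():
        counts[term] += 1
    scores = {
        c: sum(f * weights[t][c] for t, f in counts.items() if t in weights)
        for c in classes
    }
    return max(scores, key=scores.get)


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "goal match team",
    "d2": "team player goal",
    "d3": "stock market price",
    "d4": "market price bank",
}
labels = {"d1": "sport", "d2": "sport", "d3": "finance", "d4": "finance"}
classes = ["sport", "finance"]

edges = build_bipartite_network(docs)
weights = induce_term_weights(edges, labels, classes)
```

Calling `classify("team goal", weights, classes)` then labels a new document without comparing it to any training document, which is the sense in which such a model is inductive: the network structure is only needed during training, and no document-to-document similarity is computed.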
Pages: 361-375
Number of pages: 15