Inductive Model Generation for Text Classification Using a Bipartite Heterogeneous Network

Cited by: 17
Authors
Rossi, Rafael Geraldeli [1]
Lopes, Alneu de Andrade [1]
Faleiros, Thiago de Paulo [1]
Rezende, Solange Oliveira [1]
Affiliation
[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation, Brazil
Keywords
heterogeneous network; text classification; inductive model generation; classifiers
DOI
10.1007/s11390-014-1436-7
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Algorithms for numeric data classification have been applied to text classification. Text collections are usually represented with the vector space model, but the sparsity and high dimensionality of this representation can impair the quality of general-purpose classifiers. Networks can also be used to represent text collections; they avoid the high sparsity and make it possible to model relationships among the different objects that compose a text collection. Such network-based representations can improve the quality of the classification results. One of the simplest ways to represent a text collection as a network is a bipartite heterogeneous network, in which objects that represent documents are connected to objects that represent terms. Bipartite heterogeneous networks do not require computing similarities or relations among the objects and can be used to model any type of text collection. Given these advantages, in this article we present a text classifier that builds a classification model using the structure of a bipartite heterogeneous network. The algorithm, referred to as IMBHN (Inductive Model Based on Bipartite Heterogeneous Network), induces a classification model by assigning weights to the objects that represent terms for each class of the text collection. An empirical evaluation using a large number of text collections from different domains shows that the proposed IMBHN algorithm produces significantly better results than the k-NN, C4.5, SVM, and Naive Bayes algorithms.
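The idea in the abstract, documents linked to terms with a weight induced per term and per class, can be illustrated with a small sketch. The abstract does not state how the weights are learned, so the update rule below (a least-mean-squares style error correction) is an assumption chosen for illustration, not necessarily the published IMBHN procedure. The function names `build_bipartite`, `train_term_weights`, and `classify`, the parameters `rate` and `epochs`, and the toy documents are all hypothetical.

```python
from collections import Counter

def build_bipartite(docs):
    """One Counter per document: its edges to term objects, weighted by frequency."""
    return [Counter(text.lower().split()) for text in docs]

def train_term_weights(doc_edges, labels, rate=0.1, epochs=50):
    """Induce a weight for every (term, class) pair.

    Assumption: an LMS-style error-correction update, used here only as a sketch.
    """
    classes = sorted(set(labels))
    weights = {}
    for _ in range(epochs):
        for terms, label in zip(doc_edges, labels):
            for t in terms:
                weights.setdefault(t, {c: 0.0 for c in classes})
            for c in classes:
                # Class score of a document: frequency-weighted sum of its term weights.
                score = sum(f * weights[t][c] for t, f in terms.items())
                error = (1.0 if c == label else 0.0) - score
                # Move each connected term weight in the direction that reduces the error.
                for t, f in terms.items():
                    weights[t][c] += rate * f * error
    return weights, classes

def classify(text, weights, classes):
    """Label an unseen document using only the induced term weights."""
    terms = Counter(text.lower().split())
    scores = {
        c: sum(f * weights[t][c] for t, f in terms.items() if t in weights)
        for c in classes
    }
    return max(scores, key=scores.get)

# Toy collection (hypothetical data).
docs = ["goal match team", "team score goal", "stock market price", "price stock trade"]
labels = ["sports", "sports", "finance", "finance"]
weights, classes = train_term_weights(build_bipartite(docs), labels)
print(classify("goal team", weights, classes))  # sports
```

Because the model is just the table of term weights, new documents can be classified without rebuilding the network, which is the sense in which such a model is inductive.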
Pages: 361-375
Page count: 15