Inductive Model Generation for Text Classification Using a Bipartite Heterogeneous Network

Cited by: 17
Authors
Rossi, Rafael Geraldeli [1]
Lopes, Alneu de Andrade [1]
Faleiros, Thiago de Paulo [1]
Rezende, Solange Oliveira [1]
Affiliations
[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
heterogeneous network; text classification; inductive model generation; classifiers
DOI
10.1007/s11390-014-1436-7
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Algorithms for numeric data classification have been applied to text classification. Usually the vector space model is used to represent text collections. The characteristics of this representation, such as sparsity and high dimensionality, sometimes impair the quality of general-purpose classifiers. Networks can be used to represent text collections, avoiding the high sparsity and allowing relationships among the different objects that compose a text collection to be modeled. Such network-based representations can improve the quality of the classification results. One of the simplest ways to represent textual collections by a network is through a bipartite heterogeneous network, which is composed of objects that represent the documents connected to objects that represent the terms. Heterogeneous bipartite networks do not require computation of similarities or relations among the objects and can be used to model any type of text collection. Given these advantages, in this article we present a text classifier that builds a classification model using the structure of a bipartite heterogeneous network. The algorithm, referred to as IMBHN (Inductive Model Based on Bipartite Heterogeneous Network), induces a classification model by assigning weights to the objects that represent the terms for each class of the text collection. An empirical evaluation using a large number of text collections from different domains shows that the proposed IMBHN algorithm produces significantly better results than the k-NN, C4.5, SVM, and Naive Bayes algorithms.
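The idea in the abstract, a document-term bipartite network whose term objects carry one weight per class, can be illustrated with a small Python sketch. This is not the authors' IMBHN implementation: the abstract does not give the update rule, so the error-correction step below, the function names, the learning rate, and the epoch count are all assumptions chosen only to show how per-class term weights might be induced from document-term edges.

```python
from collections import Counter

def build_bipartite_network(docs):
    """One edge map per document: {term: frequency}. Documents link only to terms."""
    return [dict(Counter(tokens)) for tokens in docs]

def class_scores(doc_edges, weights, classes):
    """Sum each term's class weight, scaled by the document-term edge weight."""
    scores = {c: 0.0 for c in classes}
    for term, freq in doc_edges.items():
        term_weights = weights.get(term)
        if term_weights is None:
            continue  # term never seen in training contributes nothing
        for c in classes:
            scores[c] += freq * term_weights[c]
    return scores

def predict(doc_edges, weights, classes):
    """Pick the class with the highest score (ties go to the first class)."""
    scores = class_scores(doc_edges, weights, classes)
    return max(classes, key=lambda c: scores[c])

def train_term_weights(edges, labels, classes, lr=0.1, epochs=50):
    """Assumed error-correction rule: adjust term weights only on mistakes."""
    weights = {}
    for _ in range(epochs):
        for doc_edges, label in zip(edges, labels):
            predicted = predict(doc_edges, weights, classes)
            for c in classes:
                # error is +1 for a missed true class, -1 for a wrongly chosen class
                error = (1.0 if c == label else 0.0) - (1.0 if c == predicted else 0.0)
                if error == 0.0:
                    continue
                for term, freq in doc_edges.items():
                    term_entry = weights.setdefault(term, {k: 0.0 for k in classes})
                    term_entry[c] += lr * freq * error
    return weights

if __name__ == "__main__":
    docs = [["ball", "goal", "team"], ["goal", "match", "team"],
            ["stock", "market", "bank"], ["bank", "market", "trade"]]
    labels = ["sport", "sport", "finance", "finance"]
    classes = sorted(set(labels))
    edges = build_bipartite_network(docs)
    model = train_term_weights(edges, labels, classes)
    print(predict({"goal": 1, "team": 1}, model, classes))
```

Because the learned model is just a table of term weights per class, a new document can be classified from its own term edges alone, which is what makes the model inductive in the sense the abstract describes.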
Pages: 361-375
Page count: 15
Related papers
50 in total
  • [41] Multi-dimensional LSTM: A Model of Network Text Classification
    Wu, Weixin
    Liu, Xiaotong
    Shi, Leyi
    Liu, Yihao
    Song, Yuxiao
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT III, 2021, 12939 : 209 - 217
  • [42] Text Classification of Network Pyramid Scheme based on Topic Model
    Mu, Pengyu
    He, Jingsha
    Zhu, Nafei
    NLPIR 2019: 2019 3RD INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, 2019, : 15 - 19
  • [43] An Integration Model Based on Graph Convolutional Network for Text Classification
    Tang, Hengliang
    Mi, Yuan
    Xue, Fei
    Cao, Yang
    IEEE ACCESS, 2020, 8 : 148865 - 148876
  • [44] Text Classification Research Based on Bert Model and Bayesian Network
    Liu, Songsong
    Tao, Haijun
    Feng, Shiling
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 5842 - 5846
  • [45] A Novel Graph Neural Network Based Model for Text Classification
    Xiong, Rui
    Zheng, Hongying
    Wang, Zongbing
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT VII, 2024, 15022 : 64 - 78
  • [46] Tensor residual graph convolutional network model for text classification
    Fan F.
    Lei X.
    Deng X.
    Nie X.
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2023, 51 (02): : 52 - 57
  • [47] Strategies for Molecular Classification of Asthma Using Bipartite Network Analysis of Cytokine Expression
    Pillai, Regina R.
    Divekar, Rohit
    Brasier, Allan
    Bhavnani, Suresh
    Calhoun, William J.
    CURRENT ALLERGY AND ASTHMA REPORTS, 2012, 12 (05) : 388 - 395
  • [48] BERT-TriF: An inductive short text classification model for power equipment defect records
    Ye, Zhenhao
    Guo, Bingyan
    Qi, Donglian
    ENGINEERING REPORTS, 2023, 5 (10)
  • [49] Personal Recommendation Via Heterogeneous Diffusion on Bipartite Network
    Ju, Chunhua
    Xu, Chonghuan
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2014, 23 (03)
  • [50] Social Media Text Generation Based on Neural Network Model
    Cao, Jiarun
    Wang, Chongwen
    PROCEEDINGS OF 2018 THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE (CSAI 2018) / 2018 THE 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND MULTIMEDIA TECHNOLOGY (ICIMT 2018), 2018, : 58 - 61