Experiments with hierarchical text classification

Cited by: 0
Authors
Granitzer, M [1]
Auer, P [1]
Affiliation
[1] Know Ctr, Div Knowledge Discovery, A-8010 Graz, Austria
Keywords
machine learning; supervised learning; hierarchical text classification; boosting; ranking performance
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper applies Boosting to hierarchical text classification, where the hierarchical structure is given as a directed acyclic graph, and compares the results to Support Vector Machines. Hierarchical classification is performed top-down, and in each node a flat classifier decides whether a document should be propagated further. BoosTexter, CentroidBooster and Support Vector Machines are used as flat classifiers, where CentroidBooster is an AdaBoost.MH-based alternative similar to BoosTexter. Experiments on the Reuters Corpus Volume 1 and the OHSUMED data set show that the F1-measure increases if the hierarchical structure of a data set is taken into account. Regarding time complexity, we show that, depending on the structure of a hierarchy, learning and classification time can be reduced. Besides these hard classification approaches, we also investigate the ranking performance of hierarchical classifiers. Ranking, which can be achieved by providing a meaningful score for each classification decision, is important in most practical settings. We investigate an approach that uses a sigmoid function to calculate a meaningful score, where parameter estimation is based on error bounds from computational learning theory.
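The top-down scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword-based scorers stand in for BoosTexter, CentroidBooster, or an SVM, the category names are invented, and the sigmoid parameters `a` and `b` are fixed by hand here, whereas the paper estimates them from error bounds. All function and variable names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Set

Doc = Dict[str, float]           # hypothetical bag-of-words: term -> weight
Scorer = Callable[[Doc], float]  # flat classifier returning a raw margin


def sigmoid(margin: float, a: float = 1.0, b: float = 0.0) -> float:
    """Map a raw margin to a score in (0, 1); a and b are assumed, not fitted."""
    return 1.0 / (1.0 + math.exp(-(a * margin + b)))


def classify_top_down(
    doc: Doc,
    roots: List[str],
    children: Dict[str, List[str]],
    scorers: Dict[str, Scorer],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Return {node: score} for every node that accepts the document.

    A node's children are evaluated only if the node itself accepts,
    so rejected subtrees are never scored. Each node of the DAG is
    evaluated at most once, even when it has several parents.
    """
    accepted: Dict[str, float] = {}
    rejected: Set[str] = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in accepted or node in rejected:
            continue
        score = sigmoid(scorers[node](doc))
        if score >= threshold:
            accepted[node] = score
            stack.extend(children.get(node, []))
        else:
            rejected.add(node)
    return accepted


def keyword_scorer(term: str) -> Scorer:
    """Toy stand-in for a trained flat classifier: positive margin if term occurs."""
    return lambda doc: doc.get(term, 0.0) - 0.5


# Invented hierarchy; "sponsorship" has two parents, which makes it a DAG.
children = {
    "sports": ["football", "tennis", "sponsorship"],
    "finance": ["markets", "sponsorship"],
}
scorers = {
    "sports": keyword_scorer("sport"),
    "football": keyword_scorer("football"),
    "tennis": keyword_scorer("tennis"),
    "finance": keyword_scorer("finance"),
    "markets": keyword_scorer("markets"),
    "sponsorship": keyword_scorer("sponsor"),
}

doc = {"sport": 1.0, "football": 1.0, "markets": 1.0}
accepted = classify_top_down(doc, ["sports", "finance"], children, scorers)

# Categories ordered by score, highest first, as a simple ranking.
ranking = sorted(accepted.items(), key=lambda item: item[1], reverse=True)
```

In this toy run the document mentions "markets", but the "markets" node is never evaluated because its only parent, "finance", rejects the document. This pruning is the mechanism by which a hierarchy can reduce classification time, and it also shows the cost: an error at an upper node blocks every category beneath it.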
Pages: 177-182
Page count: 6
Related papers
50 in total
  • [31] Hierarchical Text Classification with Reinforced Label Assignment
    Mao, Yuning
    Tian, Jingjing
    Han, Jiawei
    Ren, Xiang
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 445 - 455
  • [32] Text Classification Based on a Novel Bayesian Hierarchical Model
    Zhou, Shibin
    Li, Kan
    Liu, Yushu
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 218 - 221
  • [33] Class Hierarchical Structure-based Text Classification
    Chen, Xiaoyun
    Chen, Jinhua
    ADVANCES IN CIVIL ENGINEERING, PTS 1-6, 2011, 255-260 : 2233 - 2237
  • [34] HIERARCHICAL TEXT CLASSIFICATION USING CNNS WITH LOCAL APPROACHES
    Krendzelak, Milan
    Jakab, Frantisek
    COMPUTING AND INFORMATICS, 2020, 39 (05) : 907 - 924
  • [35] Hierarchical Comprehensive Context Modeling for Chinese Text Classification
    Liu, Jingang
    Xia, Chunhe
    Yan, Haihua
    Xie, Zhipu
    Sun, Jie
    IEEE ACCESS, 2019, 7 : 154546 - 154559
  • [36] Hierarchical Text Classification based on LDA and Domain Ontology
    An, Wei
    Liu, Qihua
    INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY II, PTS 1-4, 2013, 411-414 : 1112 - +
  • [37] An effective procedure for constructing a hierarchical text classification system
    Yoon, Y
    Lee, C
    Lee, GG
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2006, 57 (03) : 431 - 442
  • [38] A Text Classification Algorithm Based on Rocchio and Hierarchical Clustering
    Zeng, Anping
    Huang, Yongping
    ADVANCED INTELLIGENT COMPUTING, 2011, 6838 : 432 - +
  • [39] Hierarchical Multilabel Text Classification via Multitask Learning
    Yu, Yipeng
    Sun, Zixun
    Sun, Chi
    Liu, Wenqiang
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 1138 - 1143
  • [40] Text Learning and Hierarchical Feature Selection in Webpage Classification
    Peng, Xiaogang
    Ming, Zhong
    Wang, Haitao
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2008, 5139 : 452 - 459