Experiments with hierarchical text classification

Authors
Granitzer, M [1]
Auer, P [1]
Affiliation
[1] Know Ctr, Div Knowledge Discovery, A-8010 Graz, Austria
Keywords
machine learning; supervised learning; hierarchical text classification; boosting; ranking performance;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper applies Boosting to hierarchical text classification, where the hierarchical structure is given as a directed acyclic graph, and compares the results to Support Vector Machines. Hierarchical classification is performed top-down, and in each node a flat classifier decides whether a document should be propagated further. As flat classifiers, BoosTexter, CentroidBooster and Support Vector Machines are used, where CentroidBooster is an AdaBoost.MH-based alternative similar to BoosTexter. Experiments on the Reuters Corpus Volume 1 and the OHSUMED data set show that the F1-measure increases if the hierarchical structure of a data set is taken into account. Regarding time complexity, we show that, depending on the structure of a hierarchy, learning and classification time can be reduced. Besides these hard classification approaches, we also investigate the ranking performance of hierarchical classifiers. Ranking, which can be achieved by providing a meaningful score for each classification decision, is important in most practical settings. We investigate an approach based on using a sigmoid function for calculating a meaningful score, where parameter estimation is based on error bounds from computational learning theory.
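The top-down scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the node classifiers are toy keyword tests standing in for BoosTexter, CentroidBooster or an SVM, the category names and the helper names (`classify_top_down`, `keyword_margin`) are hypothetical, and the sigmoid parameters `a` and `b` are fixed placeholders rather than values estimated from error bounds.

```python
import math
from typing import Callable, Dict, List

# A flat node classifier maps a document to a real-valued margin
# (positive = accept). This type is an assumption for illustration.
Classifier = Callable[[str], float]


def sigmoid(margin: float, a: float = 1.0, b: float = 0.0) -> float:
    """Map a raw margin to a score in (0, 1); a and b are placeholders."""
    return 1.0 / (1.0 + math.exp(-(a * margin + b)))


def classify_top_down(
    doc: str,
    roots: List[str],
    children: Dict[str, List[str]],
    classifiers: Dict[str, Classifier],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Propagate doc from the roots; descend only below accepting nodes."""
    accepted: Dict[str, float] = {}
    seen = set()
    frontier = list(roots)
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue  # in a DAG a node can be reached through several parents
        seen.add(node)
        score = sigmoid(classifiers[node](doc))
        if score >= threshold:
            accepted[node] = score
            frontier.extend(children.get(node, []))
    return accepted


def keyword_margin(keyword: str) -> Classifier:
    """Toy stand-in for a trained flat classifier."""
    return lambda doc: 1.0 if keyword in doc.split() else -1.0


# Hypothetical hierarchy: BONDS has two parents, so the graph is a DAG.
ROOTS = ["ECON", "SPORTS"]
CHILDREN = {
    "ECON": ["MARKETS", "TRADE"],
    "MARKETS": ["BONDS"],
    "TRADE": ["BONDS"],
    "SPORTS": ["FOOTBALL"],
}
CLASSIFIERS = {
    name: keyword_margin(name.lower())
    for name in ["ECON", "MARKETS", "TRADE", "BONDS", "SPORTS", "FOOTBALL"]
}

if __name__ == "__main__":
    result = classify_top_down("econ markets bonds report", ROOTS, CHILDREN, CLASSIFIERS)
    # Sorting the accepted categories by score gives a simple ranking.
    for node, score in sorted(result.items(), key=lambda kv: -kv[1]):
        print(node, round(score, 3))
```

Because classifiers below a rejecting node are never evaluated, the number of classifier calls depends on the shape of the hierarchy, which is consistent with the abstract's remark that classification time can be reduced for suitable hierarchies.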
Pages: 177-182
Page count: 6