Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization

Cited by: 0
Authors
Al-Salemi, Bassam [1]
Ayob, Masri [1]
Noah, Shahrul Azman Mohd [1]
Ab Aziz, Mohd Juzaiddin [1]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia
Keywords
AdaBoost.MH; feature selection; text categorization; supervised topic modeling; Latent Dirichlet Allocation; ALGORITHM;
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
The Bag-of-Words text representation model is simple and widely used: it treats single words as the elements that represent texts in the feature space. However, using single words as features produces a high-dimensional feature space, which increases the computational cost of learning, particularly for ensemble learning algorithms such as the boosting algorithm AdaBoost.MH. A straightforward solution is to apply a feature selection method capable of reducing the feature space effectively. This work describes how to use the supervised topic model Labeled Latent Dirichlet Allocation for feature selection, as well as for accelerating AdaBoost.MH learning in multi-label text categorization. Experimental results on three benchmarks demonstrate that using Labeled Latent Dirichlet Allocation for feature selection improves and accelerates AdaBoost.MH and exceeds the performance of three existing methods.
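The abstract does not state the exact selection rule, so the following is only a minimal sketch of one plausible reading: assuming a Labeled LDA model has already been trained and yields one word distribution per label, keep the top-k words of each label and take their union as the reduced feature set. The function name `select_top_words_per_label`, the matrix `phi`, the toy vocabulary, and the value of `k` are all hypothetical and not taken from the paper.

```python
import numpy as np


def select_top_words_per_label(phi, vocab, k):
    """Return the union of the k highest-probability words of each label topic.

    phi   : (n_labels, n_words) array; row i is an ASSUMED per-label word
            distribution, such as a trained Labeled LDA model might provide.
    vocab : list of n_words strings, aligned with the columns of phi.
    k     : number of words kept per label (a hypothetical tuning parameter).
    """
    phi = np.asarray(phi, dtype=float)
    if phi.ndim != 2 or phi.shape[1] != len(vocab):
        raise ValueError("phi must have one column per vocabulary word")
    selected = set()
    for row in phi:
        # Indices of the k largest entries in this label's word distribution.
        top = np.argsort(-row, kind="stable")[:k]
        selected.update(int(i) for i in top)
    # Keep the original vocabulary order for a reproducible feature list.
    return [vocab[i] for i in sorted(selected)]


# Toy example with made-up numbers: two labels, five words.
vocab = ["ball", "goal", "vote", "law", "the"]
phi = np.array([
    [0.40, 0.35, 0.05, 0.05, 0.15],  # hypothetical label "sports"
    [0.05, 0.05, 0.40, 0.35, 0.15],  # hypothetical label "politics"
])
features = select_top_words_per_label(phi, vocab, k=2)
print(features)
```

In this toy setting the common word "the" is dropped because it never ranks among the top two words of any label. A boosting learner such as AdaBoost.MH would then be trained on the reduced vocabulary only, which is where the speed-up described in the abstract would come from.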
Pages: 6