Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization

Cited by: 0
Authors
Al-Salemi, Bassam [1]
Ayob, Masri [1]
Noah, Shahrul Azman Mohd [1]
Ab Aziz, Mohd Juzaiddin [1]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia
Keywords
AdaBoost.MH; feature selection; text categorization; supervised topic modeling; Latent Dirichlet Allocation; ALGORITHM;
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract
The Bag-of-Words text representation model is a simple and widely used model that uses single words as the elements representing texts in the feature space. However, using single words as features produces a high-dimensional feature space, which increases the computational cost of learning, particularly for ensemble learning algorithms such as the boosting algorithm AdaBoost.MH. A straightforward solution is to use a feature selection method capable of reducing the feature space effectively. This work describes how to use the supervised topic model Labeled Latent Dirichlet Allocation for feature selection, as well as for accelerating AdaBoost.MH learning in multi-label text categorization. Experimental results on three benchmarks demonstrate that using Labeled Latent Dirichlet Allocation for feature selection improves and accelerates AdaBoost.MH and exceeds the performance of three existing methods.
Pages: 6