Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization

Cited by: 0
Authors
Al-Salemi, Bassam [1]
Ayob, Masri [1]
Noah, Shahrul Azman Mohd [1]
Ab Aziz, Mohd Juzaiddin [1]
Affiliation
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia
Keywords
AdaBoost.MH; feature selection; text categorization; supervised topic modeling; Latent Dirichlet Allocation; ALGORITHM
DOI
Not available
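Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification code
0808; 0809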
Abstract
The Bag-of-Words text representation model is a simple and typical model that uses single words as the elements representing texts in the feature space. However, using single words as features produces a high-dimensional feature space, which raises the computational cost of learning, particularly for ensemble learning algorithms such as the boosting algorithm AdaBoost.MH. A straightforward solution is to use a feature selection method capable of reducing the feature space effectively. This work describes how to use the supervised topic model Labeled Latent Dirichlet Allocation for feature selection, as well as for accelerating AdaBoost.MH learning for multi-label text categorization. Experimental results on three benchmarks demonstrate that using Labeled Latent Dirichlet Allocation for feature selection improves and accelerates AdaBoost.MH and exceeds the performance of three existing methods.
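The idea of selecting features from per-label word distributions can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the selection rule, so the sketch assumes that the top-k most probable words of each label are kept, and it replaces Labeled Latent Dirichlet Allocation inference with a crude count-based estimate in which each token is shared equally among its document's labels. The function names `label_word_distributions` and `select_features` and the parameters `beta` and `top_k` are hypothetical.

```python
from collections import Counter, defaultdict


def label_word_distributions(docs, labels, beta=0.01):
    """Estimate a word distribution per label (assumed stand-in for Labeled LDA).

    docs   : list of token lists
    labels : list of label lists, aligned with docs
    beta   : smoothing constant (hypothetical parameter)
    """
    vocab = sorted({w for tokens in docs for w in tokens})
    counts = defaultdict(Counter)
    for tokens, doc_labels in zip(docs, labels):
        if not doc_labels:
            continue  # unlabeled documents contribute nothing
        share = 1.0 / len(doc_labels)  # split each token evenly over the labels
        for lab in doc_labels:
            for w in tokens:
                counts[lab][w] += share
    phi = {}
    for lab, c in counts.items():
        total = sum(c.values()) + beta * len(vocab)
        phi[lab] = {w: (c[w] + beta) / total for w in vocab}
    return phi


def select_features(phi, top_k):
    """Keep the union of the top_k most probable words of every label."""
    selected = set()
    for dist in phi.values():
        # Sort by descending probability, breaking ties alphabetically.
        ranked = sorted(dist, key=lambda w: (-dist[w], w))
        selected.update(ranked[:top_k])
    return sorted(selected)


if __name__ == "__main__":
    docs = [["goal", "match", "team"],
            ["vote", "election", "team"],
            ["goal", "vote"]]
    labels = [["sport"], ["politics"], ["sport", "politics"]]
    phi = label_word_distributions(docs, labels)
    print(select_features(phi, top_k=1))
```

The reduced vocabulary would then be used to build the Bag-of-Words vectors passed to the boosting learner, so each boosting round has fewer candidate features to scan. A real Labeled LDA model would replace `label_word_distributions` while `select_features` could stay unchanged.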
Pages: 6