Transductive transfer learning based Genetic Programming for balanced and unbalanced document classification using different types of features

Cited by: 4
Authors
Fu, Wenlong [1]
Xue, Bing [1]
Gao, Xiaoying [1]
Zhang, Mengjie [1]
Affiliations
[1] Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Keywords
Genetic Programming; Document classification; Transfer learning; Text classification; Representations; Words; IDF
DOI
10.1016/j.asoc.2021.107172
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Document classification is one of the predominant tasks in Natural Language Processing. However, some document classification datasets have no ground truth, while other, similar datasets do. Transfer learning can use similar datasets with ground truth to train effective classifiers for a dataset without ground truth. This paper introduces a transductive transfer learning method for document classification using two different text feature representations: the term frequency (TF) and the semantic feature doc2vec. It makes three main contributions. First, it enables knowledge sharing between a dataset represented with TF and a dataset represented with doc2vec in transductive transfer learning, to improve performance. Second, it demonstrates that the partially learned programs from TF and from doc2vec can be used alternately to "label then learn", and that they improve each other. Lastly, it addresses the unbalanced dataset problem by taking the unbalanced distribution across categories into account when evolving Genetic Programming (GP) programs on the target domains. Experimental results on two popular document datasets show that the proposed technique effectively transfers knowledge from the GP programs evolved on the source domains to the new GP programs on the target domains using TF or doc2vec. The GP programs evolved by the proposed method achieve an improvement of more than 10 percent over the GP programs evolved directly on the source domains. The proposed technique also effectively uses GP programs evolved from unbalanced datasets (on the source and target domains) to evolve new GP programs on the target domains, which balances predictions across categories. © 2021 Elsevier B.V. All rights reserved.
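The "label then learn" alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: a nearest-centroid classifier stands in for the evolved GP programs, the two toy feature views stand in for TF and doc2vec, and every function name and data value below is hypothetical. The balanced-accuracy helper is one common way to score predictions on unbalanced categories; the abstract does not say which measure the paper uses.

```python
# Sketch only: nearest-centroid models replace the GP programs of the paper.

def fit_centroids(X, y):
    """Return {label: mean feature vector} for the rows of X grouped by y."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, X):
    """Assign each row of X to the label of its nearest centroid."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(centroids[lab], row)) for row in X]


def label_then_learn(src_view_a, src_labels, tgt_view_a, tgt_view_b, rounds=3):
    """Alternate between two feature views of the unlabeled target documents.

    A model trained on the labeled source (view A) pseudo-labels the target.
    Those labels train a model on view B, whose predictions in turn retrain
    the view-A model, and so on for a fixed number of rounds.
    """
    model_a = fit_centroids(src_view_a, src_labels)
    pseudo = predict(model_a, tgt_view_a)
    for _ in range(rounds):
        model_b = fit_centroids(tgt_view_b, pseudo)   # learn on view B
        pseudo = predict(model_b, tgt_view_b)         # relabel with view B
        model_a = fit_centroids(tgt_view_a, pseudo)   # learn on view A
        pseudo = predict(model_a, tgt_view_a)         # relabel with view A
    return pseudo


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so minority classes weigh as much as majority ones."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy data: view A plays the role of TF, view B the role of doc2vec.
src_tf = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
src_labels = [0, 0, 1, 1]
tgt_tf = [[0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.4, 0.6]]
tgt_d2v = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

target_predictions = label_then_learn(src_tf, src_labels, tgt_tf, tgt_d2v)
print(target_predictions)
```

In this sketch the target labels are never consulted during training, which is what makes the setting transductive; they would only be used afterwards, for example with balanced_accuracy, to evaluate the predictions.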
Pages: 11