An Improved Genetic Algorithm for Feature Selection in the Classification of Disaster-Related Twitter Messages

Authors
Benitez, Ian P. [1]
Sison, Ariel M. [2]
Medina, Ruji P. [3]
Affiliations
[1] Technol Inst Philippines, Quezon City, Philippines
[2] Emilio Aguinaldo Coll, Sch Comp Studies, Manila, Philippines
[3] Technol Inst Philippines, Grad Programs, Quezon City, Philippines
Keywords
GA-based Feature Selection; Supervised Learning; Multiclass Classification; Wrapper-based; Twitter
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In machine-learning text classification, representing terms as features in a vector space model can produce a high-dimensional feature space. High dimensionality raises the computational cost of data analysis and can degrade classification accuracy. This study improves classifier performance in classifying Twitter messages related to natural crises by reducing the dimensionality of the feature space through feature selection with a Genetic Algorithm (GA). A known limitation of GA in text feature selection is premature convergence caused by a lack of population diversity in later generations. To address it, the GA's crossover operator was enhanced by (a) setting a variable slice-point on the number of genes to be swapped for each offspring creation, and (b) using features' frequency scores to decide which genes are swapped. The enhanced algorithm was tested on several Twitter datasets and compared with two standard GA implementations that use single-point and multi-point crossover. Experimental results showed that the enhanced GA outperformed both, selecting fewer features and achieving higher classification accuracy with Multinomial Naive Bayes.
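The abstract names the two crossover changes but gives no operational detail, so the following is only a minimal sketch of how such an operator might look, not the authors' implementation. Everything specific here is an assumption made for illustration: the function name `variable_slice_crossover`, the uniform draw of the slice length and position, and the rule that a gene is swapped only when its feature's frequency score is at or above the median score.

```python
import random
from statistics import median


def variable_slice_crossover(parent_a, parent_b, freq_scores, rng=random):
    """Return two offspring from two binary chromosomes (1 = feature selected).

    Hypothetical sketch: the slice length and start are redrawn on every call,
    and a gene inside the slice is swapped only if its feature's frequency
    score reaches the median score. Both rules are assumptions, not taken
    from the paper.
    """
    n = len(parent_a)
    if not (n == len(parent_b) == len(freq_scores)):
        raise ValueError("parents and freq_scores must have equal length")

    # Variable slice-point: a fresh slice size and position per offspring pair.
    size = rng.randint(1, n)
    start = rng.randint(0, n - size)

    # Assumed decision rule based on feature frequency scores.
    threshold = median(freq_scores)

    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(start, start + size):
        if freq_scores[i] >= threshold:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    a = [1, 0, 1, 1, 0, 0, 1, 0]
    b = [0, 1, 0, 1, 1, 0, 0, 1]
    scores = [5, 1, 9, 2, 7, 3, 8, 4]  # made-up term frequencies
    print(variable_slice_crossover(a, b, scores, rng))
```

In a wrapper-based setup such as the one the keywords describe, each offspring would then be scored by training a classifier on the features its chromosome selects; that fitness step is omitted here.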
Pages: 238-243
Page count: 6