A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture

Cited by: 18
Authors
Shastry, K. Aditya [1 ]
Sanjay, H. A. [1 ]
Affiliations
[1] Nitte Meenakshi Inst Technol, Bengaluru 64, India
Keywords
Feature selection; Feature extraction; Hybrid; Genetic Algorithm; Weighted-Principal Component Analysis;
DOI
10.1016/j.knosys.2021.107460
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data pre-processing transforms raw data into a format suitable for machine learning (ML) techniques. Feature selection (FS) and feature extraction (FeExt) are significant components of data pre-processing. FS identifies the relevant features that enhance the accuracy of a model. Since agricultural data contain diverse features related to climate, soil, and fertilizer, FS is especially important, as irrelevant features may adversely affect the predictions of the resulting model. FeExt, in turn, derives new attributes from the existing attributes. These new features retain all the information that the original attributes possess, without the redundancy. With these points in mind, this work proposes a hybrid feature selection and feature extraction strategy for agricultural data sets. A modified Genetic Algorithm (m-GA) was developed by designing a fitness function based on "Mutual Information" (MutInf) and "Root Mean Square Error" (RtMSE) to choose the features that most affect the target attribute (crop yield in this case). The selected features were then subjected to feature extraction using "weighted principal component analysis" (wgt-PCA). The extracted features were fed into different ML models, viz. "Regression" (Reg), "Artificial Neural Networks" (ArtNN), "Adaptive Neuro Fuzzy Inference System" (ANFIS), "Ensemble of Trees" (EnT), and "Support Vector Regression" (SuVR). Trials on 3 benchmark and 8 real-world farming datasets showed that the developed hybrid feature selection and extraction technique achieved significant improvements in Rsq2, RtMSE, and "mean absolute error" (MAE) compared with FS and FeExt methods such as Correlation Analysis (CA), Singular Value Decomposition (SiVD), Genetic Algorithm (GA), and wgt-PCA. (C) 2021 Published by Elsevier B.V.
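The two-stage pipeline described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' m-GA or wgt-PCA: the abstract does not give the fitness formula or the weighting scheme, so the `alpha` trade-off, the histogram-based mutual information estimate, the linear-regression RtMSE, the GA operators, and the use of mutual information scores as PCA weights are all hypothetical choices, and the data are synthetic.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def rmse_of_linear_fit(X, y):
    """In-sample RMSE of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))

def fitness(mask, X, y, mi, alpha=0.5):
    """Assumed fitness: reward mean MI of the chosen features, penalize RMSE."""
    if not mask.any():
        return -np.inf  # an empty feature subset is never acceptable
    return alpha * mi[mask].mean() - (1 - alpha) * rmse_of_linear_fit(X[:, mask], y)

def select_features(X, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Plain elitist GA over boolean feature masks (hypothetical operators)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    mi = np.array([mutual_information(X[:, j], y) for j in range(n)])
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y, mi) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # keep best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                    # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut              # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind, X, y, mi) for ind in pop])
    return pop[np.argmax(scores)], mi

def weighted_pca(X, weights, n_components=2):
    """PCA on standardized features scaled by sqrt(weights) (assumed weighting)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    Zw = Z * np.sqrt(weights)
    cov = np.atleast_2d(np.cov(Zw, rowvar=False))
    vals, vecs = np.linalg.eigh(cov)
    top = np.argsort(vals)[::-1][:n_components]
    return Zw @ vecs[:, top]

# Synthetic stand-in for an agricultural data set: only features 0 and 3 drive the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = 3 * X[:, 0] - 2 * X[:, 3] + 0.1 * rng.normal(size=200)

mask, mi = select_features(X, y)
components = weighted_pca(X[:, mask], mi[mask], n_components=2)
print("selected features:", np.flatnonzero(mask))
print("extracted component matrix shape:", components.shape)
```

In this sketch the extracted `components` would then be passed to whichever regression model is being evaluated. Scaling standardized features by the square root of their weights makes each feature's variance contribution proportional to its weight, which is one common way to bias PCA toward more informative features.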
Pages: 18
Related papers
50 in total
  • [21] Feature selection of stabilometric parameters based on principal component analysis
    Rocchi, L
    Chiari, L
    Cappello, A
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2004, 42 (01) : 71 - 79
  • [22] Modified Nonparametric Weighted Feature Extraction Algorithm
    Cui, Linlin
    Li, Guosheng
    Ren, Huiru
    He, Lei
    Liao, Huajun
    JOURNAL OF THE INDIAN SOCIETY OF REMOTE SENSING, 2015, 43 (01) : 69 - 78
  • [23] Modified Nonparametric Weighted Feature Extraction Algorithm
    Linlin Cui
    Guosheng Li
    Huiru Ren
    Lei He
    Huajun Liao
    Journal of the Indian Society of Remote Sensing, 2015, 43 : 69 - 78
  • [24] Feature extraction based on kernel principal component analysis optimized by particle swarm optimization algorithm
    Wei, Xiuye
    Pan, Hongxia
    Wang, Fujie
    Zhendong Ceshi Yu Zhenduan/Journal of Vibration, Measurement and Diagnosis, 2009, 29 (02): : 162 - 166
  • [25] Feature Selection Based on Genetic Algorithm, Particle Swarm Optimization and Principal Component Analysis for Opinion Mining Cosmetic Product Review
    Kristiyanti, Dinar Ajeng
    Wahyudi, Mochamad
    2017 5TH INTERNATIONAL CONFERENCE ON CYBER AND IT SERVICE MANAGEMENT (CITSM 2017), 2017, : 309 - 314
  • [26] An Improved Scale Invariant Feature Transform Algorithm Based on Weighted Principal Component Analysis for Image Matching
    Guo, Qianxi
    Wang, Huiyuan
    Zheng, Yongwei
    PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 1106 - 1109
  • [27] Instance selection and feature extraction using cuttlefish optimization algorithm and principal component analysis using decision tree
    M. Suganthi
    V. Karunakaran
    Cluster Computing, 2019, 22 : 89 - 101
  • [28] Instance selection and feature extraction using cuttlefish optimization algorithm and principal component analysis using decision tree
    Suganthi, M.
    Karunakaran, V.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (Suppl 1): : 89 - 101
  • [29] Optimization of principal component analysis in feature extraction
    Gao Haibo
    Hong Wenxue
    Cui Jianxin
    Xu Yonghong
    2007 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS I-V, CONFERENCE PROCEEDINGS, 2007, : 3128 - 3132
  • [30] Feature extraction with weighted samples based on independent component analysis
    Kwak, Nojun
    ARTIFICIAL NEURAL NETWORKS - ICANN 2006, PT 2, 2006, 4132 : 340 - 349