A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture

Cited by: 18
Authors
Shastry, K. Aditya [1]
Sanjay, H. A. [1]
Affiliations
[1] Nitte Meenakshi Inst Technol, Bengaluru 64, India
Keywords
Feature selection; Feature extraction; Hybrid; Genetic Algorithm; Weighted-Principal Component Analysis;
DOI
10.1016/j.knosys.2021.107460
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data pre-processing transforms raw data into a format suitable for machine learning (ML) techniques. Feature selection (FS) and feature extraction (FeExt) are significant components of data pre-processing. FS is the identification of relevant features that enhance the accuracy of a model. Since agricultural data contain diverse features related to climate, soil, and fertilizer, FS is particularly important, as irrelevant features may adversely affect the predictions of the resulting model. Likewise, FeExt derives new attributes from the existing attributes. All the information that the original attributes possess is present in these new features, minus the redundancy. With these points in mind, this work proposes a hybrid feature selection and feature extraction strategy for agricultural data sets. A modified Genetic Algorithm (m-GA) was developed by designing a fitness function based on "Mutual Information" (MutInf) and "Root Mean Square Error" (RtMSE) to choose the features that most affect the target attribute (crop yield in this case). The selected features were then subjected to feature extraction using "weighted principal component analysis" (wgt-PCA). The extracted features were then fed into different ML models, viz. "Regression" (Reg), "Artificial Neural Networks" (ArtNN), "Adaptive Neuro Fuzzy Inference System" (ANFIS), "Ensemble of Trees" (EnT), and "Support Vector Regression" (SuVR). Trials on 3 benchmark and 8 real-world farming datasets revealed that the developed hybrid feature selection and extraction technique achieved significant improvements in Rsq2, RtMSE, and "mean absolute error" (MAE) compared with FS and FeExt methods such as Correlation Analysis (CA), Singular Value Decomposition (SiVD), Genetic Algorithm (GA), and wgt-PCA. © 2021 Published by Elsevier B.V.
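The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' m-GA: the abstract does not give the fitness formula, so the combination used here (mean mutual information minus a weighted RMSE), the histogram-based mutual information estimate, the stand-in regression model, and all function names and parameters (`mutual_information`, `rmse_of_subset`, `fitness`, `select_features`, `alpha`, `bins`) are assumptions made for illustration. The wgt-PCA extraction step and the downstream ML models are not covered.

```python
import math
import random
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of mutual information, in nats."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # avoid zero width for constant columns
        return [min(int((v - lo) / width), bins - 1) for v in values]

    dx, dy = discretize(x), discretize(y)
    n = len(x)
    pxy, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def rmse_of_subset(columns, y):
    """RMSE of a deliberately simple stand-in model (an assumption):
    the average of one-variable least-squares fits on each column."""
    n = len(y)
    y_mean = sum(y) / n
    preds = [0.0] * n
    for col in columns:
        x_mean = sum(col) / n
        sxx = sum((a - x_mean) ** 2 for a in col)
        sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(col, y))
        slope = sxy / sxx if sxx else 0.0
        for i, a in enumerate(col):
            preds[i] += (y_mean + slope * (a - x_mean)) / len(columns)
    return math.sqrt(sum((p - b) ** 2 for p, b in zip(preds, y)) / n)


def fitness(mask, feature_columns, y, alpha=1.0):
    """Assumed fitness: reward mean mutual information, penalise RMSE."""
    chosen = [col for bit, col in zip(mask, feature_columns) if bit]
    if not chosen:
        return float("-inf")  # an empty feature subset is never acceptable
    mean_mi = sum(mutual_information(col, y) for col in chosen) / len(chosen)
    return mean_mi - alpha * rmse_of_subset(chosen, y)


def select_features(feature_columns, y, pop_size=20, generations=30, seed=0):
    """Plain GA over 0/1 masks: elitism, one-point crossover, bit-flip
    mutation. Assumes at least two feature columns."""
    rng = random.Random(seed)
    k = len(feature_columns)
    pop = [[rng.randint(0, 1) for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, feature_columns, y), reverse=True)
        parents = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k)
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:  # occasional single-bit mutation
                child[rng.randrange(k)] ^= 1
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, feature_columns, y))
```

Calling `select_features(columns, target)` with a list of feature columns and a target list returns a 0/1 mask over the columns. In this sketch a feature is kept only if it raises the mean mutual information with the target or lowers the RMSE of the stand-in model; the weighting `alpha` between the two terms is arbitrary.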
Pages: 18
Related papers
50 in total
  • [1] Feature selection using principal component analysis and genetic algorithm
    Adhao, Rahul
    Pachghare, Vinod
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2020, 23 (02) : 595 - 602
  • [2] COMBINING FEATURE SELECTION WITH EXTRACTION: UNSUPERVISED FEATURE SELECTION BASED ON PRINCIPAL COMPONENT ANALYSIS
    Li, Yun
    Lu, Bao-Liang
    Zhang, Teng-Fei
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2009, 18 (06) : 883 - 904
  • [3] Feature Selection Algorithm for Motor Quality Types Using Weighted Principal Component Analysis
    Yeh, Yun-Chi
    Lin, Liuh-Chii
    Liu, Mei-Chen
    Chu, Tsui-Shiun
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT TECHNOLOGIES AND ENGINEERING SYSTEMS (ICITES2014), 2016, 345 : 151 - 157
  • [4] Feature extraction using evolutionary weighted principal component analysis
    Liu, N
    Wang, H
    INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOL 1-4, PROCEEDINGS, 2005, : 346 - 350
  • [5] A Novel Principal Component Selection Strategy for Feature Extraction and Dimension Reduction
    Cui, Mingliang
    Ma, Xin
    Li, Qiankun
    Hou, Tongze
    Wang, Youqing
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 4040 - 4045
  • [6] A Reliable Feature Selection Algorithm for Determining Heartbeat Case using Weighted Principal Component Analysis
    Yeh, Yun-Chi
    Chen, Chun-Wei
    Chiou, Che Wun
    Chu, Tsui-Yao
    2016 INTERNATIONAL CONFERENCE ON SYSTEM SCIENCE AND ENGINEERING (ICSSE), 2016,
  • [7] Feature Frequency Extraction Algorithm Based on Principal Component Analysis and Its Application
    Li Z.
    Li W.
    Zhao X.
    Zheng X.
    Li, Weiguang (wguangli@scut.edu.cn), 2018, Nanjing University of Aeronautics and Astronautics (38) : 834 - 842
  • [8] Weighted principal component extraction with genetic algorithms
    Liu, Nan
    Wang, Han
    APPLIED SOFT COMPUTING, 2012, 12 (02) : 961 - 974
  • [9] Principal Component Analysis based Feature Selection for clustering
    Xu, Jun-Ling
    Xu, Bao-Wen
    Zhang, Wei-Feng
    Cui, Zi-Feng
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 460 - +
  • [10] Feature selection based on improved principal component analysis
    Li, Zhangyu
    Qiu, Yihui
    2023 2ND ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING, CACML 2023, 2023, : 188 - 192