A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture

Cited by: 18
Authors
Shastry, K. Aditya [1 ]
Sanjay, H. A. [1 ]
Affiliations
[1] Nitte Meenakshi Inst Technol, Bengaluru 64, India
Keywords
Feature selection; Feature extraction; Hybrid; Genetic Algorithm; Weighted-Principal Component Analysis;
DOI
10.1016/j.knosys.2021.107460
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data pre-processing transforms raw data into a format suitable for machine learning (ML) techniques. Feature selection (FS) and feature extraction (FeExt) are significant components of data pre-processing. FS is the identification of relevant features that enhance the accuracy of a model. Since agricultural data contain diverse features related to climate, soil, and fertilizer, FS is particularly important, as irrelevant features may adversely affect the predictions of the resulting model. Likewise, FeExt derives new attributes from the existing attributes. These new features carry all the information of the original attributes without the redundancy. With these points in mind, this work proposes a hybrid feature selection and feature extraction strategy for agricultural data sets. A modified Genetic Algorithm (m-GA) was developed by designing a fitness function based on "Mutual Information" (MutInf) and "Root Mean Square Error" (RtMSE) to choose the features that most affect the target attribute (crop yield in this case). The selected features were then subjected to feature extraction using "weighted principal component analysis" (wgt-PCA). The extracted features were then fed into different ML models, viz. "Regression" (Reg), "Artificial Neural Networks" (ArtNN), "Adaptive Neuro Fuzzy Inference System" (ANFIS), "Ensemble of Trees" (EnT), and "Support Vector Regression" (SuVR). Trials on 3 benchmark and 8 real-world farming datasets showed that the hybrid feature selection and extraction technique yielded significant improvements in Rsq2, RtMSE, and "mean absolute error" (MAE) compared with FS and FeExt methods such as Correlation Analysis (CA), Singular Value Decomposition (SiVD), Genetic Algorithm (GA), and wgt-PCA. (C) 2021 Published by Elsevier B.V.
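The abstract does not state the exact m-GA fitness formula or the weighting scheme used in wgt-PCA, so the following is only a minimal sketch under stated assumptions. It assumes a fitness that rewards the mean histogram-estimated mutual information between the selected features and the target and penalizes the RMSE of a plain least-squares fit, combined with a hypothetical trade-off parameter `alpha`. It also assumes that "weighted" PCA means scaling standardized columns by per-feature weights before an ordinary PCA. The function names, `alpha`, `bins`, and the synthetic data are all illustrative and are not taken from the paper.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (in nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def rmse_of_linear_fit(X, y):
    """Training RMSE of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))

def fitness(mask, X, y, alpha=0.5):
    """Assumed fitness: alpha * mean MI with target - (1 - alpha) * RMSE."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():                    # an empty feature subset is invalid
        return -np.inf
    mi = np.mean([mutual_information(X[:, j], y) for j in np.flatnonzero(mask)])
    return alpha * mi - (1 - alpha) * rmse_of_linear_fit(X[:, mask], y)

def weighted_pca(X, weights, n_components):
    """Assumed weighting: scale standardized columns by weights, then plain PCA.

    Columns must not be constant, otherwise the standardization divides by zero.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    Zw = Z * np.asarray(weights, dtype=float)
    _, _, vt = np.linalg.svd(Zw, full_matrices=False)
    return Zw @ vt[:n_components].T       # component scores

# Synthetic illustration: only the first two columns drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=200)

informative = fitness([1, 1, 0, 0], X, y)
irrelevant = fitness([0, 0, 1, 1], X, y)
scores = weighted_pca(X[:, :2], weights=[1.0, 0.5], n_components=1)
```

In this sketch a genetic algorithm would evolve the binary `mask` and keep the individuals with the highest fitness; on the synthetic data, the mask covering the two informative columns scores higher than the mask covering the two noise columns.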
Pages: 18
Related papers
50 in total
  • [31] Heterogeneous Network Selection Algorithm Based on Principal Component Analysis
    Wang, Xin-gang
    Zhu, Bin-ruo
    Zhu, Zheng
    2018 INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATIONS AND MECHATRONICS ENGINEERING (CCME 2018), 2018, 332 : 613 - 618
  • [32] Feature Extraction based on Mixture Probabilistic Kernel Principal Component Analysis
    Zhao Huibo
    Pan Quan
    Cheng Yongmei
    2009 INTERNATIONAL FORUM ON INFORMATION TECHNOLOGY AND APPLICATIONS, VOL 3, PROCEEDINGS, 2009, : 36 - 39
  • [33] WEB SERVICE SELECTION ALGORITHM BASED ON PRINCIPAL COMPONENT ANALYSIS
    Kang Guosheng
    Liu Jianxun
    Tang Mingdong
    Cao Buqing
    Journal of Electronics (China), 2013, 30 (02) : 204 - 212
  • [34] STUDY ON FEATURE EXTRACTION OF PIG FACE BASED ON PRINCIPAL COMPONENT ANALYSIS
    Yan, Hongwen
    Hu, Zhiwei
    Cui, Qingliang
    INMATEH-AGRICULTURAL ENGINEERING, 2022, 68 (03) : 333 - 340
  • [35] Modified Floating Search Feature Selection Based on Genetic Algorithm
    Homsapaya, Kanyanut
    Sornil, Ohm
    3RD INTERNATIONAL CONFERENCE ON ELECTRICAL SYSTEMS, TECHNOLOGY AND INFORMATION (ICESTI 2017), 2018, 164
  • [36] Machine condition monitoring by nonlinear feature fusion based on kernel principal component analysis with genetic algorithm
    Wang, Feng
    Cheng, Bo
    Cao, Binggang
    ICNC 2007: THIRD INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 2, PROCEEDINGS, 2007, : 665 - +
  • [37] Feature Extraction of Global Seismicity by Principal Component Analysis
    Okada, Akihisa
    Toriumi, Mitsuhiro
    Kaneda, Yoshiyuki
    2017 INTERNATIONAL CONFERENCE ON CONTROL, ARTIFICIAL INTELLIGENCE, ROBOTICS & OPTIMIZATION (ICCAIRO), 2017, : 278 - 282
  • [38] Extensions of principal component analysis for nonlinear feature extraction
    Sudjianto, A
    Hassoun, MH
    Wasserman, GS
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1433 - 1434
  • [39] Supervised feature selection using principal component analysis
    Rahmat, Fariq
    Zulkafli, Zed
    Ishak, Asnor Juraiza
    Rahman, Ribhan Zafira Abdul
    De Stercke, Simon
    Buytaert, Wouter
    Tahir, Wardah
    Ab Rahman, Jamalludin
    Ibrahim, Salwa
    Ismail, Muhamad
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (03) : 1955 - 1995
  • [40] Supervised feature selection using principal component analysis
    Fariq Rahmat
    Zed Zulkafli
    Asnor Juraiza Ishak
    Ribhan Zafira Abdul Rahman
    Simon De Stercke
    Wouter Buytaert
    Wardah Tahir
    Jamalludin Ab Rahman
    Salwa Ibrahim
    Muhamad Ismail
    Knowledge and Information Systems, 2024, 66 : 1955 - 1995