Stable Feature Selection using Improved Whale Optimization Algorithm for Microarray Datasets

Cited by: 0
Authors
Theng, Dipti [1 ]
Bhoyar, Kishor K. [2 ]
Affiliations
[1] YCCE, Comp Technol Dept, Nagpur, Maharashtra, India
[2] YCCE, Comp Sci & Engn Dept, Nagpur, Maharashtra, India
Keywords
feature selection; stability of feature selection; whale optimization algorithm; marine predator algorithm; grey wolf optimization; microarray datasets; high-dimensional datasets
DOI
10.14201/adcaij.31187
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A microarray is a collection of DNA sequences that represent an organism's whole gene set, organized in a grid pattern for use in genetic testing. Microarray datasets are extremely high-dimensional and have very small sample sizes, which poses the challenges of insufficient data and high computational complexity. To address these issues, it is desirable to identify the true biomarkers, that is, the most significant features, which form a very small subset of the complete feature set. Doing so reduces over-fitting and time complexity and improves model generalization. Various feature selection algorithms are used for this biomarker identification. This research proposes a modification to the whale optimization algorithm (WOAm) for biomarker discovery, in which the fitness of each search agent is evaluated using the hinge loss function during the hunting-for-prey phase to determine the optimal search agent. The results of the proposed modified algorithm are compared with those of the original whale optimization algorithm and with contemporary algorithms such as the marine predator algorithm and grey wolf optimization. All of these algorithms are evaluated on six different high-dimensional microarray datasets. The proposed modification to the whale optimization algorithm was observed to significantly improve the feature selection results across all the datasets. Because domain experts' trust in the resulting biomarkers/associated genes depends on the stability of the results obtained, the stability of the chosen feature set was also evaluated. According to the findings, the proposed WOAm has superior stability compared to the other algorithms on the CNS, colon, Leukemia, and OSCC datasets.
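The two ideas the abstract relies on, scoring a candidate feature subset by hinge loss and checking how stable the selected subset is across runs, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which model produces the scores or which stability measure is used, so the least-squares linear scorer and the mean pairwise Jaccard index below are assumptions, and the names `mask_fitness` and `jaccard_stability` are hypothetical.

```python
from itertools import combinations

import numpy as np


def hinge_loss(y, scores):
    """Mean hinge loss; y holds labels in {-1, +1}, scores are real-valued."""
    return float(np.mean(np.maximum(0.0, 1.0 - y * scores)))


def mask_fitness(mask, X, y):
    """Fitness of one search agent (a boolean feature mask); lower is better.

    Assumption: a least-squares linear scorer is fitted on the selected
    columns only. The abstract does not name the model behind the scores.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 1.0  # no features selected: every score is 0, so the loss is 1
    X_sel = X[:, mask]
    w, *_ = np.linalg.lstsq(X_sel, y, rcond=None)
    return hinge_loss(y, X_sel @ w)


def jaccard_stability(subsets):
    """Mean pairwise Jaccard similarity of feature subsets from several runs.

    Assumption: Jaccard is used only for illustration. Needs at least two
    subsets, and no pair may be empty on both sides.
    """
    pairs = list(combinations([set(s) for s in subsets], 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Toy data: column 0 separates the classes, column 1 is uncorrelated with y.
X = np.array([[2.0, 1.0], [2.0, -1.0], [2.0, 0.0],
              [-2.0, 1.0], [-2.0, -1.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

print(mask_fitness([True, False], X, y))   # informative feature only
print(mask_fitness([False, True], X, y))   # noise feature only
print(jaccard_stability([{0, 1}, {0, 1}, {1, 2}]))
```

In a wrapper-style search, each agent's position would be thresholded into such a mask and the agent with the lowest fitness kept as the current best. How WOAm binarizes positions is not stated in the abstract.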
Pages: 16