Distributed feature selection (DFS) strategy for microarray gene expression data to improve the classification performance

Cited by: 24
Authors
Potharaju, Sai Prasad [1]
Sreedevi, M. [1]
Affiliations
[1] KL Univ, Dept CSE, Guntur, AP, India
Source
Keywords
Microarray; Feature selection; Classification; High dimensionality;
DOI
10.1016/j.cegh.2018.04.001
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene];
Discipline classification codes
1004 ; 120402 ;
Abstract
Objective: This article presents a novel feature selection strategy for improving classification performance on high-dimensional datasets. The curse of dimensionality is the most serious drawback of microarray data, which contain a large number of genes (features), and it degrades computational stability. In microarray data analytics, identifying the most relevant features therefore requires careful attention. Most researchers apply a two-stage strategy to gene expression data analysis. In the first stage, feature selection or feature extraction is used as a preprocessing step to pinpoint the most prominent features. In the second stage, classification is performed on the selected subset of features.
Method: We follow the same two-stage strategy, but introduce a distributed feature selection (DFS) strategy based on Symmetrical Uncertainty (SU) and a Multi-Layer Perceptron (MLP), in which the features are distributed across multiple clusters. Each cluster holds a finite number of features. An MLP is applied to each cluster, and the cluster with the highest accuracy and the lowest root mean square (RMS) error is nominated as the dominant cluster.
Result: Classification accuracy is measured with Ridor, Simple Cart (SC), KNN, and SVM using the dominant cluster's features. The performance of this cluster is compared with traditional filter-based ranking techniques such as Information Gain (IG), Gain Ratio Attribute Evaluator (GRAE), and Chi-Squared Attribute Evaluator (Chi). Applied to seven well-known high-dimensional datasets and one lower-dimensional dataset, the proposed method recorded approximately a 57% success rate and an 18% competitive rate against the traditional methods.
Conclusion: The proposed methodology was applied to very high-dimensional microarray datasets. With this method, memory consumption will be reduced and classification performance can be improved.
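The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how features are assigned to clusters, so dealing SU-ranked features out round-robin is an assumption, as is breaking accuracy ties by the lower RMS error. The function names (`symmetrical_uncertainty`, `distribute_by_su`, `pick_dominant`) are hypothetical, features are assumed to be already discretized, and the per-cluster MLP evaluation is left out; `pick_dominant` only consumes its (accuracy, RMS error) results.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    hxy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy      # information gain I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


def distribute_by_su(features, labels, n_clusters):
    """Rank features by SU with the class label, then deal them out
    round-robin so every cluster gets a mix of strong and weak features.
    ASSUMPTION: the round-robin rule is illustrative, not from the paper."""
    ranked = sorted(
        features,
        key=lambda name: symmetrical_uncertainty(features[name], labels),
        reverse=True,
    )
    clusters = [[] for _ in range(n_clusters)]
    for i, name in enumerate(ranked):
        clusters[i % n_clusters].append(name)
    return clusters


def pick_dominant(scores):
    """Return the index of the dominant cluster.
    scores: one (accuracy, rms_error) pair per cluster.
    ASSUMPTION: highest accuracy wins; ties go to the lower RMS error."""
    return max(range(len(scores)), key=lambda i: (scores[i][0], -scores[i][1]))
```

In use, each cluster's feature subset would be passed to an MLP (for example through cross-validation), and the resulting (accuracy, RMS error) pairs handed to `pick_dominant`; the features of the winning cluster are then the ones given to the downstream classifiers.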
Pages: 171-176
Number of pages: 6
Related papers
50 in total
  • [31] Feature selection methods in microarray gene expression data: a systematic mapping study
    Mahnaz Vahmiyan
    Mohammadtaghi Kheirabadi
    Ebrahim Akbari
    Neural Computing and Applications, 2022, 34 : 19675 - 19702
  • [32] Feature selection methods in microarray gene expression data: a systematic mapping study
    Vahmiyan, Mahnaz
    Kheirabadi, Mohammadtaghi
    Akbari, Ebrahim
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (22) : 19675 - 19702
  • [33] A Top-r Feature Selection Algorithm for Microarray Gene Expression Data
    Sharma, Alok
    Imoto, Seiya
    Miyano, Satoru
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (03) : 754 - 764
  • [34] Feature Selection in Microarray Gene Expression Data Using Fisher Discriminant Ratio
    Sarbazi-Azad, Saeed
    Abadeh, Mohammad Saniee
    Abadi, Mehdi Irannejad Najaf
    2018 8TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2018 : 225 - 230
  • [35] Microarray Gene Expression Dataset Feature Selection and Classification with Swarm Optimization to Diagnosis Diseases
    Krishna, Peddarapu Rama
    Rajarajeswari, Pothuraju
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (07) : 536 - 546
  • [36] Efficient gene selection for classification of microarray data
    Ho, SY
    Lee, CC
    Chen, HM
    Huang, HL
    2005 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-3, PROCEEDINGS, 2005 : 1753 - 1760
  • [37] Gene selection for cancer classification in microarray data
    Zhang, Lijuan
    Li, Zhoujun
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2009, 46 (05) : 794 - 802
  • [38] Spatial clustering based gene selection for gene expression analysis in microarray data classification
    Dhas, P. Edwin
    Lalitha, S.
    Govindaraj, Annalakshmi
    Jyoshna, B.
    AUTOMATIKA, 2024, 65 (01) : 152 - 158
  • [39] Cancer Classification through Feature Selection and Transductive SVM Using Gene Microarray Data
    Chakraborty, Debasis
    Das, Shibu
    2012 THIRD INTERNATIONAL CONFERENCE ON EMERGING APPLICATIONS OF INFORMATION TECHNOLOGY (EAIT), 2012 : 77 - 80
  • [40] A new distributed feature selection technique for classifying gene expression data
    Ayyad, Sarah M.
    Saleh, Ahmed I.
    Labib, Labib M.
    INTERNATIONAL JOURNAL OF BIOMATHEMATICS, 2019, 12 (04)