Feature Selection for Microarray Gene Expression Data Using Simulated Annealing Guided by the Multivariate Joint Entropy

Cited by: 14
Authors
Fernando Gonzalez-Navarro, Felix [1]
Belanche-Munoz, Lluis A. [2]
Affiliations
[1] Univ Autonoma Baja California, Inst Ingn, Mexicali, Baja California, Mexico
[2] Univ Politecn Cataluna, Dept Llenguatges & Sistemes Informat, Barcelona, Spain
Source
COMPUTACION Y SISTEMAS | 2014 / Vol. 18 / No. 02
Keywords
Feature selection; microarray gene expression data; multivariate joint entropy; simulated annealing
DOI
10.13053/CyS-18-2-2014-032
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Microarray classification poses many challenges for data analysis, because a gene expression data set may consist of only dozens of observations but thousands or even tens of thousands of genes. In this context, feature subset selection techniques are useful for reducing the representation space to one that classification techniques can manage. In this work we use the discretized multivariate joint entropy as the basis for a fast evaluation of gene relevance in a microarray gene expression context. The proposed algorithm combines a simulated annealing schedule designed specifically for feature subset selection with an incrementally computed joint entropy, which reuses previous values to compute the relevance of the current feature subset. This combination turns out to be a powerful tool when applied to the maximization of gene subset relevance. Our method delivers highly interpretable solutions that are more accurate than those of competing methods. The algorithm is fast, effective and has no critical parameters. Experimental results on several public-domain microarray data sets show notably high classification performance and small gene subsets, formed mostly by biologically meaningful genes. The technique is general and could be used in other similar scenarios.
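The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that subset relevance is measured as the mutual information I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y) computed from discretized joint entropies, and it uses a plain geometric cooling schedule. The function names (`joint_entropy`, `relevance`, `anneal_select`) and all parameter values are hypothetical, and the incremental reuse of entropy values mentioned in the abstract is not reproduced here.

```python
import math
import random
from collections import Counter


def joint_entropy(columns):
    """Shannon entropy (in bits) of the joint distribution of discrete columns."""
    rows = list(zip(*columns))  # one tuple per observation
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def relevance(data, labels, subset):
    """Assumed score: I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y)."""
    if not subset:
        return 0.0
    cols = [data[j] for j in sorted(subset)]
    return (joint_entropy(cols) + joint_entropy([labels])
            - joint_entropy(cols + [labels]))


def anneal_select(data, labels, t0=1.0, cooling=0.9,
                  steps_per_temp=20, t_min=1e-3, seed=0):
    """Simulated annealing over feature subsets (illustrative schedule only).

    data is a list of feature columns, each a list of discrete values.
    """
    rng = random.Random(seed)
    n_features = len(data)
    current = {rng.randrange(n_features)}
    current_score = relevance(data, labels, current)
    best, best_score = set(current), current_score
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            # Neighbour move: add or remove one randomly chosen feature.
            candidate = current ^ {rng.randrange(n_features)}
            if not candidate:
                continue  # never evaluate the empty subset
            score = relevance(data, labels, candidate)
            delta = score - current_score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_score = candidate, score
                # Prefer higher relevance, then smaller subsets on ties.
                if score > best_score or (
                        score == best_score and len(candidate) < len(best)):
                    best, best_score = set(candidate), score
        t *= cooling
    return best, best_score


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    data = [
        [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: fully informative
        [0, 1, 0, 1, 0, 1, 0, 1],  # feature 1: independent of the labels
        [0, 0, 0, 0, 0, 0, 0, 0],  # feature 2: constant
    ]
    print(anneal_select(data, labels))
```

With only dozens of observations, a joint-entropy score of this kind saturates after a handful of genes, which is one plausible reason such a search would favor small subsets.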
Pages: 275-293
Page count: 19