A kernel-based clustering method for gene selection with gene expression data

Cited by: 48
Authors
Chen, Huihui [1]
Zhang, Yusen [1]
Gutman, Ivan [2]
Affiliations
[1] Shandong Univ Weihai, Sch Math & Stat, Weihai 264209, Peoples R China
[2] Univ Kragujevac, Fac Sci, POB 60, Kragujevac 34000, Serbia
Keywords
Gene expression data; Kernel-based clustering; Adaptive distance; Gene selection; Cancer classification; Prediction; Algorithm; Discovery
DOI
10.1016/j.jbi.2016.05.007
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Gene selection is important for cancer classification based on gene expression data because such data combine high dimensionality with small sample sizes. In this paper, we present a new gene selection method based on clustering, in which dissimilarity measures are obtained through kernel functions. The method iteratively searches for the best gene weights while simultaneously optimizing the clustering objective function. An adaptive distance is used to learn the gene weights during clustering, which improves the performance of the algorithm. The proposed algorithm is simple and does not require any modification or parameter optimization for each dataset. We tested it on eight publicly available datasets using two classifiers (support vector machine and k-nearest neighbor) and compared it with six other competitive feature selectors. The results show that the proposed algorithm is capable of achieving better accuracies and may be an efficient tool for finding possible biomarkers from gene expression data. © 2016 Elsevier Inc. All rights reserved.
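The abstract does not state the objective function or the update rules, so the following is only an illustrative sketch of the general idea (kernel-based dissimilarities combined with iteratively learned gene weights), not the authors' algorithm. It assumes one Gaussian kernel per gene, a dissimilarity of the form sum_j w_j * 2 * (1 - K_j), weights constrained to have product 1, and initial prototypes taken from the first k samples. The names gauss, cluster_with_gene_weights, and sigma2 are hypothetical.

```python
import math

def gauss(a, b, s2):
    """Gaussian kernel between two scalar expression values (bandwidth s2)."""
    return math.exp(-((a - b) ** 2) / (2.0 * s2))

def cluster_with_gene_weights(X, k, sigma2, n_iter=20):
    """Illustrative kernel clustering with adaptive per-gene weights.

    X is a list of samples, each a list of gene expression values.
    sigma2 holds one assumed kernel bandwidth per gene.
    Returns (labels, weights); a larger weight marks a more compact gene.
    """
    n_genes = len(X[0])
    protos = [list(X[c]) for c in range(k)]  # assumption: first k samples
    weights = [1.0] * n_genes
    labels = [0] * len(X)
    for _ in range(n_iter):
        # 1) Assign each sample to the prototype with the smallest
        #    weighted kernel dissimilarity.
        for i, x in enumerate(X):
            dists = [
                sum(weights[j] * 2.0 * (1.0 - gauss(x[j], g[j], sigma2[j]))
                    for j in range(n_genes))
                for g in protos
            ]
            labels[i] = dists.index(min(dists))
        # 2) Update prototypes as kernel-weighted means, gene by gene.
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if not members:
                continue  # keep the old prototype for an empty cluster
            for j in range(n_genes):
                kv = [gauss(x[j], protos[c][j], sigma2[j]) for x in members]
                protos[c][j] = sum(w * x[j] for w, x in zip(kv, members)) / sum(kv)
        # 3) Update gene weights: genes with low within-cluster dispersion
        #    get high weights; the product of all weights stays at 1.
        disp = [
            sum(2.0 * (1.0 - gauss(x[j], protos[lab][j], sigma2[j]))
                for x, lab in zip(X, labels)) + 1e-12
            for j in range(n_genes)
        ]
        log_gm = sum(math.log(d) for d in disp) / n_genes
        weights = [math.exp(log_gm) / d for d in disp]
    return labels, weights

# Toy data: gene 0 separates two groups, gene 1 is noise.
X = [[0.0, 1.0], [5.0, 1.2], [0.2, 3.0], [5.1, 2.9], [0.1, 2.0], [4.9, 0.5]]
labels, weights = cluster_with_gene_weights(X, k=2, sigma2=[1.0, 1.0])
# Rank genes from highest to lowest learned weight.
ranking = sorted(range(len(weights)), key=lambda j: -weights[j])
```

Under these assumptions, gene selection amounts to keeping the top-ranked genes in ranking. The fixed bandwidths and the deterministic initialization are simplifications chosen to keep the sketch short.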
Pages: 12-20
Number of pages: 9