Validating clustering for gene expression data

Cited by: 443
Authors
Yeung, KY [1 ]
Haynor, DR [1 ]
Ruzzo, WL [1 ]
Affiliation
[1] Univ Washington, Seattle, WA 98195 USA
Funding
National Science Foundation (USA)
DOI
10.1093/bioinformatics/17.4.309
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Many clustering algorithms have been proposed for the analysis of gene expression data, but little guidance is available to help choose among them. We provide a systematic framework for assessing the results of clustering algorithms. Clustering algorithms attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. Our methodology is to apply a clustering algorithm to the data from all but one experimental condition. The remaining condition is then used to assess the predictive power of the resulting clusters: meaningful clusters should exhibit less variation in the remaining condition than clusters formed by chance.
Results: We successfully applied our methodology to compare six clustering algorithms on four gene expression data sets. We found our quantitative measures of cluster quality to be positively correlated with external standards of cluster quality.
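The leave-one-condition-out idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact quality measure, so everything below is an assumption made for illustration: the score is the root-mean-square deviation of each gene from its cluster mean in the left-out condition, the "chance" baseline is a random permutation of the same labels, plain k-means stands in for whichever clustering algorithm is under test, and the data are synthetic. The function names (`kmeans`, `left_out_rmsd`, `leave_one_condition_out`) are hypothetical.

```python
import numpy as np

def kmeans(data, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; a stand-in for any clustering algorithm."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every gene to every center.
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels

def left_out_rmsd(values, labels):
    """RMS deviation of genes from their cluster mean in one condition."""
    sq = 0.0
    for j in np.unique(labels):
        v = values[labels == j]
        sq += ((v - v.mean()) ** 2).sum()
    return np.sqrt(sq / len(values))

def leave_one_condition_out(expr, k, cluster_fn=kmeans, seed=0):
    """Sum the left-out score over conditions, for clusters and for chance."""
    rng = np.random.default_rng(seed)
    clustered = chance = 0.0
    for e in range(expr.shape[1]):
        train = np.delete(expr, e, axis=1)   # all conditions except e
        labels = cluster_fn(train, k)
        clustered += left_out_rmsd(expr[:, e], labels)
        # Assumed baseline: same cluster sizes, random membership.
        chance += left_out_rmsd(expr[:, e], rng.permutation(labels))
    return clustered, chance

# Synthetic data (an assumption): 40 genes x 5 conditions, two expression patterns.
rng = np.random.default_rng(1)
pattern_a = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
pattern_b = np.array([4.0, 3.0, 2.0, 1.0, 0.0])
expr = np.vstack([pattern_a + rng.normal(0, 0.1, (20, 5)),
                  pattern_b + rng.normal(0, 0.1, (20, 5))])

clustered, chance = leave_one_condition_out(expr, k=2)
print("clustered:", clustered, "chance:", chance)
```

A lower score for the real clusters than for the permuted labels indicates that the clusters predict the held-out condition better than chance. Passing a different `cluster_fn` would allow several algorithms to be compared on the same data.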
Pages: 309-318
Page count: 10