Complementary hierarchical clustering

Cited by: 20
Authors
Nowak, Gen [1]
Tibshirani, Robert [1, 2]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Hlth Res & Policy, Stanford, CA 94305 USA
Keywords
hierarchical clustering; microarray; principal components; relative gene importance
DOI
10.1093/biostatistics/kxm046
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
When applying hierarchical clustering algorithms to cluster patient samples from microarray data, the clustering patterns generated by most algorithms tend to be dominated by groups of highly differentially expressed genes that have closely related expression patterns. Sometimes, these genes may not be relevant to the biological process under study or their functions may already be known. The problem is that these genes can potentially drown out the effects of other genes that are relevant or have novel functions. We propose a procedure called complementary hierarchical clustering that is designed to uncover the structures arising from these novel genes that are not as highly expressed. Simulation studies show that the procedure is effective when applied to a variety of examples. We also define a concept called relative gene importance that can be used to identify the influential genes in a given clustering. Finally, we analyze a microarray data set from 295 breast cancer patients, using clustering with the correlation-based distance measure. The complementary clustering reveals a grouping of the patients which is uncorrelated with a number of known prognostic signatures and significantly differing distant metastasis-free probabilities.
Pages: 467-483
Page count: 17