Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach

Cited by: 201
Authors
Pihur, Vasyl [1]
Datta, Susmita [1]
Datta, Somnath [1]
Affiliations
[1] Univ Louisville, Dept Bioinformat & Biostat, Louisville, KY 40202 USA
DOI
10.1093/bioinformatics/btm158
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Biologists often employ clustering techniques in the exploratory phase of microarray data analysis to discover relevant biological groupings. Given the availability of numerous clustering algorithms in the machine-learning literature, a user might want to select the one that performs best for their data set or application. Various validation measures have been proposed over the years to judge the quality of the clusters produced by a given clustering algorithm, including their biological relevance. Unfortunately, a given clustering algorithm can perform poorly under one validation measure while outperforming many other algorithms under another. A manual synthesis of results from multiple validation measures is nearly impossible in practice, especially when a large number of clustering algorithms are to be compared using several measures. An automated and objective way of reconciling the rankings is needed.

Results: Using a Monte Carlo cross-entropy algorithm, we successfully combine the ranks of a set of clustering algorithms under consideration via a weighted aggregation that optimizes a distance criterion. The proposed weighted rank aggregation allows for a far more objective and automated assessment of clustering results than a simple visual inspection. We illustrate our procedure using one simulated and three real gene expression data sets from various platforms, where we rank a total of 11 clustering algorithms using a combined examination of 10 different validation measures. The aggregate rankings were found for a given number of clusters k and also for an entire range of k.
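The aggregation idea described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the distance criterion, so a plain Spearman footrule distance with one importance weight per validation measure is assumed here, and every function name, parameter, and the toy algorithm names are hypothetical. The sketch keeps a matrix of probabilities that each algorithm occupies each rank position, samples candidate orderings from it, and re-estimates the matrix from the best-scoring (elite) samples.

```python
import random

def footrule(candidate, ranked_list):
    """Spearman footrule: sum of absolute rank-position differences."""
    pos = {item: r for r, item in enumerate(ranked_list)}
    return sum(abs(r - pos[item]) for r, item in enumerate(candidate))

def objective(candidate, lists, weights):
    """Weighted sum of distances from the candidate to every ranked list."""
    return sum(w * footrule(candidate, lst) for w, lst in zip(weights, lists))

def sample_ordering(items, prob, rng):
    """Draw one ordering, filling positions left to right without reuse."""
    remaining = list(items)
    ordering = []
    for r in range(len(items)):
        # Tiny floor keeps the total weight strictly positive.
        w = [prob[item][r] + 1e-12 for item in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        ordering.append(pick)
        remaining.remove(pick)
    return ordering

def ce_rank_aggregate(lists, weights, n_samples=200, elite_frac=0.1,
                      smooth=0.7, n_iter=30, seed=0):
    """Cross-entropy Monte Carlo search for a low-distance aggregate ranking.

    All tuning parameters are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    items = sorted(lists[0])
    n = len(items)
    # prob[item][r]: probability that `item` is placed at position r.
    prob = {item: [1.0 / n] * n for item in items}
    n_elite = max(1, int(elite_frac * n_samples))
    best, best_score = None, float("inf")
    for _ in range(n_iter):
        samples = [sample_ordering(items, prob, rng) for _ in range(n_samples)]
        samples.sort(key=lambda s: objective(s, lists, weights))
        elite = samples[:n_elite]
        top_score = objective(elite[0], lists, weights)
        if top_score < best_score:
            best, best_score = elite[0], top_score
        # Smoothed update toward the empirical frequencies in the elite set.
        for item in items:
            for r in range(n):
                freq = sum(1 for s in elite if s[r] == item) / n_elite
                prob[item][r] = smooth * freq + (1 - smooth) * prob[item][r]
    return best, best_score

# Toy input: three validation measures each rank four algorithms, best first.
example_lists = [
    ["hierarchical", "kmeans", "pam", "som"],
    ["hierarchical", "kmeans", "pam", "som"],
    ["kmeans", "hierarchical", "pam", "som"],
]
example_weights = [0.4, 0.4, 0.2]
aggregate, score = ce_rank_aggregate(example_lists, example_weights)
print(aggregate, score)
```

Because the search is stochastic, the seed is fixed for reproducibility. For a ranking over a range of k, one possible approach under the same assumptions is to pass the ranked lists from every k into a single call.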
Pages: 1607-1615
Number of pages: 9