Statistical Inference for Cluster Trees

Cited by: 0
Authors
Kim, Jisu [1]
Chen, Yen-Chi [2]
Balakrishnan, Sivaraman [1]
Rinaldo, Alessandro [1]
Wasserman, Larry [1]
Affiliations
[1] Carnegie Mellon Univ, Dept Stat, Pittsburgh, PA 15213 USA
[2] Univ Washington, Dept Stat, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A cluster tree provides a highly interpretable summary of a density function by representing the hierarchy of its high-density clusters. It is estimated using the empirical tree, which is the cluster tree constructed from a density estimator. This paper addresses the basic question of quantifying our uncertainty by assessing the statistical significance of topological features of an empirical cluster tree. We first study a variety of metrics that can be used to compare different trees, analyze their properties, and assess their suitability for inference. We then propose methods to construct and summarize confidence sets for the unknown true cluster tree. We introduce a partial ordering on cluster trees, which we use to prune some of the statistically insignificant features of the empirical tree, yielding interpretable and parsimonious cluster trees. Finally, we illustrate the proposed methods on a variety of synthetic examples and demonstrate their utility in the analysis of a Graft-versus-Host Disease (GvHD) data set.
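To make the notion of an empirical cluster tree concrete, the following is a minimal sketch, not the authors' implementation. It assumes one-dimensional data, a Gaussian kernel density estimator evaluated on a grid, and a hand-picked bandwidth. The function names `gaussian_kde_1d` and `level_set_clusters` are hypothetical. It lists the connected components of the upper level set {x : p_hat(x) >= level} at two levels. Tracking how these components split as the level increases is what the cluster tree records.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on a 1-D grid."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def level_set_clusters(grid, density, level):
    """Return the intervals (connected components) where density >= level."""
    above = density >= level
    clusters = []
    start = None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                      # a new component begins
        elif not flag and start is not None:
            clusters.append((grid[start], grid[i - 1]))
            start = None                   # the component ended at i - 1
    if start is not None:                  # component runs to the grid edge
        clusters.append((grid[start], grid[-1]))
    return clusters

# Synthetic bimodal sample (an assumption for illustration only).
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-1.5, 0.5, 200),
                          rng.normal(1.5, 0.5, 200)])
grid = np.linspace(-5.0, 5.0, 501)
density = gaussian_kde_1d(samples, grid, bandwidth=0.5)

low_clusters = level_set_clusters(grid, density, level=0.02)
high_clusters = level_set_clusters(grid, density, level=0.15)
print("level 0.02:", low_clusters)
print("level 0.15:", high_clusters)
```

In this toy setting a single component at the low level splits into two at the higher level, which corresponds to one branching point in the tree. The inference question raised in the abstract is whether such a split reflects the true density or is an artifact of sampling noise. This sketch does not address that question.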
Pages: 9