BAYESIAN MODEL-BASED CLUSTERING FOR POPULATIONS OF NETWORK DATA

Cited by: 2
Authors
Mantziou, Anastasia [1]
Lunagomez, Simon [2]
Mitra, Robin [3]
Affiliations
[1] Alan Turing Inst, Dept Finance & Econ, London, England
[2] ITAM, Dept Estadist, Mexico City, Mexico
[3] UCL, Dept Stat Sci, London, England
Source
ANNALS OF APPLIED STATISTICS | 2024, Vol. 18, No. 1
Keywords
Bayesian models; clustering; mixture models; populations of network data; object data analysis; STATISTICAL-INFERENCE; BRAIN; CONNECTIVITY; IMPUTATION; MIXTURES; SYSTEM
DOI
10.1214/23-AOAS1789
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
There is increasing appetite for analysing populations of network data due to the fast-growing body of applications demanding such methods. While methods exist to provide readily interpretable summaries of heterogeneous network populations, these are often descriptive or ad hoc, lacking any formal justification. In contrast, principled analysis methods often provide results difficult to relate back to the applied problem of interest. Motivated by two complementary applied examples, we develop a Bayesian framework to appropriately model complex heterogeneous network populations, while also allowing analysts to gain insights from the data and make inferences most relevant to their needs. The first application involves a study in computer science measuring human movements across a university. The second analyses data from neuroscience investigating relationships between different regions of the brain. While both applications entail analysis of a heterogeneous population of networks, network sizes vary considerably. We focus on the problem of clustering the elements of a network population, where each cluster is characterised by a network representative. We take advantage of the Bayesian machinery to simultaneously infer the cluster membership, the representatives, and the community structure of the representatives, thus allowing intuitive inferences to be made. The implementation of our method on the human movement study reveals interesting movement patterns of individuals in clusters, readily characterised by their network representative. For the brain networks application, our model reveals a cluster of individuals with different network properties of particular interest in neuroscience. The performance of our method is additionally validated in extensive simulation studies.
Pages: 266-302
Number of pages: 37
Related papers
  • [21] Model-based clustering with missing not at random data
    Sportisse, Aude
    Marbac, Matthieu
    Laporte, Fabien
    Celeux, Gilles
    Boyer, Claire
    Josse, Julie
    Biernacki, Christophe
    STATISTICS AND COMPUTING, 2024, 34 (04)
  • [22] Model-based clustering and classification of functional data
    Chamroukhi, Faicel
    Nguyen, Hien D.
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2019, 9 (04)
  • [23] On model-based clustering of skewed matrix data
    Melnykov, Volodymyr
    Zhu, Xuwen
    JOURNAL OF MULTIVARIATE ANALYSIS, 2018, 167 : 181 - 194
  • [24] Model-based Clustering and Classification for Data Science
    Unwin, Antony
    INTERNATIONAL STATISTICAL REVIEW, 2020, 88 (01) : 263 - 264
  • [25] Model-based clustering of array CGH data
    Shah, Sohrab P.
    Cheung, K-John, Jr.
    Johnson, Nathalie A.
    Alain, Guillaume
    Gascoyne, Randy D.
    Horsman, Douglas E.
    Ng, Raymond T.
    Murphy, Kevin P.
    BIOINFORMATICS, 2009, 25 (12) : I30 - I38
  • [26] Model-based multidimensional clustering of categorical data
    Chen, Tao
    Zhang, Nevin L.
    Liu, Tengfei
    Poon, Kin Man
    Wang, Yi
    ARTIFICIAL INTELLIGENCE, 2012, 176 (01) : 2246 - 2269
  • [27] Model-Based Hierarchical Clustering for Categorical Data
    Alalyan, Fahdah
    Zamzami, Nuha
    Bouguila, Nizar
    2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019, : 1424 - 1429
  • [28] Model-based clustering for multivariate functional data
    Jacques, Julien
    Preda, Cristian
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2014, 71 : 92 - 106
  • [29] Penalized model-based clustering of fMRI data
    Dilernia, Andrew
    Quevedo, Karina
    Camchong, Jazmin
    Lim, Kelvin
    Pan, Wei
    Zhang, Lin
    BIOSTATISTICS, 2022, 23 (03) : 825 - 843
  • [30] Bayesian model-based clustering of temporal gene expression using autoregressive panel data approach
    Nascimento, Moyses
    Safadi, Thelma
    Fonseca e Silva, Fabyano
    Nascimento, Ana Carolina C.
    BIOINFORMATICS, 2012, 28 (15) : 2004 - 2007