Hypothesis testing for automated community detection in networks

Cited by: 112
Authors
Bickel, Peter J. [1 ]
Sarkar, Purnamrita [2 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Science Foundation (USA)
Keywords
Asymptotic analysis; Community detection; Hypothesis testing; Networks; Stochastic block model; Tracy-Widom distribution; STOCHASTIC BLOCKMODELS; UNIVERSALITY; EIGENVALUES; MODEL; EDGE;
DOI
10.1111/rssb.12117
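Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code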
020208 ; 070103 ; 0714 ;
Abstract
Community detection in networks is a key exploratory tool with applications in a diverse set of areas, ranging from finding communities in social and biological networks to identifying link farms in the World Wide Web. The problem of finding communities or clusters in a network has received much attention from statistics, physics and computer science. However, most clustering algorithms assume knowledge of the number of clusters k. We propose to determine k automatically in a graph generated from a stochastic block model by using a hypothesis test of independent interest. Our main contribution is twofold. First, we theoretically establish the limiting distribution of the principal eigenvalue of the suitably centred and scaled adjacency matrix and use that distribution for our test of the hypothesis that a random graph is of Erdős-Rényi (noise) type. Secondly, we use this test to design a recursive bipartitioning algorithm, which naturally uncovers nested community structure. Using simulations and quantifiable classification tasks on real-world networks with ground truth, we show that our algorithm outperforms state-of-the-art methods.
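The test described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the helper names `sample_sbm` and `principal_eigenvalue_statistic` are hypothetical, and the centring (subtracting the estimated edge probability off the diagonal) and scaling (dividing by the square root of (n - 1) p(1 - p), then shifting by 2 and multiplying by n^(2/3)) are one plausible reading of "suitably centred and scaled". Under the Erdős-Rényi null such a statistic would be compared with an upper quantile of a Tracy-Widom law; the sketch only computes the statistic and does not hard-code a critical value.

```python
import numpy as np


def sample_sbm(sizes, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix (no self-loops) from a block model.

    Hypothetical helper: one block with p_in == p_out gives an Erdos-Renyi graph.
    """
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each undirected edge once in the strict upper triangle, then mirror it.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float)


def principal_eigenvalue_statistic(A):
    """Centred and scaled principal eigenvalue of A (assumed normalisation)."""
    n = A.shape[0]
    # Estimated edge probability under the Erdos-Renyi null.
    p_hat = A.sum() / (n * (n - 1))
    # Subtract the estimated mean on the off-diagonal entries only.
    centred = A - p_hat * (np.ones((n, n)) - np.eye(n))
    scaled = centred / np.sqrt((n - 1) * p_hat * (1 - p_hat))
    # eigvalsh returns eigenvalues in ascending order; take the largest.
    lam1 = np.linalg.eigvalsh(scaled)[-1]
    return n ** (2 / 3) * (lam1 - 2)


rng = np.random.default_rng(0)
A_er = sample_sbm([200], 0.1, 0.1, rng)          # noise only
A_sbm = sample_sbm([100, 100], 0.2, 0.02, rng)   # two planted communities

stat_er = principal_eigenvalue_statistic(A_er)
stat_sbm = principal_eigenvalue_statistic(A_sbm)
print(f"Erdos-Renyi statistic: {stat_er:.2f}")
print(f"Two-block statistic:   {stat_sbm:.2f}")
```

With planted communities the centred matrix carries a low-rank signal, so the statistic should be far larger than for the noise graph. A recursive bipartitioning procedure in the spirit of the abstract would split a graph whenever the null is rejected and then repeat the test on each part.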
Pages: 253-273
Page count: 21
Related papers
50 in total
  • [1] Spectral based hypothesis testing for community detection in complex networks
    Dong, Zhishan
    Wang, Shuangshuang
    Liu, Qun
    INFORMATION SCIENCES, 2020, 512 : 1360 - 1371
  • [2] Dissimilarity-based hypothesis testing for community detection in heterogeneous networks
    Xu, Xin-Jian
    Chen, Cheng
    Mendes, J. F. F.
    FRONTIERS IN PHYSICS, 2023, 11
  • [3] A HYPOTHESIS TESTING FRAMEWORK FOR MODULARITY BASED NETWORK COMMUNITY DETECTION
    Zhang, Jingfei
    Chen, Yuguo
    STATISTICA SINICA, 2017, 27 (01) : 437 - 456
  • [4] Water Pollution Detection Based on Hypothesis Testing in Sensor Networks
    Luo, Xu
    Yang, Jun
    JOURNAL OF SENSORS, 2017, 2017
  • [5] BAYESIAN MULTIPLE HYPOTHESIS TESTING FOR DISTRIBUTED DETECTION IN SENSOR NETWORKS
    Halme, Topi
    Golz, Martin
    Koivunen, Visa
    2019 IEEE DATA SCIENCE WORKSHOP (DSW), 2019, : 105 - 109
  • [6] Distributed Sequential Detection for Gaussian Binary Hypothesis Testing: Heterogeneous Networks
    Sahu, Anit Kumar
    Kar, Soummya
    CONFERENCE RECORD OF THE 2014 FORTY-EIGHTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2014, : 723 - 727
  • [7] Hypothesis testing for populations of networks
    Chen, Li
    Zhou, Jie
    Lin, Lizhen
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2023, 52 (11) : 3661 - 3684
  • [8] Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing
    Gao, Chao
    Ma, Zongming
    STATISTICAL SCIENCE, 2021, 36 (01) : 16 - 33
  • [9] Sybil Attack Detection using Sequential Hypothesis Testing in Wireless Sensor Networks
    Vamsi, P. Raghu
    Kant, Krishna
    2014 INTERNATIONAL CONFERENCE ON SIGNAL PROPAGATION AND COMPUTER TECHNOLOGY (ICSPCT 2014), 2014, : 698 - 702
  • [10] Hypothesis testing and decision theoretic approach for fault detection in wireless sensor networks
    Nandi, Mrinal
    Nayak, Amiya
    Roy, Bimal
    Sarkar, Santanu
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2015, 30 (04) : 262 - 285