Community detection in general stochastic block models: fundamental limits and efficient algorithms for recovery

Cited by: 208
Authors
Abbe, Emmanuel [1]
Sandon, Colin [2]
Affiliations
[1] Princeton Univ, PACM & EE Dept, Princeton, NJ 08544 USA
[2] Princeton Univ, Dept Math, Princeton, NJ 08544 USA
Keywords
Community detection; stochastic block models; phase transitions; clustering algorithms; information measures; graph-based codes; BLOCKMODELS; GRAPHS;
DOI
10.1109/FOCS.2015.47
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
New phase transition phenomena have recently been discovered for the stochastic block model in the special case of two non-overlapping symmetric communities. These thresholds give rise, in particular, to new algorithmic challenges. This paper investigates whether a general phenomenon takes place for multiple communities, without imposing symmetry. In the general stochastic block model $\mathrm{SBM}(n, p, W)$, $n$ vertices are split into $k$ communities of relative size $\{p_i\}_{i \in [k]}$, and vertices in communities $i$ and $j$ connect independently with probability $\{W_{i,j}\}_{i,j \in [k]}$. This paper investigates the partial and exact recovery of communities in the general SBM (in the constant and logarithmic degree regimes), and uses the generality of the results to tackle overlapping communities. The contributions of the paper are: (i) an explicit characterization of the recovery threshold in the general SBM in terms of a new f-divergence function $D_+$, which generalizes the Hellinger and Chernoff divergences, and which gives an operational meaning to a divergence function, analogous to that of the KL-divergence in the channel coding theorem; (ii) the development of an algorithm that recovers the communities all the way down to the optimal threshold and runs in quasi-linear time, showing that exact recovery has no information-theoretic to computational gap for multiple communities; (iii) the development of an efficient algorithm that detects communities in the constant degree regime with an explicit accuracy bound that can be made arbitrarily close to 1 when a prescribed signal-to-noise ratio (defined in terms of the spectrum of $\mathrm{diag}(p)W$) tends to infinity.
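The generative model described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and it does not implement their recovery algorithms; it only draws one graph from SBM(n, p, W). The function name `sample_sbm` is hypothetical, W is assumed to be symmetric, and community labels are assumed to be drawn independently from p (so community sizes match p only on average).

```python
import random
from itertools import combinations


def sample_sbm(n, p, W, seed=None):
    """Draw one graph from SBM(n, p, W) (illustrative sketch).

    n: number of vertices.
    p: list of k community proportions summing to 1.
    W: k x k list of lists; W[i][j] is the probability that a vertex in
       community i and a vertex in community j are connected.
       Assumed symmetric (an assumption of this sketch).
    Returns (labels, edges): labels[v] is the community of vertex v,
    edges is a list of pairs (u, v) with u < v.
    """
    k = len(p)
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("community proportions p must sum to 1")
    if len(W) != k or any(len(row) != k for row in W):
        raise ValueError("W must be a k x k matrix")

    rng = random.Random(seed)
    # Assumption: labels are i.i.d. draws from p.
    labels = rng.choices(range(k), weights=p, k=n)
    # Each unordered pair is connected independently of all others.
    edges = [
        (u, v)
        for u, v in combinations(range(n), 2)
        if rng.random() < W[labels[u]][labels[v]]
    ]
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_sbm(200, [0.5, 0.3, 0.2],
                               [[0.20, 0.02, 0.02],
                                [0.02, 0.20, 0.02],
                                [0.02, 0.02, 0.20]], seed=1)
    print(len(edges), "edges sampled")
```

The pairwise loop takes time quadratic in n, which is acceptable for small experiments. In the sparse regimes the abstract mentions, the entries of W would scale like 1/n (constant degree) or log(n)/n (logarithmic degree).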
Pages: 670 - 688
Number of pages: 19
Related papers
50 in total
  • [21] A scalable community detection algorithm for large graphs using stochastic block models
    Peng, Chengbin
    Zhang, Zhihua
    Wong, Ka-Chun
    Zhang, Xiangliang
    Keyes, David E.
    INTELLIGENT DATA ANALYSIS, 2017, 21 (06) : 1463 - 1485
  • [22] Dynamic stochastic block models: parameter estimation and detection of changes in community structure
    Matthew Ludkin
    Idris Eckley
    Peter Neal
    Statistics and Computing, 2018, 28 : 1201 - 1213
  • [23] Graph Theoretic and Stochastic Block Models Integrated with Matrix Factorization for Community Detection
    McGarry, Ken
    ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS, UKCI 2022, 2024, 1454 : 297 - 311
  • [24] Dynamic stochastic block models: parameter estimation and detection of changes in community structure
    Ludkin, Matthew
    Eckley, Idris
    Neal, Peter
    STATISTICS AND COMPUTING, 2018, 28 (06) : 1201 - 1213
  • [25] Rate optimal Chernoff bound and application to community detection in the stochastic block models
    Zhou, Zhixin
    Li, Ping
    ELECTRONIC JOURNAL OF STATISTICS, 2020, 14 (01) : 1302 - 1347
  • [26] A Scalable Community Detection Algorithm for Large Graphs Using Stochastic Block Models
    Peng, Chengbin
    Zhang, Zhihua
    Wong, Ka-Chun
    Zhang, Xiangliang
    Keyes, David E.
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015 : 2090 - 2096
  • [27] Stochastic integer programming: General models and algorithms
    Willem K. Klein Haneveld
    Maarten H. van der Vlerk
    Annals of Operations Research, 1999, 85 : 39 - 57
  • [28] Stochastic integer programming: General models and algorithms
    Haneveld, WKK
    van der Vlerk, MH
    ANNALS OF OPERATIONS RESEARCH, 1999, 85 (0) : 39 - 57
  • [29] Exact Recovery in the General Hypergraph Stochastic Block Model
    Zhang, Qiaosheng
    Tan, Vincent Y. F.
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2023, 69 (01) : 453 - 471
  • [30] Efficient Near-Optimal Testing of Community Changes in Balanced Stochastic Block Models
    Gangrade, Aditya
    Venkatesh, Praveen
    Nazer, Bobak
    Saligrama, Venkatesh
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32