Distributed Bayesian Inference in Massive Spatial Data

Cited by: 3
Authors
Guhaniyogi, Rajarshi [1]
Li, Cheng [2]
Savitsky, Terrance [3]
Srivastava, Sanvesh [4]
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] Natl Univ Singapore, Dept Stat & Data Sci, Singapore, Singapore
[3] US Bur Lab Stat, Washington, DC 20212 USA
[4] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52240 USA
Funding
National Science Foundation (USA);
Keywords
Distributed Bayesian inference; Gaussian process; low-rank Gaussian process; massive spatial data; Wasserstein barycenter; GAUSSIAN PROCESS MODELS; DIVIDE-AND-CONQUER; APPROXIMATION; RATES; LIKELIHOODS; PREDICTION; REGRESSION; CLUSTERS; FIELDS;
DOI
10.1214/22-STS868
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
Gaussian process (GP) regression is computationally expensive in spatial applications involving massive data. Various methods address this limitation, including a small number of Bayesian methods based on distributed computations (or the divide-and-conquer strategy). Focusing on the latter literature, we achieve three main goals. First, we develop an extensible Bayesian framework for distributed spatial GP regression that embeds many popular methods. The proposed framework has three steps that partition the entire data into many subsets, apply a readily available Bayesian spatial process model in parallel on all the subsets, and combine the posterior distributions estimated on all the subsets into a pseudo posterior distribution that conditions on the entire data. The combined pseudo posterior distribution replaces the full data posterior distribution in prediction and inference problems. Demonstrating our framework's generality, we extend posterior computations for (nondistributed) spatial process models with stationary full-rank and nonstationary low-rank GP priors to the distributed setting. Second, we contrast the empirical performance of popular distributed approaches with some widely-used, nondistributed alternatives and highlight their relative advantages and shortcomings. Third, we provide theoretical support for our numerical observations and show that the Bayes L2-risks of the combined posterior distributions obtained from a subclass of the divide-and-conquer methods achieve the near-optimal convergence rate in estimating the true spatial surface with various types of covariance functions. Additionally, we provide upper bounds on the number of subsets to achieve these near-optimal rates.
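The three steps named in the abstract (partition, fit in parallel, combine) can be illustrated with a minimal sketch. This is not the authors' implementation, and several simplifications are assumptions made here: the data are one-dimensional, the GP uses a squared-exponential kernel with fixed hyperparameters rather than a full Bayesian spatial process model, the subsets are fitted in a plain loop rather than in parallel, and the combination step averages sorted posterior draws across subsets. Quantile averaging is the Wasserstein barycenter for univariate distributions, which is why it is used; the abstract itself does not say which combination rule the paper adopts. All function names and default values are hypothetical.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between 1-D location vectors a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior_draws(x, y, x_new, noise_var, n_draws, rng):
    """Marginal posterior draws of the surface at x_new from one data subset.

    Hyperparameters are held fixed (an assumption of this sketch), so the
    posterior at each new location is Gaussian with a closed-form mean and
    variance.
    """
    K = sq_exp_kernel(x, x) + noise_var * np.eye(len(x))
    K_star = sq_exp_kernel(x_new, x)
    mean = K_star @ np.linalg.solve(K, y)
    cov = sq_exp_kernel(x_new, x_new) - K_star @ np.linalg.solve(K, K_star.T)
    sd = np.sqrt(np.clip(np.diag(cov), 1e-12, None))
    # Draw independently at each location: only the marginals are used below.
    return mean + sd * rng.standard_normal((n_draws, len(x_new)))


def combine_by_quantile_averaging(subset_draws):
    """Average sorted draws across subsets, location by location.

    subset_draws has shape (k, n_draws, n_locations). The result has shape
    (n_draws, n_locations); each column holds empirical quantiles of the
    combined marginal distribution at that location.
    """
    return np.sort(subset_draws, axis=1).mean(axis=0)


def distributed_gp(x, y, x_new, k, noise_var=0.01, n_draws=500, seed=0):
    """Partition the data into k random subsets, fit each, then combine."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), k)  # step 1: partition
    draws = np.stack([                                  # step 2: fit subsets
        gp_posterior_draws(x[idx], y[idx], x_new, noise_var, n_draws, rng)
        for idx in parts
    ])
    return combine_by_quantile_averaging(draws)         # step 3: combine


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 1.0, 600)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(600)
    combined = distributed_gp(x, y, np.array([0.25, 0.75]), k=3)
    print(combined.mean(axis=0))  # combined posterior means at the two locations
```

Because each column is sorted independently, the rows of the combined array are not joint draws across locations; the sketch only approximates the marginal distribution at each location. Each subset posterior here is also based on a fraction of the data, so the combined spread reflects subset-sized samples unless the subset fits are adjusted.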
Pages: 262-284
Page count: 23