DISTRIBUTED SUFFICIENT DIMENSION REDUCTION FOR HETEROGENEOUS MASSIVE DATA

Cited by: 4
Authors
Xu, Kelin [1]
Zhu, Liping [2, 3]
Fan, Jianqing [4]
Affiliations
[1] Fudan Univ, Sch Publ Hlth, Shanghai, Peoples R China
[2] Renmin Univ China, Ctr Appl Stat, Beijing, Peoples R China
[3] Renmin Univ China, Inst Stat & Big Data, Beijing, Peoples R China
[4] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ USA
Funding
Beijing Natural Science Foundation
Keywords
Cumulative slicing estimation; distributed estimation; heterogeneity; sliced inverse regression; sufficient dimension reduction; confidence intervals; asymptotics
DOI
10.5705/ss.202021.0031
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We propose a distributed sufficient dimension reduction method to process massive data characterized by high dimensionality, huge sample sizes, and heterogeneity. To address the high dimensionality, we replace the high-dimensional explanatory variables with a small number of linear projections that are sufficient to explain the variability of the response variable. We allow for distinct function maps for data scattered at different locations, thus addressing the problem of heterogeneity. We assume that the dimension reduction subspaces at different local nodes are identical. This allows us to aggregate the results obtained from each local node to yield a final estimate on a central server. We explicitly examine sliced inverse regression and cumulative slicing estimation, and investigate the nonasymptotic error bounds of the resulting dimension reduction. Our theoretical results are further supported by simulation studies and an application to meta-genome data from the American Gut Project.
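The aggregation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names (`local_sir_projection`, `aggregate_projections`), the slice count, the simulated link functions, and the rule of averaging local projection matrices on the server are all assumptions made for illustration. Each node runs ordinary sliced inverse regression on its own data, and the server combines the local projection matrices and extracts the leading eigenvectors.

```python
import numpy as np

def local_sir_projection(X, y, d, n_slices=10):
    """Run sliced inverse regression on one node (hypothetical helper).

    Returns the p x p projection matrix onto the d estimated directions.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Slice the sample by the order of the response.
    slices = np.array_split(np.argsort(y), n_slices)
    # Weighted covariance of the slice means of X (the SIR kernel matrix).
    M = np.zeros((p, p))
    for idx in slices:
        m = X[idx].mean(axis=0) - mu
        M += (len(idx) / n) * np.outer(m, m)
    # Solve M v = lambda * Sigma v through the symmetric form
    # Sigma^{-1/2} M Sigma^{-1/2}.
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    _, V = np.linalg.eigh(Sigma_inv_sqrt @ M @ Sigma_inv_sqrt)
    B = Sigma_inv_sqrt @ V[:, -d:]  # eigh sorts ascending: keep the last d
    Q, _ = np.linalg.qr(B)          # orthonormal basis of the local subspace
    return Q @ Q.T

def aggregate_projections(projections, d):
    """Server step (assumed rule): average local projections, keep top d eigenvectors."""
    P_bar = np.mean(projections, axis=0)
    _, V = np.linalg.eigh(P_bar)
    return V[:, -d:]

# Simulated heterogeneous nodes: the same direction e_1, different link functions.
rng = np.random.default_rng(0)
p, n, d = 6, 2000, 1
links = [lambda t: t, np.exp, lambda t: t ** 3]  # illustrative choices only
projections = []
for f in links:
    X = rng.standard_normal((n, p))
    y = f(X[:, 0]) + 0.5 * rng.standard_normal(n)
    projections.append(local_sir_projection(X, y, d))

B_hat = aggregate_projections(projections, d)
print(np.round(np.abs(B_hat[:, 0]), 3))
```

Only the p x p projection matrices travel to the server in this sketch, so the communication cost does not depend on the local sample sizes. Averaging projection matrices rather than raw direction vectors avoids the sign and rotation ambiguity of eigenvectors across nodes.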
Pages: 2455-2476
Number of pages: 22
Related papers
50 in total
  • [31] EFFICIENT ESTIMATION IN SUFFICIENT DIMENSION REDUCTION
    Ma, Yanyuan
    Zhu, Liping
    ANNALS OF STATISTICS, 2013, 41 (01): : 250 - 268
  • [32] Sufficient dimension reduction with missing predictors
    Li, Lexin
    Lu, Wenbin
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2008, 103 (482) : 822 - 831
  • [33] Sufficient dimension reduction with additional information
    Hung, Hung
    Liu, Chih-Yen
    Lu, Henry Horng-Shing
    BIOSTATISTICS, 2016, 17 (03) : 405 - 421
  • [34] Sufficient Dimension Reduction for Censored Predictors
    Tomassi, Diego
    Forzani, Liliana
    Bura, Efstathia
    Pfeiffer, Ruth
    BIOMETRICS, 2017, 73 (01) : 220 - 231
  • [35] On hierarchical clustering in sufficient dimension reduction
    Yoo, Chaeyeon
    Yoo, Younju
    Um, Hye Yeon
    Yoo, Jae Keun
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2020, 27 (04) : 431 - 443
  • [36] A unified approach to sufficient dimension reduction
    Xue, Yuan
    Wang, Qin
    Yin, Xiangrong
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2018, 197 : 168 - 179
  • [37] Diagnostic studies in sufficient dimension reduction
    Chen, Xin
    Cook, R. Dennis
    Zou, Changliang
    BIOMETRIKA, 2015, 102 (03) : 545 - 558
  • [38] Sparse sufficient dimension reduction with heteroscedasticity
    Cheng, Haoyang
    Cui, Wenquan
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2022, 20 (01)
  • [39] Sufficient dimension reduction and prediction in regression
    Adragni, Kofi P.
    Cook, R. Dennis
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2009, 367 (1906): : 4385 - 4405
  • [40] DEEP NONLINEAR SUFFICIENT DIMENSION REDUCTION
    Chen, YinFeng
    Jiao, YuLing
    Qiu, Rui
    Hu, Zhou
    ANNALS OF STATISTICS, 2024, 52 (03): : 1201 - 1226