Robust distributed modal regression for massive data

Cited by: 33
Authors
Wang, Kangning [1]
Li, Shaomin [2, 3]
Affiliations
[1] Shandong Technol & Business Univ, Sch Stat, Yantai, Peoples R China
[2] Beijing Normal Univ, Ctr Stat & Data Sci, Zhuhai, Peoples R China
[3] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
Massive data; Robustness; Communication-efficient; Modal regression; Variable selection; Likelihood; Lasso
DOI
10.1016/j.csda.2021.107225
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Modal regression is a good alternative to mean regression and likelihood-based methods because of its robustness and high efficiency. This paper proposes a robust, communication-efficient distributed modal regression for distributed massive data. Specifically, the global modal regression objective function is approximated at the first machine by a surrogate that relates to the local datasets only through gradients. The resulting estimator can then be computed at the first machine, while the other machines only need to calculate gradients, which significantly reduces the communication cost. Under mild conditions, asymptotic properties are established showing that the proposed estimator is statistically as efficient as the global modal regression estimator. As a specific application, a penalized robust communication-efficient distributed modal regression procedure for variable selection is also developed. Simulation results and a real data analysis are included to validate the method. (C) 2021 Elsevier B.V. All rights reserved.
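
The surrogate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the Gaussian kernel, the bandwidth `h`, the least-squares initial value, the plain gradient-descent solver, and the gradient-shifted form of the surrogate (borrowed from the generic communication-efficient surrogate-loss construction) are all assumptions made for illustration. The penalized variable-selection step is not covered.

```python
import numpy as np

def kernel(r, h):
    """Gaussian kernel phi_h(r) = phi(r / h) / h (an assumed kernel choice)."""
    return np.exp(-0.5 * (r / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def local_gradient(beta, X, y, h):
    """Gradient of the local loss L_k(beta) = -mean_i phi_h(y_i - x_i' beta)."""
    r = y - X @ beta
    w = kernel(r, h) * r / h ** 2
    return -(X * w[:, None]).mean(axis=0)

def surrogate_estimate(machines, beta_init, h=1.0, step=1.0, n_iter=300):
    """Minimize L_1(beta) - <grad L_1(beta_init) - grad L_N(beta_init), beta>
    on machine 1 by gradient descent (assumed solver), starting at beta_init."""
    X1, y1 = machines[0]
    n_total = sum(len(y) for _, y in machines)
    # One communication round: every machine sends a single p-vector.
    global_grad = sum(
        len(y) * local_gradient(beta_init, X, y, h) for X, y in machines
    ) / n_total
    shift = local_gradient(beta_init, X1, y1, h) - global_grad
    beta = beta_init.copy()
    for _ in range(n_iter):
        # Gradient of the surrogate: local gradient minus the fixed shift.
        beta = beta - step * (local_gradient(beta, X1, y1, h) - shift)
    return beta

# Simulated data (hypothetical): 10 machines, 500 rows each, 10% heavy outliers.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0, -1.0])
machines = []
for _ in range(10):
    X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
    is_outlier = rng.random(500) < 0.1
    eps = np.where(is_outlier,
                   rng.normal(scale=10.0, size=500),
                   rng.normal(scale=0.5, size=500))
    machines.append((X, X @ beta_true + eps))

# Initial value from machine 1 only (least squares, an assumed choice).
X1, y1 = machines[0]
beta_init = np.linalg.lstsq(X1, y1, rcond=None)[0]
beta_hat = surrogate_estimate(machines, beta_init)
print(beta_hat)
```

In this sketch each machine transmits only one gradient vector of length p, never its raw data, which is where the communication saving comes from. Because the kernel loss is not convex, gradient descent finds a local minimizer near the initial value, so the quality of `beta_init` matters.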
Pages: 19
Related papers
50 in total
  • [31] Quantile regression for massive data set
    Jiang, Rong
    Chen, Shi
    Wang, Fu-Ya
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024, 53 (12) : 5875 - 5883
  • [32] Robust reduced rank regression in a distributed setting
    Xi Chen
    Weidong Liu
    Xiaojun Mao
    Science China(Mathematics), 2022, 65 (08) : 1707 - 1730
  • [33] Robust reduced rank regression in a distributed setting
    Xi Chen
    Weidong Liu
    Xiaojun Mao
    Science China Mathematics, 2022, 65 : 1707 - 1730
  • [34] Robust reduced rank regression in a distributed setting
    Chen, Xi
    Liu, Weidong
    Mao, Xiaojun
    SCIENCE CHINA-MATHEMATICS, 2022, 65 (08) : 1707 - 1730
  • [35] DISTRIBUTED STATISTICAL INFERENCE FOR MASSIVE DATA
    Chen, Song Xi
    Peng, Liuhua
    ANNALS OF STATISTICS, 2021, 49 (05) : 2851 - 2869
  • [36] Modal regression for fixed effects panel data
    Ullah, Aman
    Wang, Tao
    Yao, Weixin
    EMPIRICAL ECONOMICS, 2021, 60 (01) : 261 - 308
  • [37] Modal regression for fixed effects panel data
    Aman Ullah
    Tao Wang
    Weixin Yao
    Empirical Economics, 2021, 60 : 261 - 308
  • [38] Robust Variable Selection and Estimation Based on Kernel Modal Regression
    Guo, Changying
    Song, Biqin
    Wang, Yingjie
    Chen, Hong
    Xiong, Huijuan
    ENTROPY, 2019, 21 (04)
  • [39] Distributed Least Product Relative Error estimation for semi-parametric multiplicative regression with massive data
    Zou, Yuhao
    Yuan, Xiaohui
    Liu, Tianqing
    INFORMATION SCIENCES, 2025, 691
  • [40] Asynchronous and Distributed Data Augmentation for Massive Data Settings
    Zhou, Jiayuan
    Khare, Kshitij
    Srivastava, Sanvesh
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2023, 32 (03) : 895 - 907