Robust distributed modal regression for massive data

Cited by: 33
Authors
Wang, Kangning [1]
Li, Shaomin [2, 3]
Affiliations
[1] Shandong Technol & Business Univ, Sch Stat, Yantai, Peoples R China
[2] Beijing Normal Univ, Ctr Stat & Data Sci, Zhuhai, Peoples R China
[3] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
Massive data; Robustness; Communication-efficient; Modal regression; Variable selection; Likelihood; Lasso
DOI
10.1016/j.csda.2021.107225
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Modal regression is a good alternative to mean regression and likelihood-based methods because of its robustness and high efficiency. This paper proposes a robust, communication-efficient distributed modal regression for massive distributed data. Specifically, the global modal regression objective function is approximated at the first machine by a surrogate that depends on the local datasets only through their gradients. The resulting estimator is then computed at the first machine, and the other machines only need to calculate gradients, which can significantly reduce the communication cost. Under mild conditions, asymptotic properties are established, which show that the proposed estimator is statistically as efficient as the global modal regression estimator. Furthermore, as a specific application, a penalized robust communication-efficient distributed modal regression variable selection procedure is developed. Simulation results and a real data analysis are included to validate the method. © 2021 Elsevier B.V. All rights reserved.
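The gradient-only communication pattern described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the record gives no formulas, so the Gaussian kernel with fixed bandwidth `h`, the gradient-shifted form of the surrogate, the OLS starting value, the plain gradient-descent solver, and all function names (`kernel_loss_grad`, `distributed_modal_fit`, `simulate`) are assumptions made for illustration only.

```python
import numpy as np

def kernel_loss_grad(beta, X, y, h):
    """Gradient of L(beta) = -(1/n) * sum_i phi_h(y_i - x_i @ beta),
    where phi_h is a Gaussian kernel with bandwidth h (an assumed choice)."""
    r = y - X @ beta
    phi = np.exp(-0.5 * (r / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)
    return -(X * (phi * r / h**2)[:, None]).mean(axis=0)

def gradient_descent(grad_fn, start, step=1.0, n_iter=500):
    """Fixed-step gradient descent; a stand-in for whatever solver is preferred."""
    beta = np.array(start, dtype=float)
    for _ in range(n_iter):
        beta = beta - step * grad_fn(beta)
    return beta

def distributed_modal_fit(machines, h=1.0):
    """machines: list of (X, y) pairs; machines[0] plays the role of the first machine."""
    X1, y1 = machines[0]
    # Initial estimate from the first machine only: OLS start, then local modal fit.
    ols, *_ = np.linalg.lstsq(X1, y1, rcond=None)
    beta0 = gradient_descent(lambda b: kernel_loss_grad(b, X1, y1, h), ols)
    # One communication round: every machine sends only its gradient at beta0.
    sizes = np.array([len(y) for _, y in machines], dtype=float)
    local_grads = np.array([kernel_loss_grad(beta0, X, y, h) for X, y in machines])
    global_grad0 = (sizes[:, None] * local_grads).sum(axis=0) / sizes.sum()
    # Assumed surrogate: the first machine's loss plus a linear term, so that the
    # surrogate gradient at beta0 matches the global gradient at beta0.
    shift = global_grad0 - local_grads[0]
    surrogate_grad = lambda b: kernel_loss_grad(b, X1, y1, h) + shift
    beta_hat = gradient_descent(surrogate_grad, beta0)
    return beta_hat, beta0, global_grad0, surrogate_grad

def simulate(n_machines=5, n_per_machine=2000, seed=0):
    """Synthetic linear model with 10% shifted outliers (illustrative data only)."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, 2.0, -1.0])
    machines = []
    for _ in range(n_machines):
        X = np.column_stack([np.ones(n_per_machine),
                             rng.normal(size=(n_per_machine, 2))])
        noise = rng.normal(size=n_per_machine)
        noise[rng.random(n_per_machine) < 0.1] += 10.0  # contaminate 10% of rows
        machines.append((X, X @ beta_true + noise))
    return machines, beta_true

machines, beta_true = simulate()
beta_hat, beta0, global_grad0, surrogate_grad = distributed_modal_fit(machines)
print("true coefficients:   ", beta_true)
print("distributed estimate:", np.round(beta_hat, 3))
```

In this sketch each machine transmits a single gradient vector of length p instead of its raw data, which is the source of the communication saving the abstract refers to. The penalized variable selection variant mentioned in the abstract is not covered here.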
Pages: 19
Related papers
50 in total
  • [1] Distributed Penalized Modal Regression for Massive Data
    Jun Jin
    Shuangzhe Liu
    Tiefeng Ma
    Journal of Systems Science and Complexity, 2023, 36 : 798 - 821
  • [2] Distributed Penalized Modal Regression for Massive Data
    Jin Jun
    Liu Shuangzhe
    Ma Tiefeng
    JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY, 2023, 36 (02) : 798 - 821
  • [3] Distributed Penalized Modal Regression for Massive Data
    JIN Jun
    LIU Shuangzhe
    MA Tiefeng
    Journal of Systems Science & Complexity, 2023, 36 (02) : 798 - 821
  • [4] Unified distributed robust regression and variable selection framework for massive data
    Wang, Kangning
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 186
  • [5] Optimal subsampling for modal regression in massive data
    Chao, Yue
    Huang, Lei
    Ma, Xuejun
    Sun, Jiajun
    METRIKA, 2024, 87 (04) : 379 - 409
  • [6] Optimal subsampling for modal regression in massive data
    Yue Chao
    Lei Huang
    Xuejun Ma
    Jiajun Sun
    Metrika, 2024, 87 : 379 - 409
  • [7] Distributed quantile regression for massive heterogeneous data
    Hu, Aijun
    Jiao, Yuling
    Liu, Yanyan
    Shi, Yueyong
    Wu, Yuanshan
    NEUROCOMPUTING, 2021, 448 : 249 - 262
  • [8] Robust communication-efficient distributed composite quantile regression and variable selection for massive data
    Wang, Kangning
    Li, Shaomin
    Zhang, Benle
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 161
  • [9] Adaptive distributed support vector regression of massive data
    Liang, Shu-na
    Sun, Fei
    Zhang, Qi
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2024, 53 (09) : 3365 - 3382
  • [10] Distributed optimal subsampling for quantile regression with massive data
    Chao, Yue
    Ma, Xuejun
    Zhu, Boya
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2024, 233