Communication-efficient estimation of quantile matrix regression for massive datasets

Cited by: 2
Authors
Yang, Yaohong [1]
Wang, Lei [1]
Liu, Jiamin [2]
Li, Rui [3]
Lian, Heng [4]
Affiliations
[1] Nankai Univ, Sch Stat & Data Sci, KLMDASR, LEBPS & LPMC, Tianjin, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Math & Phys, Beijing, Peoples R China
[3] Shanghai Univ Int Business & Econ, Sch Stat & Informat, Shanghai, Peoples R China
[4] City Univ Hong Kong, Dept Math, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Communication-efficient surrogate loss; Distributed estimator; Divide and conquer; Empirical processes; High-dimensional matrix regression; ALGORITHM;
DOI
10.1016/j.csda.2023.107812
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In modern scientific applications, more and more data sets contain natural matrix predictors, and traditional regression methods are not directly applicable. Matrix regression has been adapted to such data structures and has received increasing attention in recent years. In this paper, we consider estimation of the conditional quantile in high-dimensional regularized matrix regression with a nuclear norm penalty and establish the convergence rate of the estimator. In order to construct a quantile matrix regression estimator in the distributed setting or for massive data sets, we propose a regularized communication-efficient surrogate loss (CSL) function. The proposed CSL method only needs the worker machines to compute the gradient based on local data, while the central machine solves a regularized estimation problem. We prove that the estimation error based on the proposed CSL method matches the estimation error bound of the centralized method that analyzes the entire data set. An alternating direction method of multipliers algorithm is developed to efficiently obtain the distributed CSL estimator. The finite-sample performance of the proposed estimator is studied through simulations and an application to the Beijing Air Quality data set. © 2023 Elsevier B.V. All rights reserved.
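The ingredients named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the generic CSL construction (local loss minus a linear gradient-correction term), uses a plain subgradient of the check loss, and shows singular value soft-thresholding as the nuclear-norm proximal step that an ADMM solver would typically call. All function names, argument names, and the tuning parameter `lam` are hypothetical.

```python
import numpy as np

def check_loss(r, tau):
    """Average quantile check loss rho_tau(r) = r * (tau - 1{r < 0})."""
    return np.mean(r * (tau - (r < 0)))

def predict(X, B):
    """Trace inner products <X_i, B> for X of shape (n, p, q), B of shape (p, q)."""
    return np.einsum("ipq,pq->i", X, B)

def subgradient(X, y, B, tau):
    """A subgradient of the average check loss with respect to B."""
    r = y - predict(X, B)
    w = (r < 0).astype(float) - tau  # d rho_tau(y - <X, B>) / d <X, B>
    return np.einsum("i,ipq->pq", w, X) / len(y)

def csl_loss(B, X1, y1, B_init, global_grad, tau, lam):
    """Assumed surrogate: local loss, minus a linear correction, plus penalty.

    X1, y1      : data held by the central machine
    B_init      : initial estimate at which gradients were evaluated
    global_grad : average of the worker subgradients at B_init
    """
    local = check_loss(y1 - predict(X1, B), tau)
    shift = subgradient(X1, y1, B_init, tau) - global_grad
    return local - np.sum(shift * B) + lam * np.linalg.norm(B, "nuc")

def svt(M, thresh):
    """Singular value soft-thresholding: proximal operator of thresh * ||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt
```

In this sketch each worker would only send its `subgradient(...)` matrix at `B_init` to the central machine, which averages them into `global_grad` and minimizes `csl_loss` over `B`; that one round of gradient exchange is what makes the scheme communication-efficient.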
Pages: 12
Related papers
50 in total
  • [1] Communication-efficient estimation of high-dimensional quantile regression
    Wang, Lei
    Lian, Heng
    ANALYSIS AND APPLICATIONS, 2020, 18 (06) : 1057 - 1075
  • [2] Robust communication-efficient distributed composite quantile regression and variable selection for massive data
    Wang, Kangning
    Li, Shaomin
    Zhang, Benle
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 161
  • [3] Communication-Efficient Modeling with Penalized Quantile Regression for Distributed Data
    Hu, Aijun
    Li, Chujin
    Wu, Jing
    COMPLEXITY, 2021, 2021
  • [4] Communication-efficient sparse composite quantile regression for distributed data
    Yaohong Yang
    Lei Wang
    Metrika, 2023, 86 : 261 - 283
  • [5] Communication-efficient sparse composite quantile regression for distributed data
    Yang, Yaohong
    Wang, Lei
    METRIKA, 2023, 86 (03) : 261 - 283
  • [6] Communication-Efficient Nonparametric Quantile Regression via Random Features
    Wang, Caixing
    Li, Tao
    Zhang, Xinyi
    Feng, Xingdong
    He, Xin
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2024, 33 (04) : 1175 - 1184
  • [7] Communication-Efficient Modeling with Penalized Quantile Regression for Distributed Data
    Hu, Aijun
    Li, Chujin
    Wu, Jing
    Complexity, 2021, 2021
  • [8] Composite quantile regression for massive datasets
    Jiang, Rong
    Hu, Xueping
    Yu, Keming
    Qian, Weimin
    STATISTICS, 2018, 52 (05) : 980 - 1004
  • [9] Communication-efficient estimation and inference for high-dimensional quantile regression based on smoothed decorrelated score
    Di, Fengrui
    Wang, Lei
    Lian, Heng
    STATISTICS IN MEDICINE, 2022, 41 (25) : 5084 - 5101
  • [10] Communication-efficient surrogate quantile regression for non-randomly distributed system
    Wang, Kangning
    Zhang, Benle
    Alenezi, Fayadh
    Li, Shaomin
    INFORMATION SCIENCES, 2022, 588 : 425 - 441