Efficient and doubly-robust methods for variable selection and parameter estimation in longitudinal data analysis

Cited by: 1
Authors
Fu, Liya [1]
Yang, Zhuoran [1]
Cai, Fengjing [2]
Wang, You-Gan [3]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Math & Stat, Xian, Peoples R China
[2] Wenzhou Univ, Coll Math, Wenzhou, Peoples R China
[3] Queensland Univ Technol, Sch Math Sci, Brisbane, Qld, Australia
Funding
Australian Research Council; U.S. National Science Foundation
Keywords
Correlated data; Outliers; Rank-based method; Variable selection; GENERALIZED ESTIMATING EQUATIONS; NONCONCAVE PENALIZED LIKELIHOOD; DIVERGING NUMBER; RANK REGRESSION; MODEL SELECTION; MIXED MODELS; DISPERSION;
DOI
10.1007/s00180-020-01038-3
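Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes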
020208 ; 070103 ; 0714 ;
Abstract
New technologies have produced increasingly complex and massive datasets, such as next-generation sequencing and microarray data in biology, dynamic treatment regimes in clinical trials, and long-term wide-scale studies in the social sciences. Each study exhibits its own data structure within individuals, clusters, and possibly across time and space. To draw valid conclusions from such high-dimensional data, we must account for intracluster correlations, varying cluster sizes, and outliers in the response and/or covariate domains. A weighted rank-based method is proposed for selecting variables and estimating parameters simultaneously. The main contribution of the proposed method is fourfold: (1) variable selection using the adaptive lasso is extended to robust rank regression, so that protection against outliers in both response and predictor variables is obtained; (2) within-subject correlations are incorporated, so that the efficiency of parameter estimation is improved; (3) the computation is convenient via an existing function in the statistical software R; (4) the proposed method is proved to have desirable asymptotic properties for a fixed number of covariates (p). Simulation studies are carried out to evaluate the proposed method in a number of scenarios, including cases where p equals the number of subjects. The simulation results indicate that the proposed method is efficient and robust. A hormone dataset is analyzed for illustration. By adding redundant variables as covariates, the penalty approach and weighting schemes are shown to be effective.
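The abstract gives no formulas, so the following is only a rough sketch of the general idea in point (1): an adaptive-lasso penalty added to a Wilcoxon-type rank regression loss, solved as a weighted L1 problem. It is not the authors' implementation. It assumes independent observations with unit pair weights (the within-subject weighting of point (2) is omitted), uses SciPy's `linprog` rather than R, and the function names, the tuning value `lam=10.0`, and the simulated data are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def weighted_l1_fit(D, r, w):
    """Minimise sum_k w[k] * |r[k] - D[k] @ beta| via a linear program."""
    m, p = D.shape
    # Variables: beta (p, free) followed by u (m, >= 0) with u[k] >= |residual k|.
    c = np.concatenate([np.zeros(p), w])
    I = np.eye(m)
    A_ub = np.block([[-D, -I], [D, -I]])
    b_ub = np.concatenate([-r, r])
    bounds = [(None, None)] * p + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]


def adaptive_lasso_rank(X, y, lam):
    """Rank regression on pairwise differences with an adaptive-lasso penalty.

    Assumption: unit weights on all pairs; the intercept cancels in differences.
    """
    n, p = X.shape
    i, j = np.triu_indices(n, k=1)
    D, r = X[i] - X[j], y[i] - y[j]
    w = np.ones(len(r))
    # Unpenalised rank estimate, used only to build the adaptive weights.
    beta_init = weighted_l1_fit(D, r, w)
    pen = lam / (np.abs(beta_init) + 1e-6)
    # The penalty lam * |beta_j| / |beta_init_j| enters as p pseudo-observations
    # with response 0, design row e_j, and weight pen[j].
    D_aug = np.vstack([D, np.eye(p)])
    r_aug = np.concatenate([r, np.zeros(p)])
    w_aug = np.concatenate([w, pen])
    return weighted_l1_fit(D_aug, r_aug, w_aug)


# Hypothetical simulated data: two active covariates, two redundant ones,
# and three gross outliers in the response.
rng = np.random.default_rng(0)
n, p = 40, 4
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0])
y = 1.0 + X @ beta_true + 0.3 * rng.normal(size=n)
y[:3] += 15.0
beta_hat = adaptive_lasso_rank(X, y, lam=10.0)
print(np.round(beta_hat, 3))
```

Writing the penalty as extra pseudo-observations is what lets a single L1 solver handle both loss and penalty, which is one plausible reading of the abstract's remark that existing software suffices. The pairwise formulation grows quadratically with the number of observations, so this dense version is only suitable for small examples.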
Pages: 781-804
Number of pages: 24
Related papers
50 in total
  • [21] Efficient and doubly robust estimation in covariate-missing data problems
    Zhang, Biao
    JOURNAL OF STATISTICS & MANAGEMENT SYSTEMS, 2015, 18 (03) : 213 - 250
  • [22] Doubly robust estimation of generalized partial linear models for longitudinal data with dropouts
    Lin, Huiming
    Fu, Bo
    Qin, Guoyou
    Zhu, Zhongyi
    BIOMETRICS, 2017, 73 (04) : 1132 - 1139
  • [23] Estimation After Parameter Selection: Performance Analysis and Estimation Methods
    Routtenberg, Tirza
    Tong, Lang
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2016, 64 (20) : 5268 - 5281
  • [24] Improving Neural Network Generalization on Data-limited Regression with Doubly-Robust Boosting
    Wang, Hao
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 18, 2024, : 20821 - 20829
  • [25] Robust and smoothing variable selection for quantile regression models with longitudinal data
    Fu, Z. C.
    Fu, L. Y.
    Song, Y. N.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2023, 93 (15) : 2600 - 2624
  • [26] Robust variable selection in semiparametric mixed effects longitudinal data models
    Sun, Huihui
    Liu, Qiang
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2024, 53 (03) : 1049 - 1064
  • [27] PARAMETER SELECTION AND EFFICIENT DATA-PROCESSING METHODS IN A VARIABLE-BASE ANTENNA TESTING RANGE
    ASATRYAN, DG
    TATEVOSYAN, LA
    MEASUREMENT TECHNIQUES USSR, 1991, 34 (02) : 197 - 200
  • [28] Doubly robust estimation, optimally truncated inverse-intensity weighting and increment-based methods for the analysis of irregularly observed longitudinal data
    Pullenayegum, Eleanor M.
    Feldman, Brian M.
    STATISTICS IN MEDICINE, 2013, 32 (06) : 1054 - 1072
  • [29] Variable Selection Methods in Spectral Data Analysis
    Li Yan-kun
    Dong Ru-nan
    Zhang Jin
    Huang Ke-nan
    Mao Zhi-yi
    SPECTROSCOPY AND SPECTRAL ANALYSIS, 2021, 41 (11) : 3331 - 3338
  • [30] Variable selection in semiparametric regression analysis for longitudinal data
    Peixin Zhao
    Liugen Xue
    Annals of the Institute of Statistical Mathematics, 2012, 64 : 213 - 231