NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA

Cited by: 55
Authors
Sun, Hokeun [1]
Lin, Wei [2]
Feng, Rui [2]
Li, Hongzhe [2]
Affiliations
[1] Pusan Natl Univ, Dept Stat, Pusan 609735, South Korea
[2] Univ Penn, Perelman Sch Med, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
Keywords
Laplacian penalty; network analysis; regularization; sparsity; survival data; variable selection; weak oracle property; PROPORTIONAL HAZARDS MODEL; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; DANTZIG SELECTOR; ADAPTIVE LASSO; EXPRESSION; METASTASIS; SHRINKAGE
DOI
10.5705/ss.2012.317
Chinese Library Classification number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We consider estimation and variable selection in high-dimensional Cox regression when prior knowledge of the relationships among the covariates, described by a network or graph, is available. A limitation of existing methodology for survival analysis with high-dimensional genomic data is that the wealth of structural information about many biological processes, such as regulatory networks and pathways, has often been ignored. To incorporate such prior network information into the analysis of genomic data, we propose a network-based regularization method for high-dimensional Cox regression; it uses an ℓ1-penalty to induce sparsity of the regression coefficients and a quadratic Laplacian penalty to encourage smoothness between the coefficients of neighboring variables on a given network. The proposed method is implemented by an efficient coordinate descent algorithm. In the setting where the dimensionality p can grow exponentially fast with the sample size n, we establish model selection consistency and estimation bounds for the proposed estimators. The theoretical results provide insights into the gain from taking the network structural information into account. Extensive simulation studies indicate that our method outperforms Lasso and elastic net in terms of variable selection accuracy and stability. We apply our method to a breast cancer gene expression study and identify several biologically plausible subnetworks and pathways that are associated with distant metastasis of breast cancer.
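As a rough illustration of the two penalty terms the abstract describes, the sketch below evaluates a sparsity term plus a quadratic smoothness term over network edges. It assumes the unnormalized graph Laplacian, for which the quadratic form equals a weighted sum of squared coefficient differences across edges; the exact Laplacian, weights, and scaling used in the paper may differ. The function name `network_penalty`, the edge-list format, and the tuning parameters `lam1` and `lam2` are hypothetical and chosen only for this example.

```python
def network_penalty(beta, edges, lam1, lam2):
    """Illustrative penalty: lam1 * ||beta||_1 + lam2 * beta' L beta.

    Assumption (not from the abstract): L is the unnormalized Laplacian
    of the network, so beta' L beta = sum over edges of w * (b_i - b_j)^2.
    `edges` is a list of (i, j, weight) tuples, one per undirected edge.
    """
    # Sparsity term: sum of absolute coefficient values.
    l1_term = lam1 * sum(abs(b) for b in beta)
    # Smoothness term: penalizes differences between linked coefficients.
    smooth_term = lam2 * sum(w * (beta[i] - beta[j]) ** 2 for i, j, w in edges)
    return l1_term + smooth_term


# Toy path network 0 - 1 - 2 with unit edge weights.
path_edges = [(0, 1, 1.0), (1, 2, 1.0)]

# Equal coefficients on neighboring nodes incur no smoothness penalty.
smooth_value = network_penalty([1.0, 1.0, 1.0], path_edges, lam1=0.0, lam2=1.0)

# Coefficients that alternate in sign across edges are penalized.
rough_value = network_penalty([1.0, -1.0, 1.0], path_edges, lam1=0.0, lam2=1.0)
```

In this toy case the smoothness term vanishes when linked coefficients agree and grows with their squared difference, which is the behavior the abstract attributes to the Laplacian penalty. The sketch covers only the penalty; the Cox partial likelihood and the coordinate descent fitting procedure are not shown.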
Pages: 1433-1459
Page count: 27