NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA

Cited by: 55
Authors
Sun, Hokeun [1]
Lin, Wei [2]
Feng, Rui [2]
Li, Hongzhe [2]
Affiliations
[1] Pusan Natl Univ, Dept Stat, Pusan 609735, South Korea
[2] Univ Penn, Perelman Sch Med, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
Keywords
Laplacian penalty; network analysis; regularization; sparsity; survival data; variable selection; weak oracle property; PROPORTIONAL HAZARDS MODEL; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; DANTZIG SELECTOR; ADAPTIVE LASSO; EXPRESSION; METASTASIS; SHRINKAGE;
DOI
10.5705/ss.2012.317
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208; 070103; 0714;
Abstract
We consider estimation and variable selection in high-dimensional Cox regression when prior knowledge of the relationships among the covariates, described by a network or graph, is available. A limitation of the existing methodology for survival analysis with high-dimensional genomic data is that a wealth of structural information about many biological processes, such as regulatory networks and pathways, has often been ignored. To incorporate such prior network information into the analysis of genomic data, we propose a network-based regularization method for high-dimensional Cox regression; it uses an ℓ1-penalty to induce sparsity of the regression coefficients and a quadratic Laplacian penalty to encourage smoothness between the coefficients of neighboring variables on a given network. The proposed method is implemented by an efficient coordinate descent algorithm. In the setting where the dimensionality p can grow exponentially fast with the sample size n, we establish model selection consistency and estimation bounds for the proposed estimators. The theoretical results provide insights into the gain from taking into account the network structural information. Extensive simulation studies indicate that our method outperforms Lasso and elastic net in terms of variable selection accuracy and stability. We apply our method to a breast cancer gene expression study and identify several biologically plausible subnetworks and pathways that are associated with breast cancer distant metastasis.
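As a rough illustration of the kind of objective the abstract describes, the following Python sketch evaluates a Cox negative log partial likelihood together with a combined ℓ1 and Laplacian penalty. The abstract does not give the exact formulation, so everything specific here is an assumption: the function names, the tuning parameters `lam` and `alpha`, the scaling of the two penalty terms, the use of a degree-normalized Laplacian on an unweighted graph, and the absence of tied event times. It is not the authors' implementation and does not include their coordinate descent algorithm.

```python
import math


def neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood, averaged over n.

    Assumes no tied event times (an illustrative simplification).
    """
    n = len(time)
    # Linear predictor x_i' beta for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i in range(n):
        if event[i]:
            # Risk set: subjects still under observation at time[i].
            risk = sum(math.exp(eta[j]) for j in range(n) if time[j] >= time[i])
            total -= eta[i] - math.log(risk)
    return total / n


def network_penalty(beta, edges, lam, alpha):
    """l1 penalty plus a quadratic Laplacian smoothness penalty.

    `edges` is a list of (u, v) index pairs of an unweighted graph.
    The quadratic term equals beta' L beta for the degree-normalized
    Laplacian L (an assumed choice): a sum over edges of
    (beta_u / sqrt(d_u) - beta_v / sqrt(d_v))^2.
    """
    deg = [0] * len(beta)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    l1 = sum(abs(b) for b in beta)
    smooth = sum(
        (beta[u] / math.sqrt(deg[u]) - beta[v] / math.sqrt(deg[v])) ** 2
        for u, v in edges
    )
    # Assumed scaling: alpha trades off sparsity against smoothness.
    return lam * (alpha * l1 + (1 - alpha) * smooth)


def objective(beta, X, time, event, edges, lam, alpha):
    """Penalized objective that an optimizer would minimize over beta."""
    return neg_log_partial_likelihood(beta, X, time, event) + network_penalty(
        beta, edges, lam, alpha
    )
```

Under these assumptions, two linked variables with equal coefficients incur no smoothness cost, while coefficients of opposite sign on the same edge are penalized, which is the sense in which the Laplacian term encourages neighboring coefficients to agree.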
Pages: 1433-1459
Page count: 27
Related papers
50 in total
  • [41] Network-Regularized Sparse Logistic Regression Models for Clinical Risk Prediction and Biomarker Discovery
    Min, Wenwen
    Liu, Juan
    Zhang, Shihua
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018, 15 (03) : 944 - 953
  • [42] Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data
    Gui, J
    Li, HZ
    BIOINFORMATICS, 2005, 21 (13) : 3001 - 3008
  • [43] Integration of pathway structure information into a reweighted partial Cox regression approach for survival analysis on high-dimensional gene expression data
    Liu, Wei
    Wang, Qiuyu
    Zhao, Jianmei
    Zhang, Chunlong
    Liu, Yuejuan
    Zhang, Jian
    Bai, Xuefeng
    Li, Xuecang
    Feng, Houming
    Liao, Mingzhi
    Wang, Wei
    Li, Chunquan
    MOLECULAR BIOSYSTEMS, 2015, 11 (07) : 1876 - 1886
  • [44] Subgroup analysis for high-dimensional functional regression
    Zhang, Xiaochen
    Zhang, Qingzhao
    Ma, Shuangge
    Fang, Kuangnan
    JOURNAL OF MULTIVARIATE ANALYSIS, 2022, 192
  • [45] High-dimensional regression analysis with treatment comparisons
    Heng-Hui Lue
    Bing-Ran You
    Computational Statistics, 2013, 28 : 1299 - 1317
  • [46] High-dimensional regression analysis with treatment comparisons
    Lue, Heng-Hui
    You, Bing-Ran
    COMPUTATIONAL STATISTICS, 2013, 28 (03) : 1299 - 1317
  • [47] ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data
    Yang, Kaixu
    Ganguli, Arkaprabha
    Maiti, Tapabrata
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [48] Robust high-dimensional regression for data with anomalous responses
    Mingyang Ren
    Sanguo Zhang
    Qingzhao Zhang
    Annals of the Institute of Statistical Mathematics, 2021, 73 : 703 - 736
  • [49] Quantile forward regression for high-dimensional survival data
    Lee, Eun Ryung
    Park, Seyoung
    Lee, Sang Kyu
    Hong, Hyokyoung G.
    LIFETIME DATA ANALYSIS, 2023, 29 (04) : 769 - 806
  • [50] Robust high-dimensional regression for data with anomalous responses
    Ren, Mingyang
    Zhang, Sanguo
    Zhang, Qingzhao
    ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2021, 73 (04) : 703 - 736