A novel non-negative Bayesian stacking modeling method for Cancer survival prediction using high-dimensional omics data

Authors
Shen, Junjie [1]
Wang, Shuo [2]
Sun, Hao [1]
Huang, Jie [1]
Bai, Lu [1]
Wang, Xichao [1]
Dong, Yongfei [1]
Tang, Zaixiang [1]
Affiliations
[1] Soochow Univ, Sch Publ Hlth, Jiangsu Key Lab Prevent & Translat Med Major Chron, Dept Biostat, Suzhou Med Coll, Suzhou 215123, Jiangsu, Peoples R China
[2] Univ Freiburg, Inst Med Biometry & Stat, Fac Med & Med Ctr, D-79085 Freiburg, Germany
Funding
National Natural Science Foundation of China
Keywords
Survival stacking; Non-negative Bayesian model; Artificial neural network; GENERALIZED LINEAR-MODELS; REGULARIZATION PATHS; REGRESSION SHRINKAGE; HUNTINGTON-DISEASE; SELECTION; LASSO; GENES;
DOI
10.1186/s12874-024-02232-3
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Subject classification code
Abstract
Background: Survival prediction from high-dimensional molecular data is an active research topic in genomics and precision medicine, especially in cancer studies. Because carcinogenesis has a pathway-based pathogenesis, models that use such group structures more closely mimic disease progression and prognosis. Many approaches can integrate group information; however, most of them are single-model methods, which may lead to unstable predictions.

Methods: We introduced a novel survival stacking method that uses group structure information to improve the robustness of cancer survival prediction with high-dimensional omics data. Using a super learner, survival stacking combines the predictions of multiple sub-models, each trained independently on the features in a pre-grouped biological pathway. In addition to a non-negative linear combination of sub-models, we extended the super learner to a non-negative Bayesian hierarchical generalized linear model and an artificial neural network (ANN). We compared the proposed modeling strategy with the widely used penalized survival method Lasso Cox and with several group penalized methods, e.g., group Lasso Cox, in a simulation study and a real-world data application.

Results: The proposed survival stacking method showed superior and robust discrimination compared with single-model methods on high-noise simulated data and on real-world data. The non-negative Bayesian stacking method can identify important biological signaling pathways and genes that are associated with cancer prognosis.

Conclusions: This study proposed a novel survival stacking strategy that incorporates biological group information into cancer prognosis models. It also extended the super learner to a non-negative Bayesian model and an ANN, enriching the ways in which sub-models can be combined. The proposed Bayesian stacking strategy exhibited favorable properties in the prediction and interpretation of complex survival data, which may aid in discovering cancer targets.
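
The non-negative linear combination step described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the super learner is fitted, so the sketch assumes that each column of a hypothetical matrix `Z` already holds the cross-validated risk score from one pathway sub-model, and it estimates the combination weights by projected gradient ascent on the Cox partial likelihood with the weights clipped at zero. All function names, the toy data, and the optimizer are assumptions made for illustration. The Bayesian hierarchical and ANN super learners are not covered.

```python
import math
import random


def cox_loglik_and_grad(w, Z, time, event):
    """Cox partial log-likelihood (no special handling of ties) and its
    gradient with respect to the stacking weights w."""
    n, k = len(Z), len(w)
    eta = [sum(wj * zj for wj, zj in zip(w, row)) for row in Z]
    loglik, grad = 0.0, [0.0] * k
    for i in range(n):
        if not event[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j in range(n) if time[j] >= time[i]]
        m = max(eta[j] for j in risk)  # shift for numerical stability
        ex = {j: math.exp(eta[j] - m) for j in risk}
        denom = sum(ex.values())
        loglik += eta[i] - (m + math.log(denom))
        for c in range(k):
            mean_c = sum(ex[j] * Z[j][c] for j in risk) / denom
            grad[c] += Z[i][c] - mean_c
    return loglik, grad


def fit_nonneg_stacking(Z, time, event, n_iter=100, step=0.1):
    """Projected gradient ascent: take a gradient step, clip the weights at
    zero, and halve the step whenever the likelihood would decrease."""
    w = [0.0] * len(Z[0])
    ll, grad = cox_loglik_and_grad(w, Z, time, event)
    for _ in range(n_iter):
        cand = [max(0.0, wj + step * gj) for wj, gj in zip(w, grad)]
        cand_ll, cand_grad = cox_loglik_and_grad(cand, Z, time, event)
        if cand_ll >= ll:
            w, ll, grad = cand, cand_ll, cand_grad
        else:
            step /= 2.0
    return w, ll


# Toy data (purely hypothetical): three sub-model risk scores per subject,
# with the hazard depending on the first two scores only.
rng = random.Random(0)
n = 60
Z = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
true_eta = [0.8 * z[0] + 0.5 * z[1] for z in Z]
time = [rng.expovariate(math.exp(e)) for e in true_eta]
event = [rng.random() < 0.7 for _ in range(n)]  # about 30% marked censored

weights, ll = fit_nonneg_stacking(Z, time, event)
ll0, _ = cox_loglik_and_grad([0.0] * 3, Z, time, event)
print("non-negative stacking weights:", weights)
print("partial log-likelihood gain over zero weights:", ll - ll0)
```

Constraining the weights to be non-negative means a sub-model can only add to the combined risk score or be dropped with a weight of zero, which is what makes the fitted weights readable as pathway contributions.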
Pages: 12