A novel non-negative Bayesian stacking modeling method for Cancer survival prediction using high-dimensional omics data

Times cited: 0
Authors
Shen, Junjie [1]
Wang, Shuo [2]
Sun, Hao [1]
Huang, Jie [1]
Bai, Lu [1]
Wang, Xichao [1]
Dong, Yongfei [1]
Tang, Zaixiang [1]
Affiliations
[1] Soochow Univ, Sch Publ Hlth, Jiangsu Key Lab Prevent & Translat Med Major Chron, Dept Biostat,Suzhou Med Coll, Suzhou 215123, Jiangsu, Peoples R China
[2] Univ Freiburg, Inst Med Biometry & Stat, Fac Med & Med Ctr, D-79085 Freiburg, Germany
Funding
National Natural Science Foundation of China
Keywords
Survival stacking; Non-negative Bayesian model; Artificial neural network; GENERALIZED LINEAR-MODELS; REGULARIZATION PATHS; REGRESSION SHRINKAGE; HUNTINGTON-DISEASE; SELECTION; LASSO; GENES
DOI
10.1186/s12874-024-02232-3
Chinese Library Classification (CLC) number
R19 [Health organizations and services (health services administration)]
Abstract
Background: Survival prediction using high-dimensional molecular data is an active research topic in genomics and precision medicine, especially in cancer studies. Because carcinogenesis has a pathway-based pathogenesis, models built on such group structures more closely mirror disease progression and prognosis. Many approaches can integrate group information; however, most of them are single-model methods, which may lead to unstable predictions.

Methods: We introduced a novel survival stacking method that uses group structure information to improve the robustness of cancer survival prediction with high-dimensional omics data. With a super learner, survival stacking combines the predictions from multiple sub-models that are independently trained on the features in pre-grouped biological pathways. In addition to a non-negative linear combination of sub-models, we extended the super learner to a non-negative Bayesian hierarchical generalized linear model and an artificial neural network (ANN). We compared the proposed modeling strategy with the widely used penalized survival method Lasso Cox and with several group penalized methods, e.g., group Lasso Cox, in a simulation study and a real-world data application.

Results: The proposed survival stacking method showed superior and robust discrimination compared with single-model methods on high-noise simulated data and on real-world data. The non-negative Bayesian stacking method can identify important biological signaling pathways and genes that are associated with cancer prognosis.

Conclusions: This study proposed a novel survival stacking strategy that incorporates biological group information into cancer prognosis models. Additionally, it extended the super learner to a non-negative Bayesian model and an ANN, enriching the ways in which sub-models can be combined. The proposed Bayesian stacking strategy exhibited favorable properties in the prediction and interpretation of complex survival data, which may aid in discovering cancer targets.
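
The non-negative linear combination of sub-models described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each pathway-level sub-model has already produced one risk score per patient (in practice these would typically be cross-validated predictions), and it fits the stacking weights by projected gradient descent on the Cox partial likelihood, a simple stand-in for the Bayesian hierarchical model and the ANN mentioned in the abstract. All function names, the simulated data, and the step-size settings are hypothetical.

```python
import math
import random


def linear_predictor(weights, row):
    # Stacked risk score: weighted sum of the sub-model predictions.
    return sum(w * p for w, p in zip(weights, row))


def neg_log_partial_likelihood(weights, preds, times, events):
    # Cox negative log partial likelihood (assumes no tied event times).
    eta = [linear_predictor(weights, row) for row in preds]
    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i:
            risk_sum = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            nll -= eta[i] - math.log(risk_sum)
    return nll


def gradient(weights, preds, times, events):
    # Gradient of the negative log partial likelihood w.r.t. the weights.
    eta = [linear_predictor(weights, row) for row in preds]
    grad = [0.0] * len(weights)
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        denom = sum(math.exp(eta[j]) for j in risk_set)
        for m in range(len(weights)):
            mean_m = sum(math.exp(eta[j]) * preds[j][m] for j in risk_set) / denom
            grad[m] -= preds[i][m] - mean_m
    return grad


def fit_nonneg_stacking(preds, times, events, lr=0.2, n_iter=200):
    # Projected gradient descent: take a gradient step, then clip at zero
    # so that every sub-model weight stays non-negative.
    weights = [0.0] * len(preds[0])
    n_events = max(1, sum(events))
    for _ in range(n_iter):
        grad = gradient(weights, preds, times, events)
        weights = [max(0.0, w - lr * g / n_events) for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy data: 3 sub-model risk scores per patient, all in [-1, 1].
# Only the first score drives the hazard; the other two are pure noise.
rng = random.Random(0)
preds, times, events = [], [], []
for _ in range(40):
    signal = rng.uniform(-1.0, 1.0)
    preds.append([signal, rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)])
    times.append(rng.expovariate(math.exp(signal)))  # higher score, shorter survival
    events.append(1 if rng.random() < 0.8 else 0)    # roughly 20% marked as censored

weights = fit_nonneg_stacking(preds, times, events)
nll_zero = neg_log_partial_likelihood([0.0, 0.0, 0.0], preds, times, events)
nll_fit = neg_log_partial_likelihood(weights, preds, times, events)
print("stacking weights:", [round(w, 3) for w in weights])
print("negative log partial likelihood at zero / fitted:", round(nll_zero, 3), round(nll_fit, 3))
```

Clipping at zero after each step is what keeps the combination non-negative: a sub-model whose score does not help, or points in the wrong direction, is pushed to a weight of exactly zero rather than receiving a negative coefficient.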
Pages: 12