DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

Cited by: 14
Authors
Lan, Wei [1]
Liao, Haibo [2]
Chen, Qingfeng [3]
Zhu, Lingzhi [4]
Pan, Yi [5]
Chen, Yi-Ping Phoebe [6]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning, Peoples R China
[2] Guangxi Univ, Comp Technol, Nanning, Peoples R China
[3] Guangxi Univ, State Key Lab Conservat & Utilizat Subtrop Agrobio, Nanning, Peoples R China
[4] Hunan Inst Technol, Sch Comp & Informat Sci, Hengyang 421002, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, Sch Comp Sci & Control Engn, Shenzhen, Peoples R China
[6] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic, Australia
Funding
National Natural Science Foundation of China;
Keywords
cancer recurrence prediction; interpretability of deep learning; self-attention mechanism; multi-omics data integration; HEPATOCELLULAR-CARCINOMA; BLADDER-CANCER; SIGNALING PATHWAY; PROLIFERATION; ACTIVATION; SURVIVAL;
DOI
10.1093/bib/bbae185
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
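The two architectural ideas named in the abstract, pathway-restricted local connections and self-attention across samples, can be illustrated with a minimal pure-Python sketch. This is not the DeepKEGG implementation (see the repository linked above for that). The function names, the toy mask, the weights and the sample values are all hypothetical, and the attention step omits learned query/key/value projections for brevity. The attribution-based importance step is not shown.

```python
import math

def masked_pathway_layer(x, weights, mask):
    """Map gene-level values x (length G) to pathway activations (length P).

    weights and mask are P x G. mask[p][g] == 1 only if gene g is annotated
    to pathway p, so each pathway node is connected to its member genes only.
    """
    out = []
    for w_row, m_row in zip(weights, mask):
        s = sum(w * m * v for w, m, v in zip(w_row, m_row, x))
        out.append(math.tanh(s))
    return out

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_self_attention(features):
    """Scaled dot-product attention where each sample attends to all samples.

    features is N x P (samples by pathway features). The raw features act as
    queries, keys and values here; a real model would learn projections.
    """
    dim = len(features[0])
    attended = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in features]
        attn = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(attn, features))
                         for j in range(dim)])
    return attended

# Hypothetical toy setup: 3 genes, 2 pathways, 2 samples.
mask = [[1, 1, 0],   # pathway 0 contains genes 0 and 1
        [0, 0, 1]]   # pathway 1 contains gene 2
weights = [[0.5, -0.3, 0.8],
           [0.2, 0.4, -0.6]]
samples = [[1.0, 2.0, 3.0],
           [0.5, 0.0, -1.0]]

pathway_features = [masked_pathway_layer(s, weights, mask) for s in samples]
attended = sample_self_attention(pathway_features)
```

Because masked weights contribute nothing, changing a gene outside a pathway leaves that pathway's activation unchanged, which is what makes the layer readable in terms of known gene-pathway membership.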
Pages: 16