Learning to Train and to Explain a Deep Survival Model with Large-Scale Ovarian Cancer Transcriptomic Data

Authors
Menand, Elena Spirina [1,2]
De Vries-Brilland, Manon [2,3]
Tessier, Leslie [2 ]
Dauve, Jonathan [2 ]
Campone, Mario [4,5]
Verriele, Veronique [6 ]
Jrad, Nisrine [1 ]
Marion, Jean-Marie [1 ]
Chauvet, Pierre [1 ]
Passot, Christophe [2 ]
Morel, Alain [2,5]
Affiliations
[1] Univ Angers, Lab Angevin Rech Ingn Syst EA7315, F-49035 Angers, France
[2] Inst Cancerol Ouest Nantes Angers, Unite Genom Fonct, F-49055 Angers, France
[3] Inst Cancerol Ouest Nantes Angers, F-49000 Angers, France
[4] Inst Cancerol Ouest Nantes Angers, F-49000 Angers, France
[5] Nantes Univ, Univ Angers, CNRS, Inserm, CRCI2NA, SFR ICAT, F-49000 Angers, France
[6] Inst Cancerol Ouest Nantes Angers, Dept Anat & Cytol Pathol, F-49055 Angers, France
Keywords
TCGA; ovarian cancer; RNA-seq; survival analysis; deep learning; molecular pathways; signatures
DOI
10.3390/biomedicines12122881
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Background/Objectives: Ovarian cancer is a complex disease with poor outcomes that affects women worldwide. The lack of successful therapeutic options for this malignancy has created a need for novel biomarkers for patient stratification. Here, we aim to develop outcome predictors based on gene expression data, as they may help identify categories of patients who are more likely to respond to certain therapies. Methods: We used The Cancer Genome Atlas (TCGA) ovarian cancer transcriptomic data from 372 patients and approximately 16,600 genes to train and evaluate deep learning survival models. In addition, we collected an in-house validation dataset of 12 patients to assess the performance of the trained survival models for direct use in clinical practice. Despite disappointing generalization capabilities, we demonstrated how our model can be interpreted to uncover biological processes associated with survival. We calculated the contributions of the input genes to the output of the best trained model and derived the corresponding molecular pathways. Results: These pathways allowed us to stratify the TCGA patients into high-risk and low-risk groups (p-value 0.025). We validated the stratification ability of the identified pathways on the in-house dataset of 12 patients (p-value 0.229) and on an external clinical and molecular dataset of 274 patients (p-value 0.006). Conclusions: Deep learning-based models for survival prediction with RNA-seq data could be used to detect and interpret the gene sets associated with survival in ovarian cancer patients and open a new avenue for future research.
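The stratification step reported in the abstract (splitting patients into high-risk and low-risk groups and reporting a p-value) can be illustrated with a minimal sketch. The abstract names neither the cutoff nor the statistical test, so the median split and the two-group log-rank test below are assumptions chosen for illustration. The function names and the toy cohort are hypothetical and are not taken from the paper.

```python
import math
from statistics import median


def median_split(risk_scores):
    """Return True for samples whose risk score lies above the cohort median."""
    cutoff = median(risk_scores)
    return [score > cutoff for score in risk_scores]


def logrank_test(times, events, high_risk):
    """Two-group log-rank test. Returns (chi-square statistic, p-value)."""
    o_minus_e = 0.0  # observed minus expected events in the high-risk group
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Patients still at risk at time t, overall and in the high-risk group.
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, high_risk) if x >= t and g)
        # Events occurring exactly at time t, overall and in the high-risk group.
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, high_risk)
                 if x == t and e and g)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = o_minus_e ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Synthetic toy cohort (NOT data from the paper): 12 patients, all events observed.
risk = [0.9, 0.8, 0.85, 0.7, 0.95, 0.75, 0.1, 0.2, 0.3, 0.15, 0.25, 0.05]
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
died = [True] * 12

groups = median_split(risk)
chi2, p_value = logrank_test(months, died, groups)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

In practice the risk scores would come from a trained survival model or a pathway-level score, and censored patients would carry `False` in the event list. A maintained survival analysis library would normally replace the hand-written test.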
Pages: 16