Learning to Train and to Explain a Deep Survival Model with Large-Scale Ovarian Cancer Transcriptomic Data

Cited by: 0

Authors:

Menand, Elena Spirina ^{[1,2]}

De Vries-Brilland, Manon ^{[2,3]}

Tessier, Leslie ^{[2]}

Dauve, Jonathan ^{[2]}

Campone, Mario ^{[4,5]}

Verriele, Veronique ^{[6]}

Jrad, Nisrine ^{[1]}

Marion, Jean-Marie ^{[1]}

Chauvet, Pierre ^{[1]}

Passot, Christophe ^{[2]}

Morel, Alain ^{[2,5]}

Affiliations:

[1] Univ Angers, Lab Angevin Rech Ingn Syst EA7315, F-49035 Angers, France

[2] Inst Cancerol Ouest Nantes Angers, Unite Genom Fonct, F-49055 Angers, France

[3] Inst Cancerol Ouest Nantes Angers, F-49000 Angers, France

[4] Inst Cancerol Ouest Nantes Angers, F-49000 Angers, France

[5] Nantes Univ, Univ Angers, CNRS, Inserm,CRCI2NA,SFR ICAT, F-49000 Angers, France

[6] Inst Cancerol Ouest Nantes Angers, Dept Anat & Cytol Pathol, F-49055 Angers, France

Source:

BIOMEDICINES | 2024 / Vol. 12 / Issue 12

Keywords:

TCGA; ovarian cancer; RNA-seq; survival analysis; deep learning; molecular pathways; signatures

DOI:

10.3390/biomedicines12122881

Chinese Library Classification (CLC):

Q5 [Biochemistry]; Q7 [Molecular Biology]

Discipline classification codes:

071010; 081704

Abstract:

Background/Objectives: Ovarian cancer is a complex disease with poor outcomes that affects women worldwide. The lack of successful therapeutic options for this malignancy has led to the need to identify novel biomarkers for patient stratification. Here, we aim to develop the outcome predictors based on the gene expression data as they may serve to identify categories of patients who are more likely to respond to certain therapies. Methods: We used The Cancer Genome Atlas (TCGA) ovarian cancer transcriptomic data from 372 patients and approximately 16,600 genes to train and evaluate the deep learning survival models. In addition, we collected an in-house validation dataset of 12 patients to assess the performance of the trained survival models for their direct use in clinical practice. Despite deceptive generalization capabilities, we demonstrated how our model can be interpreted to uncover biological processes associated with survival. We calculated the contributions of the input genes to the output of the best trained model and derived the corresponding molecular pathways. Results: These pathways allowed us to stratify the TCGA patients into high-risk and low-risk groups (p-value 0.025). We validated the stratification ability of the identified pathways on the in-house dataset consisting of 12 patients (p-value 0.229) and on the external clinical and molecular dataset consisting of 274 patients (p-value 0.006). Conclusions: The deep learning-based models for survival prediction with RNA-seq data could be used to detect and interpret the gene-sets associated with survival in ovarian cancer patients and open a new avenue for future research.

Pages: 16
