Predicting Survival Outcomes in the Presence of Unlabeled Data

Cited by: 3
Authors
Haredasht, Fateme Nateghi [1 ,2 ,3 ]
Vens, Celine [1 ,2 ,3 ]
Affiliations
[1] Katholieke Univ Leuven, Dept Publ Hlth & Primary Care, Campus KULAK,Etienne Sabbelaan 53, B-8500 Kortrijk, Belgium
[2] IMEC, ITEC, Etienne Sabbelaan 51, B-8500 Kortrijk, Belgium
[3] Katholieke Univ Leuven, Etienne Sabbelaan 51, B-8500 Kortrijk, Belgium
Keywords
Survival analysis; Semi-supervised learning; Random survival forest; Self-training; REGRESSION; MODEL;
DOI
10.1007/s10994-022-06257-x
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many clinical studies require the follow-up of patients over time. This is challenging: apart from frequently observed drop-out, there are often also organizational and financial challenges, which can lead to reduced data collection and, in turn, can complicate subsequent analyses. In contrast, there is often plenty of baseline data available for patients with similar characteristics and background information, e.g., from patients who fall outside the study time window. In this article, we investigate whether we can benefit from the inclusion of such unlabeled data instances to predict accurate survival times. In other words, we introduce a third level of supervision in the context of survival analysis: apart from fully observed and censored instances, we also include unlabeled instances. We propose three approaches to deal with this novel setting and provide an empirical comparison over fifteen real-life clinical and gene expression survival datasets. Our results demonstrate that all approaches are able to increase the predictive performance over independent test data. We also show that integrating the partial supervision provided by censored data in a semi-supervised wrapper approach generally provides the best results, often achieving large improvements compared to not using unlabeled data.
Pages: 4139 - 4157
Number of pages: 19
Related papers
50 in total
  • [1] Predicting Survival Outcomes in the Presence of Unlabeled Data
    Haredasht, Fateme Nateghi
    Vens, Celine
    Machine Learning, 2022, 111 : 4139 - 4157
  • [2] Predicting survival outcomes in ovarian cancer using gene expression data
    Ahn, TaeJin
    Kang, Nayeon
    Kim, Yonggab
    Kim, Se Ik
    Song, Yong-Sang
    Park, Taesung
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2018, 21 (04) : 339 - 351
  • [3] Leveraging Unlabeled Data for Glioma Molecular Subtype and Survival Prediction
    Nuechterlein, Nicholas
    Li, Beibin
    Seyfioglu, Mehmet Saygin
    Mehta, Sachin
    Cimino, Patrick J.
    Shapiro, Linda
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7149 - 7156
  • [4] Predicting with Confidence from Survival Data
    Bostrom, Henrik
    Johansson, Ulf
    Vesterberg, Anders
    CONFORMAL AND PROBABILISTIC PREDICTION AND APPLICATIONS, VOL 105, 2019, 105
  • [5] Predicting Survival Outcomes in Women with Uterine Carcinosarcoma
    Cousins, A.
    Tian, C.
    Casablanca, Y.
    GYNECOLOGIC ONCOLOGY, 2020, 158 (01) : E13 - E14
  • [6] Leveraging Unlabeled Data
    Edwards, Chris
    COMMUNICATIONS OF THE ACM, 2020, 63 (06) : 13 - 14
  • [7] PREDICTING PHASE III SURVIVAL OUTCOMES USING PHASE II TRIAL DATA IN NSCLC AND RC
    Macaulay, R.
    Tan, H.
    VALUE IN HEALTH, 2015, 18 (03) : A192 - A192
  • [8] Vertical Selection in the Presence of Unlabeled Verticals
    Arguello, Jaime
    Diaz, Fernando
    Paiement, Jean-Francois
    SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL, 2010, : 691 - 698
  • [9] BINARY CLASSIFICATION ONLY FROM UNLABELED DATA BY ITERATIVE UNLABELED-UNLABELED CLASSIFICATION
    Kaji, Hirotaka
    Sugiyama, Masashi
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3527 - 3531
  • [10] Benchmarking survival outcomes: A funnel plot for survival data
    Putter, Hein
    Eikema, Dirk-Jan
    de Wreede, Liesbeth C.
    McGrath, Eoin
    Sanchez-Ortega, Isabel
    Saccardi, Riccardo
    Snowden, John A.
    van Zwet, Erik W.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2022, 31 (06) : 1171 - 1183