Predicting Survival Outcomes in the Presence of Unlabeled Data

Cited by: 3
Authors
Haredasht, Fateme Nateghi [1, 2, 3]
Vens, Celine [1, 2, 3]
Affiliations
[1] Katholieke Univ Leuven, Dept Publ Hlth & Primary Care, Campus KULAK,Etienne Sabbelaan 53, B-8500 Kortrijk, Belgium
[2] IMEC, ITEC, Etienne Sabbelaan 51, B-8500 Kortrijk, Belgium
[3] Katholieke Univ Leuven, Etienne Sabbelaan 51, B-8500 Kortrijk, Belgium
Keywords
Survival analysis; Semi-supervised learning; Random survival forest; Self-training; REGRESSION; MODEL
DOI
10.1007/s10994-022-06257-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many clinical studies require the follow-up of patients over time. This is challenging: apart from frequently observed drop-out, there are often also organizational and financial challenges, which can lead to reduced data collection and, in turn, can complicate subsequent analyses. In contrast, there is often plenty of baseline data available for patients with similar characteristics and background information, e.g., from patients who fall outside the study time window. In this article, we investigate whether we can benefit from the inclusion of such unlabeled data instances to predict accurate survival times. In other words, we introduce a third level of supervision in the context of survival analysis: apart from fully observed and censored instances, we also include unlabeled instances. We propose three approaches to deal with this novel setting and provide an empirical comparison over fifteen real-life clinical and gene expression survival datasets. Our results demonstrate that all approaches are able to increase the predictive performance over independent test data. We also show that integrating the partial supervision provided by censored data in a semi-supervised wrapper approach generally provides the best results, often achieving high improvements compared to not using unlabeled data.
Pages: 4139 - 4157
Page count: 19
Related papers
50 in total
  • [11] Application of machine learning in predicting survival outcomes involving real-world data: a scoping review
    Huang, Yinan
    Li, Jieni
    Li, Mai
    Aparasu, Rajender R.
    BMC MEDICAL RESEARCH METHODOLOGY, 2023, 23 (01)
  • [12] Survival Topic Models for Predicting Outcomes for Trauma Patients
    Zhang, Yuanyang
    Jiang, Richard
    Petzold, Linda
    2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017), 2017, : 1497 - 1504
  • [13] Application of machine learning in predicting survival outcomes involving real-world data: a scoping review
    Yinan Huang
    Jieni Li
    Mai Li
    Rajender R. Aparasu
    BMC Medical Research Methodology, 23
  • [14] PREDICTING SURVIVAL OUTCOMES FOR ELDERLY PATIENTS STARTING DIALYSIS
    Ekins, Stephanie
    Castledine, Clare
    NEPHROLOGY DIALYSIS TRANSPLANTATION, 2017, 32
  • [15] Predicting Gene Function with Positive and Unlabeled Examples
    Chen, Yiming
    Li, Zhoujun
    Hu, Xiaohua
    Diao, Hongxiang
    Liu, Junwan
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009), 2009, : 89 - +
  • [16] Feature selection for unlabeled data
    Dy, JG
    IEEE INTELLIGENT SYSTEMS, 2005, 20 (06) : 66 - 68
  • [17] Backdoor Cleansing with Unlabeled Data
    Pang, Lu
    Sun, Tao
    Ling, Haibin
    Chen, Chao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 12218 - 12227
  • [18] LogSumExp for Unlabeled Data Processing
    Hu, Taocheng
    Yu, Jinhui
    2017 IEEE/ACIS 15TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH, MANAGEMENT AND APPLICATIONS (SERA), 2017, : 63 - 69
  • [19] Feature Selection for Unlabeled Data
    Chen, Chien-Hsing
    ADVANCES IN SWARM INTELLIGENCE, PT II, 2011, 6729 : 269 - 274
  • [20] Classifying unlabeled data with SVMs
    Tao, Wu
    Zhao Hanqing
    APPLIED SOFT COMPUTING TECHNOLOGIES: THE CHALLENGE OF COMPLEXITY, 2006, 34 : 695 - 702