Predicting second breast cancer among women with primary breast cancer using machine learning algorithms, a population-based observational study

Cited by: 3
Authors
Syleouni, Maria-Eleni [1 ,2 ]
Karavasiloglou, Nena [1 ,3 ]
Manduchi, Laura [4 ]
Wanner, Miriam [2 ]
Korol, Dimitri [2 ]
Ortelli, Laura [5 ]
Bordoni, Andrea [5 ]
Rohrmann, Sabine [1 ,2 ,6 ]
Affiliations
[1] Univ Zurich, Epidemiol Biostat & Prevent Inst, Div Chron Dis Epidemiol, Zurich, Switzerland
[2] Univ Hosp Zurich, Canc Registry Zurich Zug Schaffhausen & Schwyz, Zurich, Switzerland
[3] European Food Safety Author, Parma, Italy
[4] Swiss Fed Inst Technol, Med Data Sci, Zurich, Switzerland
[5] Ticino Canc Registry, Publ Hlth Div Canton Ticino, Locarno, Switzerland
[6] Univ Zurich, Epidemiol Biostat & Prevent Inst, Hirschengraben 84, CH-8001 Zurich, Switzerland
Keywords
breast cancer; cancer registry; machine learning; prediction; second cancer; RISK-FACTORS; LOCAL RECURRENCE; PROGNOSIS
DOI
10.1002/ijc.34568
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Breast cancer survivors often experience recurrence or a second primary cancer. We developed an automated approach to predict the occurrence of any second breast cancer (SBC) using patient-level data and explored the generalizability of the models with an external validation data source. Breast cancer patients from the cancer registry of Zurich, Zug, Schaffhausen, Schwyz (N = 3213; training dataset) and the cancer registry of Ticino (N = 1073; external validation dataset), diagnosed between 2010 and 2018, were used for model training and validation, respectively. Machine learning (ML) methods, namely a feed-forward neural network (ANN), logistic regression, and extreme gradient boosting (XGB) were employed for classification. The best-performing model was selected based on the receiver operating characteristic (ROC) curve. Key characteristics contributing to a high SBC risk were identified. SBC was diagnosed in 6% of all cases. The most important features for SBC prediction were age at incidence, year of birth, stage, and extent of the pathological primary tumor. The ANN model had the highest area under the ROC curve with 0.78 (95% confidence interval [CI] 0.75-0.82) in the training data and 0.70 (95% CI 0.61-0.79) in the external validation data. Investigating the generalizability of different ML algorithms, we found that the ANN generalized better than the other models on the external validation data. This research is a first step towards the development of an automated tool that could assist clinicians in the identification of women at high risk of developing an SBC and potentially preventing it.
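The abstract reports model performance as an area under the ROC curve with a 95% confidence interval, once on the training data and once on the external validation data. The sketch below shows one way such a figure could be computed. It is not the authors' code: the abstract does not say how the intervals were obtained, so the percentile bootstrap here is an assumption, and the helper names (`roc_auc`, `bootstrap_ci`, `simulate`) and the simulated risk scores are hypothetical. Only the cohort sizes (3213 and 1073) and the roughly 6% SBC prevalence are taken from the abstract.

```python
import random


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case of each class")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def simulate(n, prevalence, shift, seed):
    """Hypothetical stand-in for model output: noisy scores, higher for cases."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    scores = [rng.gauss(shift * y, 1.0) for y in labels]
    return labels, scores


# Simulated cohorts sized like the training and external validation sets.
for name, n, shift, seed in [("training", 3213, 1.1, 1), ("external", 1073, 0.8, 2)]:
    y, s = simulate(n, prevalence=0.06, shift=shift, seed=seed)
    low, high = bootstrap_ci(y, s)
    print(f"{name}: AUC={roc_auc(y, s):.2f} (95% CI {low:.2f}-{high:.2f})")
```

With only about 6% positive cases, the smaller external cohort contains far fewer SBC events than the training cohort, which is consistent with the wider interval the abstract reports for external validation.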
Pages: 932-941
Number of pages: 10