Predicting second breast cancer among women with primary breast cancer using machine learning algorithms, a population-based observational study

Cited by: 3
Authors
Syleouni, Maria-Eleni [1, 2]
Karavasiloglou, Nena [1, 3]
Manduchi, Laura [4]
Wanner, Miriam [2]
Korol, Dimitri [2]
Ortelli, Laura [5]
Bordoni, Andrea [5]
Rohrmann, Sabine [1, 2, 6]
Affiliations
[1] Univ Zurich, Epidemiol Biostat & Prevent Inst, Div Chron Dis Epidemiol, Zurich, Switzerland
[2] Univ Hosp Zurich, Canc Registry Zurich Zug Schaffhausen & Schwyz, Zurich, Switzerland
[3] European Food Safety Author, Parma, Italy
[4] Swiss Fed Inst Technol, Med Data Sci, Zurich, Switzerland
[5] Ticino Canc Registry, Publ Hlth Div Canton Ticino, Locarno, Switzerland
[6] Univ Zurich, Epidemiol Biostat & Prevent Inst, Hirschengraben 84, CH-8001 Zurich, Switzerland
Keywords
breast cancer; cancer registry; machine learning; prediction; second cancer; RISK-FACTORS; LOCAL RECURRENCE; PROGNOSIS
DOI
10.1002/ijc.34568
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Breast cancer survivors often experience recurrence or a second primary cancer. We developed an automated approach to predict the occurrence of any second breast cancer (SBC) using patient-level data and explored the generalizability of the models with an external validation data source. Breast cancer patients from the cancer registry of Zurich, Zug, Schaffhausen, and Schwyz (N = 3213; training dataset) and the cancer registry of Ticino (N = 1073; external validation dataset), diagnosed between 2010 and 2018, were used for model training and validation, respectively. Machine learning (ML) methods, namely a feed-forward neural network (ANN), logistic regression, and extreme gradient boosting (XGB), were employed for classification. The best-performing model was selected based on the receiver operating characteristic (ROC) curve. Key characteristics contributing to a high SBC risk were identified. SBC was diagnosed in 6% of all cases. The most important features for SBC prediction were age at incidence, year of birth, stage, and extent of the pathological primary tumor. The ANN model had the highest area under the ROC curve, with 0.78 (95% confidence interval [CI] 0.75-0.82) in the training data and 0.70 (95% CI 0.61-0.79) in the external validation data. Investigating the generalizability of different ML algorithms, we found that the ANN generalized better than the other models on the external validation data. This research is a first step towards the development of an automated tool that could assist clinicians in identifying women at high risk of developing an SBC and potentially preventing it.
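The abstract compares models by the area under the ROC curve and reports 95% confidence intervals. The sketch below illustrates how such a figure can be computed from predicted risk scores. It is not the authors' code. The function names `roc_auc` and `bootstrap_auc_ci`, the toy data, and the choice of a percentile bootstrap for the interval are all assumptions made for illustration, since the abstract does not say how the intervals were obtained.

```python
import random


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) statistic; tied scores share an average rank."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the block of tied scores starting at index i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[k] for k in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical toy data: 1 = second breast cancer, 0 = none.
    y = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
    risk = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.28, 0.40, 0.55, 0.70]
    print("AUC:", roc_auc(y, risk))
    print("95% CI:", bootstrap_auc_ci(y, risk, n_boot=500))
```

Under this reading, each candidate model would be scored once on the training registry and once on the external registry, and the two AUC values compared to judge how well the model generalizes.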
Pages: 932-941
Number of pages: 10