Construction of a diagnostic classifier for cervical intraepithelial neoplasia and cervical cancer based on XGBoost feature selection and random forest model

Cited by: 3
Authors
Zhang, Jing [1]
Yang, Xiuqing [1]
Chen, Jia [1]
Han, Jing [1]
Chen, Xiaofeng [1]
Fan, Yueping [1]
Zheng, Hui [1]
Affiliation
[1] Jiangsu Xiangshui Hosp Chinese Med, Dept Gynaecol & Obstet, 2 Yinhe Rd, Yancheng 224600, Jiangsu, Peoples R China
Keywords
cervical cancer; cervical intraepithelial neoplasia; diagnostic markers; PPI network; XGBoost; DIGITAL REPEAT PHOTOGRAPHY; IMAGE TIME-SERIES; CELL-CYCLE; PHENOLOGY; VEGETATION; APOPTOSIS;
DOI
10.1111/jog.15458
Chinese Library Classification number
R71 [Obstetrics and Gynecology]
Discipline classification code
100211
Abstract
Background: The pathological phenotype of early-stage cervical cancer (CC) is similar to that of cervical intraepithelial neoplasia (CIN), which makes cervical precancerous lesions difficult to diagnose. Existing diagnostic methods are also partly subjective and have other limitations, so misdiagnosis and missed diagnosis are possible. Methods that assist the diagnosis of CC and CIN are therefore needed.

Methods: Using CIN and CC data from the Gene Expression Omnibus (GEO) database, the eXtreme Gradient Boosting (XGBoost) algorithm was applied to screen feature genes that separate CIN from CC for classifier construction. An incremental feature selection (IFS) curve was also used for screening. The reliability of the classifier was validated by principal component analysis (PCA) dimensionality reduction and by heat map analysis of gene expression. Differentially expressed genes between CIN and CC were then intersected with the classifier genes. Genes in the intersection were used as seeds for protein-protein interaction (PPI) network construction and random walk with restart analysis. The genes with the top 50 affinity coefficients were selected for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses to identify the biological functions that differ between CIN and CC.

Results: Peripheral blood genes of CIN and CC were analyzed, and seven genes were screened. Using these genes for classifier construction, IFS curve screening showed that the three-feature gene classifier built with the random forest model performed best. PCA dimensionality reduction and gene expression heat map analysis showed that the three-gene classifier could effectively distinguish CIN from CC.

Conclusion: A three-gene diagnostic classifier can effectively distinguish CIN patients from CC patients and provides a reference for the clinical diagnosis of early CC.
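The random walk with restart step named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the toy network, the node names, the restart probability of 0.7, and the helper names `random_walk_with_restart` and `top_k` are all assumptions chosen for illustration. The sketch assumes an undirected, unweighted PPI network stored as a symmetric adjacency dictionary, and iterates p ← (1 − r)·W·p + r·p0, where W is the column-normalized adjacency matrix and p0 spreads probability evenly over the seed genes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return a steady-state affinity score for every node.

    adjacency: dict mapping node -> set of neighbours (assumed symmetric).
    seeds:     set of seed nodes the walker restarts from.
    restart:   restart probability r (0.7 is an illustrative assumption).
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    # Initial distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on its probability split evenly over its edges.
            inflow = sum(p[m] / degree[m] for m in adjacency[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:  # stop once the distribution has stabilised
            break
    return p


def top_k(scores, k):
    """Return the k nodes with the highest affinity scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy network; real PPI edges would come from an interaction database.
ppi = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
scores = random_walk_with_restart(ppi, seeds={"A"})
ranked = top_k(scores, 3)
```

In the workflow described above, the seeds would be the genes shared by the classifier and the differentially expressed gene set, and `top_k(scores, 50)` would correspond to selecting the 50 genes with the highest affinity coefficients for enrichment analysis.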
Pages: 296-303
Number of pages: 8