A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery

Cited by: 23
Authors
Wang, Hao [1 ,2 ]
Zhang, Zhaoyue [3 ]
Li, Haicheng [1 ,2 ]
Li, Jinzhao [1 ]
Li, Hanshuang [1 ]
Liu, Mingzhu [1 ,2 ]
Liang, Pengfei [1 ]
Xi, Qilemuge [1 ]
Xing, Yongqiang [4 ]
Yang, Lei [5 ]
Zuo, Yongchun [1 ,2 ]
Affiliations
[1] Inner Mongolia Univ, Coll Life Sci, State Key Lab Reprod Regulat & Breeding Grassland, Hohhot 010070, Peoples R China
[2] Inner Mongolia Wesure Date Technol Co Ltd, Inner Mongolia Intelligent Union Big Data Acad, Digital Coll, Hohhot 010010, Peoples R China
[3] Univ Elect Sci & Technol China, Ctr Informat Biol, Sch Life Sci & Technol, Chengdu 610054, Peoples R China
[4] Inner Mongolia Univ Sci & Technol, Sch Life Sci & Technol, Baotou 014010, Peoples R China
[5] Harbin Med Univ, Coll Bioinformat Sci & Technol, Harbin 150081, Peoples R China
Source
CELL AND BIOSCIENCE | 2023 / Vol. 13 / Issue 01
Keywords
Preeclampsia risk; Machine learning; Feature selection; Marker genes; Web server; SINGLE-CELL; CANCER CLASSIFICATION; DIFFERENTIATION; EXPRESSION; IDENTIFICATION; PREDICTION;
DOI
10.1186/s13578-023-00991-y
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology];
Subject classification code
071010 ; 081704 ;
Abstract
Background: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, traditional clinical methods for diagnosing PE have a high misdiagnosis rate.
Results: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) data from healthy pregnancy (38 wk) and early-onset PE (28-32 wk) to identify pathological cell subpopulations and predict PE risk. Based on machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score combined with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. Biological landscapes of placenta heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which revealed the superiority of the TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of the above two gene sets revealed that dendritic cells were closely associated with early-onset PE, and C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) could reach 0.99. For broader accessibility, we designed an accessible online web server ().
Conclusion: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE.
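The abstract reports classifier quality as an AUC. As a minimal sketch of what that number measures (not the authors' code, and using hypothetical labels and scores invented purely for illustration), AUC can be computed as the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as half:

```python
def auc_from_scores(labels, scores):
    """Return the ROC AUC as the probability that a randomly chosen
    positive sample is scored above a randomly chosen negative one.
    Ties between a positive and a negative score count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = PE, 0 = healthy; scores are made-up model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(auc_from_scores(labels, scores))
```

This pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be the usual choice for larger datasets.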
Pages: 12
Related papers
50 in total
  • [21] Machine Learning-Based Fast Seismic Risk Assessment of Building Structures
    Tang, Qi
    Dang, Ji
    Cui, Yao
    Wang, Xin
    Jia, Jinqing
    JOURNAL OF EARTHQUAKE ENGINEERING, 2022, 26 (15) : 8041 - 8062
  • [22] Machine Learning-Based Fault Injection for Hazard Analysis and Risk Assessment
    Oakes, Bentley James
    Moradi, Mehrdad
    Van Mierlo, Simon
    Vangheluwe, Hans
    Denil, Joachim
    COMPUTER SAFETY, RELIABILITY, AND SECURITY (SAFECOMP 2021), 2021, 12852 : 178 - 192
  • [23] A machine learning-based predictive model for risk assessment in airport areas
    Gugliandolo, Giovanni
    Caccamo, Maria Teresa
    Castorina, Giuseppe
    Chillemi, Domenica Letizia
    Famoso, Fabio
    Munao, Gianmarco
    Raffaele, Marcello
    Schifilliti, Valeria
    Semprebello, Agostino
    Magazu, Salvatore
    2021 IEEE 8TH INTERNATIONAL WORKSHOP ON METROLOGY FOR AEROSPACE (IEEE METROAEROSPACE), 2021, : 53 - 57
  • [24] Rapid Risk Assessment for Diabetes and Stroke a Cost-effective Method for Early Screening and Prevention
    Kamberia, Fatjona
    Jahob, Jerina
    Kamberic, Leonard
    METABOLISM-CLINICAL AND EXPERIMENTAL, 2021, 116 : 34 - 34
  • [25] Boreas: A Cost-Effective Mitigation Method for Advanced Hotspots using Machine Learning and Hardware Telemetry
    Amiraski, Maziar
    Werner, David
    Hankin, Alexander
    Sebot, Julien
    Vaidyanathan, Kaushik
    Hempstead, Mark
    2023 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, ISPASS, 2023, : 295 - 305
  • [26] A machine learning-based method for protein global model quality assessment
    Dong, Qiwen
    Chen, Yufei
    Zhou, Shuigeng
    INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 2011, 40 (04) : 417 - 425
  • [27] Machine learning-based discovery of vibrationally stable materials
    Tawfik, Sherif Abdulkader
    Rashid, Mahad
    Gupta, Sunil
    Russo, Salvy P.
    Walsh, Tiffany R.
    Venkatesh, Svetha
    NPJ COMPUTATIONAL MATERIALS, 2023, 9 (01)
  • [28] Machine learning-based discovery of vibrationally stable materials
    Sherif Abdulkader Tawfik
    Mahad Rashid
    Sunil Gupta
    Salvy P. Russo
    Tiffany R. Walsh
    Svetha Venkatesh
    npj Computational Materials, 9
  • [30] Risk assessment and cost-effective business modeling for network security
    Wei, HQ
    Frincke, D
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL I, PROCEEDINGS: INFORMATION SYSTEMS, TECHNOLOGIES AND APPLICATIONS, 2003, : 316 - 321