A Data-Driven Comparative Analysis of Machine-Learning Models for Familial Hypercholesterolemia Detection

Authors
Kocejko, Tomasz [1]
Affiliations
[1] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Dept Biomed Engn, PL-80233 Gdansk, Poland
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 23
Keywords
machine learning; familial hypercholesterolemia; DLCN; model ensembles; DIAGNOSIS; POPULATION;
DOI
10.3390/app142311187
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Featured Application: The presented study can contribute to improving the classification of familial hypercholesterolemia and may help reduce the number of undiagnosed cases of the disease.
Abstract: This study assesses the probability of familial hypercholesterolemia (FH) using several algorithms (CatBoost, XGBoost, Random Forest, SVM) and their ensembles, leveraging electronic health record data. The primary objective is to explore a method for estimating FH probability that improves on the currently recommended Dutch Lipid Clinic Network (DLCN) score. The models were trained on the largest Polish cohort of patients enrolled in an FH clinic, all of whom underwent genetic testing for FH-associated mutations. The initial dataset comprised over 100 parameters per patient, which were reduced to 48 clinically accessible features to ensure applicability in routine outpatient settings. To preserve balance, the data were stratified according to DLCN score ranges (0-2, 3-5, 6-8, and >= 9), which represent increasing levels of FH likelihood. The dataset was then split into training and test sets at an 80/20 ratio. Machine-learning models were trained, with hyperparameters optimized via grid search. The accuracy of the DLCN score in predicting FH was first evaluated as the proportion of patients with a positive DNA test among those with a DLCN score of 6 or above, the threshold for genetic testing. The DLCN score demonstrated an accuracy of approximately 40%. In contrast, the CatBoost model and its ensembles achieved over 80% accuracy. While the DLCN score remains a clinically valuable tool, its diagnostic accuracy is limited. The findings indicate that the ML models offer a substantial improvement in the precision of FH diagnosis, demonstrating their potential to enhance clinical decision making in identifying patients with FH.
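The stratification, the 80/20 split, and the DLCN baseline described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the author's pipeline: the record fields `dlcn` and `dna_positive`, the helper names, and the synthetic records are all assumptions made for illustration, and the resulting numbers say nothing about the study's cohort.

```python
import random
from collections import defaultdict


def dlcn_bin(score):
    """Map a DLCN score to one of the four ranges named in the abstract."""
    if score <= 2:
        return "0-2"
    if score <= 5:
        return "3-5"
    if score <= 8:
        return "6-8"
    return ">=9"


def stratified_split(records, test_fraction=0.2, seed=0):
    """Split records into train/test sets, sampling each DLCN bin separately."""
    rng = random.Random(seed)
    by_bin = defaultdict(list)
    for rec in records:
        by_bin[dlcn_bin(rec["dlcn"])].append(rec)
    train, test = [], []
    for bin_name in sorted(by_bin):
        group = list(by_bin[bin_name])
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def dlcn_positive_rate(records, threshold=6):
    """Share of patients at or above the threshold who have a positive DNA test."""
    flagged = [r for r in records if r["dlcn"] >= threshold]
    if not flagged:
        return float("nan")
    return sum(r["dna_positive"] for r in flagged) / len(flagged)


# Synthetic, made-up records (hypothetical field names, not study data).
records = [
    {"dlcn": score, "dna_positive": score % 3 == 0}
    for score in range(12)
    for _ in range(10)
]
train, test = stratified_split(records)
baseline = dlcn_positive_rate(records)
```

Splitting within each score range keeps the four likelihood levels represented in the same proportion in both sets, which is the balance the abstract refers to. The grid search over model hyperparameters would then run on `train` only, with `test` held back for the final comparison against the DLCN baseline.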
Pages: 13