Machine learning based study for the classification of Type 2 diabetes mellitus subtypes

Cited by: 2
Authors
Ordonez-Guillen, Nelson E. [1]
Gonzalez-Compean, Jose Luis [1]
Lopez-Arevalo, Ivan [1]
Contreras-Murillo, Miguel [1]
Aldana-Bobadilla, Edwin [2]
Affiliations
[1] Cinvestav Tamaulipas, Carretera Victoria Soto Marina km 5-5, Victoria 87130, Tamaulipas, Mexico
[2] CONAHCYT Ctr Invest & Estudios Avanzados IPN, Unidad Tamaulipas, Carretera Victoria Soto Marina km 5-5, Victoria 87130, Tamaulipas, Mexico
Keywords
Diabetes; Diabetes subtypes; Data-driven; Classification; Homeostasis model assessment; Validation; Subgroups; Prediction; Algorithm; Selection; Glucose
DOI
10.1186/s13040-023-00340-2
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Purpose: Data-driven diabetes research has shown growing interest in exploring the heterogeneity of the disease, aiming to support the development of more specific prognoses and treatments within so-called precision medicine. Recently, one such study found five diabetes subgroups with varying risks of complications and treatment responses. Here, we develop and assess different models for classifying Type 2 Diabetes Mellitus (T2DM) subtypes through machine learning approaches, with the aim of providing a performance comparison and new insights on the matter.
Methods: We developed a three-stage methodology starting with the preprocessing of the public databases NHANES (USA) and ENSANUT (Mexico) to construct a dataset with N = 10,077 adult diabetes patient records. We used N = 2,768 records for training/validation of models and left the remaining N = 7,309 for testing. In the second stage, groups of observations, each one representing a T2DM subtype, were identified. We tested different clustering techniques and strategies and validated them using internal and external clustering indices, obtaining two annotated datasets, Dset A and Dset B. In the third stage, we developed different classification models, testing four algorithms, seven input-data schemes, and two validation settings on each annotated dataset. We also tested the obtained models using a majority-vote approach for classifying unseen patient records in the hold-out dataset.
Results: From the independently obtained bootstrap validation for Dset A and Dset B, mean accuracies across all seven data schemes were 85.3% (+/- 9.2%) and 97.1% (+/- 3.4%), respectively. Best accuracies were 98.8% and 98.9%. Results from both validation settings were consistent. For the hold-out dataset, results were consistent with most of those reported in the literature in terms of class proportions.
Conclusion: The development of machine learning systems for the classification of diabetes subtypes constitutes an important task to support physicians in fast and timely decision-making. We expect to deploy this methodology in a data analysis platform to conduct studies for identifying T2DM subtypes in patient records from hospitals.
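The majority-vote step mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `majority_vote` and `classify_records`, the subtype labels "A", "B", "C", and the tie-breaking rule (the label seen first wins) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the most frequent label among the models' predictions.

    Assumption: ties go to the label that appears first, which is how
    Counter.most_common orders elements with equal counts.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


def classify_records(per_model_predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine predictions from several models, one vote per model.

    per_model_predictions[m][i] is the label model m assigns to record i.
    """
    # zip(*...) regroups the votes so each tuple holds all votes for one record.
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of three trained classifiers on three unseen records.
model_outputs = [
    ["A", "B", "C"],
    ["A", "C", "C"],
    ["B", "B", "C"],
]
print(classify_records(model_outputs))  # ['A', 'B', 'C']
```

With an odd number of models and two candidate labels a tie cannot occur; with more labels, as in subtype classification, a tie-breaking rule such as the one assumed here is needed.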
Pages: 37