Machine learning based study for the classification of Type 2 diabetes mellitus subtypes

Cited by: 2
Authors
Ordonez-Guillen, Nelson E. [1]
Gonzalez-Compean, Jose Luis [1]
Lopez-Arevalo, Ivan [1]
Contreras-Murillo, Miguel [1]
Aldana-Bobadilla, Edwin [2]
Affiliations
[1] Cinvestav Tamaulipas, Carretera Victoria Soto Marina km 5-5, Victoria 87130, Tamaulipas, Mexico
[2] CONAHCYT Ctr Invest & Estudios Avanzados IPN, Unidad Tamaulipas, Carretera Victoria Soto Marina km 5-5, Victoria 87130, Tamaulipas, Mexico
Keywords
Diabetes; Diabetes subtypes; Data-driven; Classification; HOMEOSTASIS MODEL ASSESSMENT; VALIDATION; SUBGROUPS; PREDICTION; ALGORITHM; SELECTION; GLUCOSE
DOI
10.1186/s13040-023-00340-2
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Purpose: Data-driven diabetes research has increasingly focused on the heterogeneity of the disease, aiming to support the development of more specific prognoses and treatments within so-called precision medicine. Recently, one such study found five diabetes subgroups with varying risks of complications and treatment responses. Here, we develop and assess different models for classifying Type 2 Diabetes (T2DM) subtypes through machine learning approaches, with the aim of providing a performance comparison and new insights on the matter.
Methods: We developed a three-stage methodology. In the first stage, we preprocessed the public databases NHANES (USA) and ENSANUT (Mexico) to construct a dataset of N = 10,077 adult diabetes patient records. We used N = 2,768 records for training/validation of models and left the remaining N = 7,309 for testing. In the second stage, groups of observations, each one representing a T2DM subtype, were identified. We tested different clustering techniques and strategies and validated them using internal and external clustering indices, obtaining two annotated datasets, Dset A and Dset B. In the third stage, we developed different classification models, testing four algorithms, seven input-data schemes, and two validation settings on each annotated dataset. We also tested the obtained models using a majority-vote approach to classify unseen patient records in the hold-out dataset.
Results: In bootstrap validation, performed independently for Dset A and Dset B, mean accuracies across all seven data schemes were 85.3% (+/- 9.2%) and 97.1% (+/- 3.4%), respectively. Best accuracies were 98.8% and 98.9%. Results from both validation settings were consistent. For the hold-out dataset, the class proportions were consistent with most of those reported in the literature.
Conclusion: The development of machine learning systems for the classification of diabetes subtypes is an important task for supporting physicians in fast and timely decision-making. We expect to deploy this methodology in a data analysis platform to conduct studies identifying T2DM subtypes in patient records from hospitals.
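The majority-vote step mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not say how votes are combined or how ties are resolved, so the first-seen tie-break, the function names (`majority_vote`, `vote_per_record`, `class_proportions`) and the placeholder subtype labels below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first (assumed rule)."""
    counts = Counter(labels)
    top = max(counts.values())
    for label in labels:  # iterating in input order makes ties deterministic
        if counts[label] == top:
            return label


def vote_per_record(model_predictions):
    """Combine predictions given as one list of labels per model (equal lengths)."""
    return [majority_vote(votes) for votes in zip(*model_predictions)]


def class_proportions(labels):
    """Fraction of records assigned to each class."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Hypothetical predictions from three models for four hold-out records.
# "A".."D" are placeholder subtype names, not the labels used in the paper.
preds = [
    ["A", "B", "C", "A"],
    ["A", "B", "B", "C"],
    ["B", "B", "C", "D"],
]
final = vote_per_record(preds)
print(final)                     # one voted label per record
print(class_proportions(final))  # share of each predicted subtype
```

Class proportions computed this way are the kind of quantity the Results compare against the literature for the hold-out set; the sizes reported in the abstract are consistent, since 2,768 + 7,309 = 10,077.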
Pages: 37
Related papers
50 in total
  • [1] Machine learning based study for the classification of Type 2 diabetes mellitus subtypes
    Nelson E. Ordoñez-Guillen
    Jose Luis Gonzalez-Compean
    Ivan Lopez-Arevalo
    Miguel Contreras-Murillo
    Edwin Aldana-Bobadilla
    BioData Mining, 16
  • [2] Type 2 Diabetes Mellitus: Early Detection using Machine Learning Classification
    Gowthami, S.
    Reddy, Venkata Siva
    Ahmed, Mohammed Riyaz
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (06) : 1191 - 1198
  • [3] Machine learning-based reproducible prediction of type 2 diabetes subtypes
    Tanabe, Hayato
    Sato, Masahiro
    Miyake, Akimitsu
    Shimajiri, Yoshinori
    Ojima, Takafumi
    Narita, Akira
    Saito, Haruka
    Tanaka, Kenichi
    Masuzaki, Hiroaki
    Kazama, Junichiro J.
    Katagiri, Hideki
    Tamiya, Gen
    Kawakami, Eiryo
    Shimabukuro, Michio
    DIABETOLOGIA, 2024, 67 (11) : 2446 - 2458
  • [4] Using Machine Learning for the Risk Factors Classification of Glycemic Control in Type 2 Diabetes Mellitus
    Cheng, Yi-Ling
    Wu, Ying-Ru
    Lin, Kun-Der
    Lin, Chun-Hung Richard
    Lin, I-Mei
    HEALTHCARE, 2023, 11 (08)
  • [5] Identifying comorbidity-based subtypes of type 2 diabetes: An unsupervised machine learning approach
    Icten, Zeynep
    Menzin, Joseph
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2020, 29 : 330 - 331
  • [6] Classification of Diabetes Mellitus Disease using Machine Learning
    Mohamed, Mahmoud Adnan
    Nassif, Ali Bou
    Al-Shabi, Mohammad
    SMART BIOMEDICAL AND PHYSIOLOGICAL SENSOR TECHNOLOGY XIX, 2022, 12123
  • [7] Pima Indians diabetes mellitus classification based on machine learning (ML) algorithms
    Chang, Victor
    Bailey, Jozeene
    Xu, Qianwen Ariel
    Sun, Zhili
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (22) : 16157 - 16173
  • [8] Prediction of Diabetes Mellitus Type-2 Using Machine Learning
    Apoorva, S.
    Aditya, K. S.
    Snigdha, P.
    Darshini, P.
    Sanjay, H. A.
    COMPUTATIONAL VISION AND BIO-INSPIRED COMPUTING, 2020, 1108 : 364 - 370
  • [9] Identification of ferroptosis-related genes in type 2 diabetes mellitus based on machine learning
    Wang, Sen
    Lu, Yongpan
    Chi, Tingting
    Zhang, Yixin
    Zhao, Yuli
    Guo, Huimin
    Feng, Li
    IMMUNITY INFLAMMATION AND DISEASE, 2023, 11 (10)