Crohn's Disease Prediction Using Sequence Based Machine Learning Analysis of Human Microbiome

Cited by: 3
Authors
Unal, Metehan [1]
Bostanci, Erkan [1]
Ozkul, Ceren [2]
Acici, Koray [3]
Asuroglu, Tunc [4]
Guzel, Mehmet Serdar [1]
Affiliations
[1] Ankara Univ, Dept Comp Engn, TR-06830 Ankara, Turkiye
[2] Hacettepe Univ, Fac Pharm, Dept Pharmaceut Microbiol, TR-06110 Ankara, Turkiye
[3] Ankara Univ, Dept Artificial Intelligence & Data Engn, TR-06830 Ankara, Turkiye
[4] Tampere Univ, Fac Med & Hlth Technol, FI-33720 Tampere, Finland
Keywords
microbiota; machine learning; bowel disease; bioinformatics; algorithms
DOI
10.3390/diagnostics13172835
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Human microbiota refers to the trillions of microorganisms that inhabit our bodies and have been found to have a substantial impact on human health and disease. Sampling the microbiota generates massive quantities of data that can be analyzed with Machine Learning algorithms. In this study, we employed several modern Machine Learning techniques to predict Inflammatory Bowel Disease using raw sequence data. The dataset was obtained from NCBI preprocessed graph representations and converted into a structured form. Seven well-known Machine Learning frameworks were used: Random Forest, Support Vector Machines, Extreme Gradient Boosting, Light Gradient Boosting Machine (LightGBM), Gaussian Naive Bayes, Logistic Regression, and k-Nearest Neighbor. Grid Search was employed for hyperparameter optimization. The performance of the Machine Learning models was evaluated using accuracy, precision, F-score, kappa, and area under the receiver operating characteristic curve. Additionally, McNemar's test was conducted to assess the statistical significance of the experimental results. The data was constructed using k-mer lengths of 3, 4 and 5. The LightGBM model outperformed the other models, with 67.24%, 74.63% and 76.47% accuracy for k-mer lengths of 3, 4 and 5, respectively. The LightGBM model also demonstrated the best performance in each metric. The study showed promising results for predicting disease from raw sequence data. Finally, McNemar's test found statistically significant differences between the Machine Learning approaches.
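The abstract mentions two steps that a short example may make more concrete: building features from k-mers of length 3, 4 and 5, and comparing classifiers with McNemar's test. The sketch below is only an illustration under stated assumptions, not the authors' pipeline. It assumes DNA reads over the alphabet A, C, G, T, relative k-mer frequencies as features, and the exact (binomial) form of McNemar's test; the record does not say which variant the paper used. The function names `kmer_vocabulary`, `kmer_frequencies` and `mcnemar_exact` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb

# Assumption: DNA reads; windows containing other symbols (e.g. N) are ignored.
ALPHABET = "ACGT"


def kmer_vocabulary(k):
    """Return all 4**k possible k-mers in a fixed lexicographic order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def kmer_frequencies(sequence, k):
    """Relative frequency of each k-mer, counted over overlapping windows."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = kmer_vocabulary(k)
    total = sum(counts[kmer] for kmer in vocab)
    if total == 0:
        # Sequence shorter than k, or no valid windows at all.
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value for two classifiers on the same samples."""
    # b: samples classifier A gets right and B gets wrong; c: the reverse.
    b = sum(a == t and p != t for t, a, p in zip(y_true, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        # The classifiers never disagree, so there is no evidence of a difference.
        return 1.0
    # Under the null hypothesis, min(b, c) follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    features = kmer_frequencies("ACGTACGTTGCA", 3)
    print(len(features))  # one feature per possible 3-mer
    print(mcnemar_exact([1] * 10, [1] * 10, [0] * 8 + [1] * 2))
```

Each sequence becomes a fixed-length vector (64, 256 or 1024 entries for k = 3, 4 or 5), which is the kind of structured input that tabular classifiers such as those listed in the abstract can consume. Only the samples on which two classifiers disagree enter the McNemar statistic.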
Pages: 18
Related papers
50 in total
  • [31] Comprehensive data optimization and risk prediction framework: machine learning methods for inflammatory bowel disease prediction based on the human gut microbiome data
    Peng, Yan
    Liu, Yue
    Liu, Yifei
    Wang, Jie
    FRONTIERS IN MICROBIOLOGY, 2024, 15
  • [32] Targeting the microbiome in Crohn's disease
    Borody, Thomas J.
    Dolai, Sibasish
    Gunaratne, Anoja W.
    Clancy, Robert L.
    EXPERT REVIEW OF CLINICAL IMMUNOLOGY, 2022, 18 (09) : 873 - 877
  • [33] Promoter analysis and prediction in the human genome using sequence-based deep learning models
    Umarov, Ramzan
    Kuwahara, Hiroyuki
    Li, Yu
    Gao, Xin
    Solovyev, Victor
    BIOINFORMATICS, 2019, 35 (16) : 2730 - 2737
  • [34] Analysis of diagnostic genes and molecular mechanisms of Crohn's disease and colon cancer based on machine learning algorithms
    Xiao, Jie
    Liang, Junyao
    Zhou, Tao
    Zhou, Man
    Zhang, Dexu
    Feng, Hui
    Tang, Chusen
    Zhou, Qian
    Yang, Weiqing
    Tan, Xiaoqin
    Zhang, Wanjia
    Xu, Yin
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [35] Pediatric Crohn's disease diagnosis aid via genomic analysis and machine learning
    Zheng, Zhiwei
    Zhan, Sha
    Zhou, Yongmao
    Huang, Ganghua
    Chen, Pan
    Li, Baofei
    FRONTIERS IN PEDIATRICS, 2023, 11
  • [36] Prediction of Coronary Heart Disease using Machine Learning: An Experimental Analysis
    Gonsalves, Amanda H.
    Thabtah, Fadi
    Mohammad, Rami Mustafa A.
    Singh, Gurpreet
    ICDLT 2019: 2019 3RD INTERNATIONAL CONFERENCE ON DEEP LEARNING TECHNOLOGIES, 2019, : 51 - 56
  • [37] Analysis and Prediction of Cardio Vascular Disease using Machine Learning Classifiers
    Kumar, N. Komal
    Sindhu, G. Sarika
    Prashanthi, D. Krishna
    Sulthana, A. Shaeen
    2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020, : 15 - 21
  • [38] Machine learning in predicting postoperative complications in Crohn's disease
    Zhang, Li-Fan
    Chen, Liu-Xiang
    Yang, Wen-Juan
    Hu, Bing
    WORLD JOURNAL OF GASTROINTESTINAL SURGERY, 2024, 16 (08):
  • [39] Disease Prediction Based on Symptoms Given by User Using Machine Learning
    Divya A.
    Deepika B.
    Durga Akhila C.H.
    Tonika Devi A.
    Lavanya B.
    Sravya Teja E.
    SN Computer Science, 3 (6)
  • [40] Development of Time-Aggregated Machine Learning Model for Relapse Prediction in Pediatric Crohn's Disease
    Jang, Sooyoung
    Yu, Jaeyong
    Park, Sowon
    Lim, Hyeji
    Koh, Hong
    Park, Yu Rang
    CLINICAL AND TRANSLATIONAL GASTROENTEROLOGY, 2025, 16 (01)