Leptospirosis modelling using hydrometeorological indices and random forest machine learning

Cited by: 0
Authors
Veianthan Jayaramu
Zed Zulkafli
Simon De Stercke
Wouter Buytaert
Fariq Rahmat
Ribhan Zafira Abdul Rahman
Asnor Juraiza Ishak
Wardah Tahir
Jamalludin Ab Rahman
Nik Mohd Hafiz Mohd Fuzi
Affiliations
[1] Universiti Putra Malaysia, Department of Civil Engineering
[2] Imperial College London, Department of Civil and Environmental Engineering
[3] Universiti Putra Malaysia, Department of Electrical and Electronic Engineering
[4] Universiti Teknologi Mara, Flood Control Research Group, Faculty of Civil Engineering
[5] International Islamic University Malaysia, Department of Community Medicine, Kulliyyah of Medicine
[6] Ministry of Health Malaysia, Kelantan State Health Department
Source
International Journal of Biometeorology | 2023 / Vol. 67
Keywords
Leptospirosis; Hydrometeorological indices; Cross-correlation analysis; Random forest; Variable importance; Feature selection
DOI
Not available
Abstract
Leptospirosis is a zoonosis that has been linked to hydrometeorological variability. Hydrometeorological averages and extremes have been used before as drivers in the statistical prediction of disease. However, their importance and predictive capacity are still little known. In this study, the use of a random forest classifier was explored to analyze the relative importance of hydrometeorological indices in developing the leptospirosis model and to evaluate the performance of models based on the type of indices used, using case data from three districts in Kelantan, Malaysia, that experience annual monsoonal rainfall and flooding. First, hydrometeorological data including rainfall, streamflow, water level, relative humidity, and temperature were transformed into 164 weekly average and extreme indices in accordance with the Expert Team on Climate Change Detection and Indices (ETCCDI). Then, weekly case occurrences were classified into binary classes “high” and “low” based on an average threshold. Seventeen models based on “average,” “extreme,” and “mixed” indices were trained by optimizing the feature subsets based on the model computed mean decrease Gini (MDG) scores. The variable importance was assessed through cross-correlation analysis and the MDG score. The average and extreme models showed similar prediction accuracy ranges (61.5–76.1% and 72.3–77.0%) while the mixed models showed an improvement (71.7–82.6% prediction accuracy). An extreme model was the most sensitive while an average model was the most specific. The time lag associated with the driving indices agreed with the seasonality of the monsoon. The rainfall variable (extreme) was the most important in classifying the leptospirosis occurrence while streamflow was the least important despite showing higher correlations with leptospirosis.
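Two of the preprocessing steps named in the abstract, the binary "high"/"low" labelling of weekly cases and the lagged cross-correlation between an index and case counts, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the function names, the choice of the mean as the "average threshold", the rule that a week strictly above the mean is "high", and the use of Pearson correlation are all assumptions. The random forest training and MDG ranking are not shown.

```python
from statistics import mean


def label_weeks(cases):
    """Label each week 'high' or 'low' against the mean weekly case count."""
    threshold = mean(cases)
    # Assumption: weeks strictly above the mean count as "high".
    return ["high" if c > threshold else "low" for c in cases]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant series; reported as 0 here
    return cov / (var_x * var_y) ** 0.5


def lagged_correlations(index, cases, max_lag):
    """Correlate index[t - lag] with cases[t] for lag = 0..max_lag weeks."""
    result = {}
    for lag in range(max_lag + 1):
        x = index[: len(index) - lag]  # index values, shifted back by `lag`
        y = cases[lag:]                # case counts `lag` weeks later
        result[lag] = pearson(x, y)
    return result


# Synthetic data for illustration only: cases echo rainfall two weeks later.
rainfall = [5, 40, 10, 80, 15, 5, 60, 20, 10, 90, 5, 30]
cases = [1, 1] + [r // 10 for r in rainfall[:-2]]

labels = label_weeks(cases)
corr = lagged_correlations(rainfall, cases, max_lag=4)
best_lag = max(corr, key=corr.get)  # lag with the strongest positive correlation
```

On this toy series the strongest correlation falls at a lag of two weeks, because that delay was built into the synthetic cases. With real data the lag would be read from the cross-correlation results rather than assumed.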
Pages: 423-437
Number of pages: 14