Leptospirosis modelling using hydrometeorological indices and random forest machine learning

Cited by: 0

Authors:

Veianthan Jayaramu

Zed Zulkafli

Simon De Stercke

Wouter Buytaert

Fariq Rahmat

Ribhan Zafira Abdul Rahman

Asnor Juraiza Ishak

Wardah Tahir

Jamalludin Ab Rahman

Nik Mohd Hafiz Mohd Fuzi

Affiliations:

[1] Universiti Putra Malaysia, Department of Civil Engineering

[2] Imperial College London, Department of Civil and Environmental Engineering

[3] Universiti Putra Malaysia, Department of Electrical and Electronic Engineering

[4] Universiti Teknologi Mara, Flood Control Research Group, Faculty of Civil Engineering

[5] International Islamic University Malaysia, Department of Community Medicine, Kulliyyah of Medicine

[6] Ministry of Health Malaysia, Kelantan State Health Department

Source:

International Journal of Biometeorology | 2023 / Vol. 67

Keywords:

Leptospirosis; Hydrometeorological indices; Cross-correlation analysis; Random forest; Variable importance; Feature selection

DOI: Not available

Abstract:

Leptospirosis is a zoonosis that has been linked to hydrometeorological variability. Hydrometeorological averages and extremes have previously been used as drivers in the statistical prediction of the disease, but their relative importance and predictive capacity remain poorly understood. This study explored the use of a random forest classifier to analyze the relative importance of hydrometeorological indices in developing a leptospirosis model and to evaluate model performance by the type of indices used, with case data from three districts in Kelantan, Malaysia, that experience annual monsoonal rainfall and flooding. First, hydrometeorological data including rainfall, streamflow, water level, relative humidity, and temperature were transformed into 164 weekly average and extreme indices in accordance with the Expert Team on Climate Change Detection and Indices (ETCCDI). Then, weekly case occurrences were classified into the binary classes “high” and “low” based on an average threshold. Seventeen models based on “average,” “extreme,” and “mixed” indices were trained by optimizing the feature subsets according to the model-computed mean decrease Gini (MDG) scores. Variable importance was assessed through cross-correlation analysis and the MDG score. The average and extreme models showed similar prediction accuracy ranges (61.5–76.1% and 72.3–77.0%, respectively), while the mixed models showed an improvement (71.7–82.6% prediction accuracy). An extreme model was the most sensitive, while an average model was the most specific. The time lags associated with the driving indices agreed with the seasonality of the monsoon. The rainfall variable (extreme) was the most important in classifying leptospirosis occurrence, while streamflow was the least important despite showing higher correlations with leptospirosis.
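
As a rough illustration of two steps named in the abstract (binary labelling of weekly cases against an average threshold, and lagged cross-correlation between an index and the case series), a minimal Python sketch might look like the following. It is not the authors' code: the data are synthetic, the function names are hypothetical, and the rule "high means strictly above the series mean" is an assumption, since the abstract does not state how the threshold is applied. The random forest training and MDG ranking are not shown.

```python
from math import sqrt
from statistics import mean


def label_weeks(cases):
    """Label a week 'high' if its case count exceeds the series mean, else 'low'.

    Assumption: a strict '>' comparison against the overall mean.
    """
    threshold = mean(cases)
    return ["high" if c > threshold else "low" for c in cases]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlations(index, cases, max_lag):
    """Correlate index[t - lag] with cases[t] for lag = 0..max_lag (in weeks)."""
    result = {}
    for lag in range(max_lag + 1):
        x = index[: len(index) - lag]  # index values 'lag' weeks earlier
        y = cases[lag:]                # case counts aligned to those weeks
        result[lag] = pearson(x, y)
    return result


# Synthetic illustration only: cases echo the rainfall index two weeks later.
rainfall_index = [0, 5, 1, 0, 8, 2, 0, 3, 9, 1, 0, 4]
weekly_cases = [0, 0] + rainfall_index[:-2]

labels = label_weeks(weekly_cases)
corr = lagged_correlations(rainfall_index, weekly_cases, max_lag=4)
best_lag = max(corr, key=corr.get)  # lag with the strongest positive correlation
```

In this toy series the strongest correlation falls at the two-week lag by construction. With real data, the lag profile of each index would need to be interpreted against local seasonality, as the abstract does for the monsoon.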

Pages: 423-437

Number of pages: 14
