Predicting few disinfection byproducts in the water distribution systems using machine learning models

Cited by: 0
Authors
Shakhawat Chowdhury [1]
Karim Asif Sattar [4]
Syed Masiur Rahman [2]
Affiliations
[1] King Fahd University of Petroleum & Minerals, Department of Civil and Environmental Engineering
[2] Research Engineer I
[3] Interdisciplinary Research Center for Smart Mobility and Logistics, King Fahd University of Petroleum & Minerals
[4] Research Engineer I
[5] Applied Research Center for Environment & Marine Studies
[6] Research Institute
[7] King Fahd University of Petroleum & Minerals
[8] IRC CBM
[9] King Fahd University of Petroleum & Minerals
Keywords
Machine learning models; Drinking water; Water distribution system; Disinfection byproducts; Model training and testing; Risk reduction;
DOI
10.1007/s11356-025-35933-3
Abstract
Concerns regarding disinfection byproducts (DBPs) in drinking water persist. DBPs are easier to measure in water treatment plants (WTPs) than in water distribution systems (WDSs), where sampling points are harder to access, especially during adverse weather conditions. Machine learning (ML) models can improve predictions of DBPs in WDSs. This study developed multiple ML models to predict trihalomethanes (THMs), haloacetic acids (HAAs), dichloroacetonitrile (DCAN), and N-nitrosodimethylamine (NDMA) in WDSs using data collected over 13 years (2008–2020) from 113 water supply systems (WSSs) in Ontario. Data were collected quarterly (four times per year) following Ontario's regulatory requirements. Four common types of ML model were trained and tested using different datasets: linear regressor (LR), random forest regressor (RFR), support vector regressor (SVR), and artificial neural networks, the last in two variants, with multiple-fold cross-validation (ANN-MV) and with single-fold validation (ANN-SV). R² values for the training datasets of the THMs, HAAs, DCAN, and NDMA models ranged from 0.533 to 0.976, 0.560 to 0.980, 0.602 to 0.993, and 0.449 to 0.858, respectively. For the testing datasets, R² ranged from 0.517 to 0.939, 0.437 to 0.945, 0.565 to 0.973, and 0.517 to 0.718, respectively. For THMs, HAAs, and DCAN, the ANN-SV models performed best, followed by the RFR models, whereas for NDMA, SVR was the best model, followed by LR. Some models predicted DBPs reliably, suggesting that they could replace costly sampling and laboratory analysis of DBPs in WDSs, thereby improving DBP control in WDSs and reducing human exposure and the associated risks.
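The abstract compares models by their R² on training and testing datasets. As a minimal sketch of that evaluation step only, the following fits a one-predictor least-squares line and reports R² on a held-out split. Everything here is an assumption made for illustration: the data are synthetic, the single predictor is hypothetical, and the helper names (`r2_score`, `fit_line`, `train_test_r2`) are not taken from the study, which used LR, RFR, SVR, and ANN models on monitoring data.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def train_test_r2(x, y, test_fraction=0.2, seed=0):
    """Shuffle, split into train/test, fit on train, return (R2_train, R2_test)."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    x_train = [x[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    x_test = [x[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    a, b = fit_line(x_train, y_train)
    r2_train = r2_score(y_train, [a + b * xi for xi in x_train])
    r2_test = r2_score(y_test, [a + b * xi for xi in x_test])
    return r2_train, r2_test


# Synthetic data (assumption): a hypothetical predictor and a noisy linear response.
rng = random.Random(42)
x = [rng.uniform(10.0, 100.0) for _ in range(200)]
y = [5.0 + 1.2 * xi + rng.gauss(0.0, 8.0) for xi in x]

r2_train, r2_test = train_test_r2(x, y)
print(f"R2 (training): {r2_train:.3f}")
print(f"R2 (testing):  {r2_test:.3f}")
```

Reporting both values matters because a testing R² well below the training R² points to overfitting, which is why the abstract lists the two ranges separately.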
Pages: 3776-3794
Number of pages: 18