Improving river water quality prediction with hybrid machine learning and temporal analysis

Cited by: 1
Authors
del Castillo, Alberto Fernandez [1]
Garibay, Marycarmen Verduzco [1]
Diaz-Vazquez, Diego [1]
Yebra-Montes, Carlos [2]
Brown, Lee E.
Johnson, Andrew [3]
Garcia-Gonzalez, Alejandro [4]
Gradilla-Hernandez, Misael Sebastian [1]
Affiliations
[1] Tecnol Monterrey, Lab Sostenibil & Cambio Climat, Escuela Ingn & Ciencias, Av Gen Ramon Corona 2514, Zapopan 45138, Jalisco, Mexico
[2] Univ Nacl Autonoma Mexico, ENES Leon, Predio Saucillo & Potrero, Blvd UNAM 2011, Leon 37684, Guanajuato, Mexico
[3] Univ Leeds, Sch Geog & Waterleeds, Leeds LS2 9JT, England
[4] Escuela Med & Ciencias Salud, Tecnol Monterrey, Nuevo Mex, CP, Ave Gen Ramon Corona 2514, Zapopan 45138, Jalisco, Mexico
Keywords
Water Quality Index; Highly polluted river; Time series analysis; Cluster analysis; Monitoring network; Data Science; TIME-SERIES; INDEX; INFERENCE; CAPACITY; IMPACT;
DOI
10.1016/j.ecoinf.2024.102655
Chinese Library Classification number
Q14 [Ecology (Bioecology)];
Subject classification code
071012; 0713;
Abstract
River systems provide multiple ecosystem services to society globally, but these are already degraded or threatened in many areas of the world due to water quality issues linked to diffuse and point-source pollutant inputs. Water quality evaluation is essential to develop remediation and management strategies. Computational tools such as machine learning based predictive models have been developed to improve monitoring network capabilities. Model performance is reduced when datasets containing redundant information are used for training, whereas selecting the most representative and variable water quality scenarios could result in higher precision. This study analyzed historical water quality behavior in the Santiago River, Mexico, to identify the most variable and representative data available to train machine learning models (Adaptive Neuro Fuzzy Inference System - ANFIS, Artificial Neural Network - ANN, and Support Vector Machine - SVM). Thirteen monitoring sites were clustered according to their water quality variability from 2009 to 2022. Subsequently, a Time Series Analysis (TSA) was used to select the most representative monitoring station from each cluster. Data from 6 of the 13 monitoring sites were retained for the Best Training Subset (BTS), which was used to train restricted models that performed with similar (ANN and SVM) or higher (ANFIS) prediction accuracy (in terms of RMSE, MAE, MSE and R2) for both training and testing. This study provides evidence that water quality data can contain redundant information that does not improve machine learning model performance and in turn leads to overtraining. Combined analytical approaches can maximize the representativeness and variability of data selected for machine learning applications, leading to improved prediction.
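The abstract compares models in terms of RMSE, MAE, MSE and R2. As a minimal sketch of how those four metrics are commonly computed, assuming paired lists of observed and predicted values, the helper below uses only the Python standard library. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n                             # mean squared error
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    # R2 is undefined when the observations have no variance
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical index values, for illustration only (not data from the study)
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 63.0, 69.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower RMSE, MAE and MSE and an R2 closer to 1 indicate a closer fit. Computing the same metrics on both training and testing data, as the abstract describes, is one way to check whether a model trained on a reduced subset still generalizes.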
Pages: 12