Improving river water quality prediction with hybrid machine learning and temporal analysis

Cited by: 1
Authors
del Castillo, Alberto Fernandez [1]
Garibay, Marycarmen Verduzco [1]
Diaz-Vazquez, Diego [1]
Yebra-Montes, Carlos [2]
Brown, Lee E.
Johnson, Andrew [3]
Garcia-Gonzalez, Alejandro [4]
Gradilla-Hernandez, Misael Sebastian [1]
Affiliations
[1] Tecnol Monterrey, Lab Sostenibil & Cambio Climat, Escuela Ingn & Ciencias, Av Gen Ramon Corona 2514, Zapopan 45138, Jalisco, Mexico
[2] Univ Nacl Autonoma Mexico, ENES Leon, Predio Saucillo & Potrero, Blvd UNAM 2011, Leon 37684, Guanajuato, Mexico
[3] Univ Leeds, Sch Geog & Waterleeds, Leeds LS2 9JT, England
[4] Escuela Med & Ciencias Salud, Tecnol Monterrey, Nuevo Mex, CP, Ave Gen Ramon Corona 2514, Zapopan 45138, Jalisco, Mexico
Keywords
Water Quality Index; Highly polluted river; Time series analysis; Cluster analysis; Monitoring network; Data Science; TIME-SERIES; INDEX; INFERENCE; CAPACITY; IMPACT
DOI
10.1016/j.ecoinf.2024.102655
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification codes
071012; 0713
Abstract
River systems provide multiple ecosystem services to society globally, but these are already degraded or threatened in many areas of the world due to water quality issues linked to diffuse and point-source pollutant inputs. Water quality evaluation is essential to develop remediation and management strategies. Computational tools such as machine learning-based predictive models have been developed to improve monitoring network capabilities. Model performance is reduced when datasets containing redundant information are used for training; conversely, selecting the most representative and variable water quality scenarios could result in higher precision. This study analyzed historical water quality behavior in the Santiago River, Mexico, to identify the most variable and representative data available to train machine learning models (Adaptive Neuro-Fuzzy Inference System, ANFIS; Artificial Neural Network, ANN; and Support Vector Machine, SVM). Thirteen monitoring sites were clustered according to their water quality variability from 2009 to 2022. Subsequently, a Time Series Analysis (TSA) was used to select the most representative monitoring station from each cluster. Data for 6 of the 13 monitoring sites were retained for the Best Training Subset (BTS), which was used to train restricted models that performed with similar (ANN and SVM) or higher (ANFIS) prediction accuracy (in terms of RMSE, MAE, MSE and R²) for both training and testing. This study provides evidence that water quality data can contain redundant information that does not improve machine learning model performance and instead leads to overtraining. Combined analytical approaches can maximize the representativeness and variability of data selected for machine learning applications, leading to improved prediction.
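
The abstract compares models by RMSE, MAE, MSE and R². As a minimal sketch of how these four metrics are commonly computed from paired observed and predicted values, the following uses the standard textbook definitions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study, whose exact evaluation code is not given here.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Mean squared error and mean absolute error.
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n

    # Coefficient of determination: 1 - residual SS / total SS.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical Water Quality Index values, for illustration only.
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 63.0, 69.0]
metrics = regression_metrics(observed, predicted)
```

Applying the same function to the training and testing partitions of a full and a restricted model would give a like-for-like comparison of the kind the abstract reports.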
Pages: 12