Explainable machine learning models for estimating daily dissolved oxygen concentration of the Tualatin River

Cited by: 4
Authors
Li, Shuguang [1 ]
Qasem, Sultan Noman [2 ,3 ]
Band, Shahab S. [4 ,5 ,9 ,10 ]
Ameri, Rasoul [6 ]
Pai, Hao-Ting [7 ,11 ]
Mehdizadeh, Saeid [8 ]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Imam Mohammad Ibn Saud Islamic Univ IMSIU, Coll Comp & Informat Sci, Comp Sci Dept, Riyadh, Saudi Arabia
[3] Taiz Univ, Fac Appl Sci, Comp Sci Dept, Taizi, Yemen
[4] Natl Yunlin Univ Sci & Technol, Future Technol Res Ctr, Touliu, Taiwan
[5] Natl Yunlin Univ Sci & Technol, Int Grad Sch Artificial Intelligence, Dept Informat Management, Touliu, Taiwan
[6] Natl Yunlin Univ Sci & Technol, Dept Informat Management, Touliu, Taiwan
[7] Natl Pingtung Univ, Big Data Applicat Business, Pingtung, Taiwan
[8] Urmia Univ, Water Engn Dept, Orumiyeh, Iran
[9] Natl Yunlin Univ Sci & Technol, Future Technol Res Ctr, 123 Univ Rd,Sect 3, Touliu 64002, Yunlin, Taiwan
[10] Natl Yunlin Univ Sci & Technol, Int Grad Sch Artificial Intelligence, Dept Informat Management, 123 Univ Rd,Sect 3, Touliu 64002, Yunlin, Taiwan
[11] Natl Pingtung Univ, Bachelor Program Big Data Applicat Business, 51 Minsheng E Rd, Pingtung 900392, Pingtung, Taiwan
Keywords
Explainable machine learning; dissolved oxygen concentration; estimation; SHapley additive explanations; SUPPORT VECTOR MACHINE; WATER-QUALITY; PREDICTION; REGRESSION;
DOI
10.1080/19942060.2024.2304094
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Monitoring river water quality is fundamental to hydrological research, and the dissolved oxygen (DO) concentration is one of its most significant indicators. The current study aimed to estimate the minimum, maximum, and mean DO concentrations (DO min, DO max, DO mean) at a gauging station on the Tualatin River, United States. To that end, four machine learning models were established: support vector regression (SVR), multi-layer perceptron (MLP), random forest (RF), and gradient boosting (GB). Model accuracy was assessed with the root mean square error (RMSE), mean absolute error (MAE), coefficient of correlation (R), and Nash-Sutcliffe efficiency (NSE). The results showed that SVR and MLP outperformed RF and GB, and that SVR was the best-performing method for estimating DO min, DO max, and DO mean. The best error statistics in the testing phase were obtained by the SVR model with the full set of four inputs when estimating DO mean (RMSE = 0.663 mg/l, MAE = 0.508 mg/l, R = 0.945, NSE = 0.875). Finally, the superior models (i.e. the SVR models) were explained using SHapley Additive exPlanations (SHAP), applied for the first time to the estimation of DO concentration. Evaluating the explainability of machine learning models provides useful information about the impact of each input variable used in model development. It was concluded that specific conductance (SC), followed by water temperature (WT), contributed most to the estimates of DO min, DO max, and DO mean.
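The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of RMSE, MAE, Pearson's R, and NSE, which are assumed here and not taken from the paper itself. The function names and the sample DO values (in mg/l) are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def rmse(obs, est):
    """Root mean square error between observed and estimated values."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))

def mae(obs, est):
    """Mean absolute error between observed and estimated values."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)

def pearson_r(obs, est):
    """Pearson coefficient of correlation (R)."""
    mean_o = sum(obs) / len(obs)
    mean_e = sum(est) / len(est)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(ss_o * ss_e)

def nse(obs, est):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared error
    to the variance of the observations around their mean."""
    mean_o = sum(obs) / len(obs)
    ss_err = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_obs = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - ss_err / ss_obs

# Hypothetical observed and estimated DO values in mg/l (not from the study).
observed = [8.0, 9.0, 10.0, 11.0]
estimated = [8.5, 9.5, 9.5, 10.5]

print("RMSE:", rmse(observed, estimated))
print("MAE: ", mae(observed, estimated))
print("R:   ", pearson_r(observed, estimated))
print("NSE: ", nse(observed, estimated))
```

Under these definitions, lower RMSE and MAE indicate smaller errors, while R and NSE values closer to 1 indicate better agreement. An NSE of 0 corresponds to a model that does no better than always predicting the observed mean.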
Pages: 18