Explainable machine learning models for estimating daily dissolved oxygen concentration of the Tualatin River

Cited by: 4
Authors
Li, Shuguang [1]
Qasem, Sultan Noman [2, 3]
Band, Shahab S. [4, 5, 9, 10]
Ameri, Rasoul [6]
Pai, Hao-Ting [7, 11]
Mehdizadeh, Saeid [8]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Imam Mohammad Ibn Saud Islamic Univ IMSIU, Coll Comp & Informat Sci, Comp Sci Dept, Riyadh, Saudi Arabia
[3] Taiz Univ, Fac Appl Sci, Comp Sci Dept, Taizi, Yemen
[4] Natl Yunlin Univ Sci & Technol, Future Technol Res Ctr, Touliu, Taiwan
[5] Natl Yunlin Univ Sci & Technol, Int Grad Sch Artificial Intelligence, Dept Informat Management, Touliu, Taiwan
[6] Natl Yunlin Univ Sci & Technol, Dept Informat Management, Touliu, Taiwan
[7] Natl Pingtung Univ, Big Data Applicat Business, Pingtung, Taiwan
[8] Urmia Univ, Water Engn Dept, Orumiyeh, Iran
[9] Natl Yunlin Univ Sci & Technol, Future Technol Res Ctr, 123 Univ Rd,Sect 3, Touliu 64002, Yunlin, Taiwan
[10] Natl Yunlin Univ Sci & Technol, Int Grad Sch Artificial Intelligence, Dept Informat Management, 123 Univ Rd,Sect 3, Touliu 64002, Yunlin, Taiwan
[11] Natl Pingtung Univ, Bachelor Program Big Data Applicat Business, 51 Minsheng E Rd, Pingtung 900392, Pingtung, Taiwan
Keywords
Explainable machine learning; dissolved oxygen concentration; estimation; SHapley additive explanations; SUPPORT VECTOR MACHINE; WATER-QUALITY; PREDICTION; REGRESSION;
DOI
10.1080/19942060.2024.2304094
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Monitoring river water quality is fundamental to hydrological research, and the dissolved oxygen (DO) concentration is one of its most significant indicators. This study estimated the minimum, maximum, and mean DO concentrations (DO min, DO max, DO mean) at a gauging station on the Tualatin River, United States. To that end, four machine learning models were established: support vector regression (SVR), multi-layer perceptron (MLP), random forest (RF), and gradient boosting (GB). Model accuracy was assessed with the root mean square error (RMSE), mean absolute error (MAE), coefficient of correlation (R), and Nash-Sutcliffe efficiency (NSE). The SVR and MLP models outperformed the RF and GB models, and SVR was concluded to be the best-performing method for estimating DO min, DO max, and DO mean. The best testing-phase error statistics were obtained by the SVR model with the full set of four inputs when estimating the DO mean concentration (RMSE = 0.663 mg/l, MAE = 0.508 mg/l, R = 0.945, NSE = 0.875). Finally, the superior models (i.e. the SVR models) were explained using SHapley Additive exPlanations (SHAP), applied for the first time to DO concentration estimation. Evaluating the explainability of machine learning models provides useful information about the impact of each input used in model development. Specific conductance (SC), followed by water temperature (WT), contributed most to the estimates of DO min, DO max, and DO mean.
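The abstract names four accuracy metrics but does not give their formulas. As a minimal sketch, the functions below use the standard textbook definitions of RMSE, MAE, Pearson's R, and NSE; the function names and the sample values are illustrative assumptions and are not taken from the study.

```python
import math


def rmse(observed, estimated):
    """Root mean square error (same units as the data, e.g. mg/l)."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error (same units as the data)."""
    n = len(observed)
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / n


def pearson_r(observed, estimated):
    """Pearson coefficient of correlation between observed and estimated values."""
    mean_o = sum(observed) / len(observed)
    mean_e = sum(estimated) / len(estimated)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)


def nse(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    spread = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / spread


# Hypothetical DO values in mg/l, for illustration only (not data from the paper).
observed_do = [8.0, 9.0, 10.0, 11.0]
estimated_do = [8.5, 9.5, 10.5, 11.5]

print(rmse(observed_do, estimated_do))
print(mae(observed_do, estimated_do))
print(pearson_r(observed_do, estimated_do))
print(nse(observed_do, estimated_do))
```

In this example the estimates carry a constant offset, so R stays at 1 while NSE drops below 1. That difference is one reason studies like this one tend to report NSE and the error metrics alongside R.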
Pages: 18
Related papers
50 in total
  • [31] Machine Learning-based Dissolved Oxygen Prediction Modeling and Evaluation in the Yangtze River Estuary
    Li, Xiao-Ying
    Wang, Hua
    Wang, Yi-Qing
    Zhang, Liang-Jing
    Wu, Yi
    Huanjing Kexue/Environmental Science, 2024, 45 (12): 7123-7133
  • [32] Explainable Machine Learning Models for Swahili News Classification
    Murindanyi, Sudi
    Brian, Yiiki Afedra
    Katumba, Andrew
    Nakatumba-Nabende, Joyce
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2023, 2023: 12-18
  • [33] The coming of age of interpretable and explainable machine learning models
    Lisboa, P. J. G.
    Saralajew, S.
    Vellido, A.
    Fernandez-Domenech, R.
    Villmann, T.
    NEUROCOMPUTING, 2023, 535: 25-39
  • [34] Explainable machine learning models for Medicare fraud detection
    Hancock, John T.
    Bauder, Richard A.
    Wang, Huanjing
    Khoshgoftaar, Taghi M.
    JOURNAL OF BIG DATA, 2023, 10 (01)
  • [35] Modelling the diurnal variation of dissolved oxygen concentration in the River Oona
    Shi, J
    Douglas, R
    Rippey, B
    Jordan, P
    WATER POLLUTION VII: MODELLING, MEASURING AND PREDICTION, 2003, 9: 403-412
  • [36] An Explainable Deep Learning Model for Daily Sea Ice Concentration Forecast
    Li, Yang
    Qiu, Yubao
    Jia, Guoqiang
    Yu, Shuwen
    Zhang, Yixiao
    Huang, Lin
    Lepparanta, Matti
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62: 1-17
  • [37] Assessing the performance of a suite of machine learning models for daily river water temperature prediction
    Zhu, Senlin
    Nyarko, Emmanuel Karlo
    Hadzima-Nyarko, Marijana
    Heddam, Salim
    Wu, Shiqiang
    PEERJ, 2019, 7
  • [38] New Graph-Based and Transformer Deep Learning Models for River Dissolved Oxygen Forecasting
    Rocha, Paulo Alexandre Costa
    Santos, Victor Oliveira
    The, Jesse Van Griensven
    Gharabaghi, Bahram
    ENVIRONMENTS, 2023, 10 (12)
  • [39] Development of Boosted Machine Learning Models for Estimating Daily Reference Evapotranspiration and Comparison with Empirical Approaches
    Mehdizadeh, Saeid
    Mohammadi, Babak
    Quoc Bao Pham
    Duan, Zheng
    WATER, 2021, 13 (24)
  • [40] Multi-step ahead dissolved oxygen concentration prediction based on knowledge guided ensemble learning and explainable artificial intelligence
    Wu, Junhao
    Wang, Zhaocai
    Dong, Jinghan
    Yao, Zhiyuan
    Chen, Xi
    Fan, Heshan
    JOURNAL OF HYDROLOGY, 2024, 636