Application of machine learning and deep learning methods for hydrated electron rate constant prediction

Cited by: 8
Authors
Zheng, Shanshan [1]
Guo, Wanqian [1]
Li, Chao [2]
Sun, Yongbin [3]
Zhao, Qi [1]
Lu, Hao [1]
Si, Qishi [1]
Wang, Huazhe [1]
Affiliations
[1] Harbin Inst Technol, State Key Lab Urban Water Resource & Environm, Harbin 150090, Peoples R China
[2] Northeast Normal Univ, Sch Environm, State Environm Protect Key Lab Wetland Ecol & Vege, 2555 Jingyue St, Changchun 130117, Jilin, Peoples R China
[3] Shandong First Med Univ & Shandong Acad Med Sci, Sch Chem & Pharmaceut Engn, Tai An 271016, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Rate constant prediction; Machine learning; Deep learning; Hydrated electron (e_aq^-); SHAP; Grad-CAM; REDUCTIVE DEFLUORINATION; PERFLUOROOCTANOIC ACID; MODELS; DEGRADATION
DOI
10.1016/j.envres.2023.115996
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Accurately determining the second-order rate constant of organic compounds (OCs) with the hydrated electron e_aq^- (k_eaq-) is crucial in e_aq^- induced advanced reduction processes (ARPs). In this study, we collected 867 k_eaq- values at different pHs from peer-reviewed publications and applied a machine learning (ML) algorithm, XGBoost, and a deep learning (DL) algorithm, the convolutional neural network (CNN), to predict k_eaq-. Our results demonstrated that the CNN model with transfer learning and data augmentation (CNN-TL&DA) substantially improved prediction performance and overcame overfitting. Furthermore, we compared the ML/DL modeling methods and found that CNN-TL&DA, which uses molecular images (MI), achieved the best overall performance (R²_test = 0.896, RMSE_test = 0.362, MAE_test = 0.261) compared with the XGBoost algorithm combined with Mordred descriptors (MD) (R²_test = 0.692, RMSE_test = 0.622, MAE_test = 0.399) and with Morgan fingerprints (MF) (R²_test = 0.512, RMSE_test = 0.783, MAE_test = 0.520). Moreover, interpreting the MD-XGBoost and MF-XGBoost models with the SHAP method revealed the significance of MDs (e.g., molecular size, branching, electron distribution, polarizability, and bond types), MFs (e.g., aromatic carbon, carbonyl oxygen, nitrogen, and halogen), and environmental conditions (e.g., pH) in the k_eaq- prediction. Interpreting the 2D molecular image-CNN (MI-CNN) models with the Grad-CAM method showed that they correctly identified key functional groups that can increase k_eaq- values, such as -CN, -NO2, and -X. Additionally, the MI-CNN model highlighted almost all electron-withdrawing groups and a small fraction of electron-donating groups when estimating k_eaq-. Overall, our results suggest that the CNN approach yields smaller errors than the XGBoost-based ML models, making it a promising candidate for predicting other rate constants.
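The abstract compares models by three test-set metrics: R², RMSE, and MAE. As a minimal sketch of how such metrics are conventionally computed, the Python below uses the standard textbook definitions. It is not the authors' code, and the record does not confirm how they computed them. The function name `regression_metrics` and the sample values are hypothetical. The assumption that rate constants are compared on a log10 scale is also ours and is not stated in this record.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: squared prediction errors
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observed values around their mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Hypothetical values invented for illustration (assumed to be log10 of k)
observed = [8.0, 9.0, 10.0, 9.5]
predicted = [8.5, 9.0, 9.5, 9.5]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

A higher R² together with lower RMSE and MAE on held-out data is the pattern the abstract reports for CNN-TL&DA relative to the two XGBoost models.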
Pages: 8