Application of machine learning and deep learning methods for hydrated electron rate constant prediction

Cited by: 8
Authors
Zheng, Shanshan [1]
Guo, Wanqian [1]
Li, Chao [2]
Sun, Yongbin [3]
Zhao, Qi [1]
Lu, Hao [1]
Si, Qishi [1]
Wang, Huazhe [1]
Affiliations
[1] Harbin Inst Technol, State Key Lab Urban Water Resource & Environm, Harbin 150090, Peoples R China
[2] Northeast Normal Univ, Sch Environm, State Environm Protect Key Lab Wetland Ecol & Vege, 2555 Jingyue St, Changchun 130117, Jilin, Peoples R China
[3] Shandong First Med Univ & Shandong Acad Med Sci, Sch Chem & Pharmaceut Engn, Tai An 271016, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Rate constant prediction; Machine learning; Deep learning; Hydrated electron (e(aq)(-)); SHAP; Grad-CAM; REDUCTIVE DEFLUORINATION; PERFLUOROOCTANOIC ACID; MODELS; DEGRADATION;
DOI
10.1016/j.envres.2023.115996
Chinese Library Classification number
X [Environmental science, safety science];
Discipline classification code
08; 0830;
Abstract
Accurately determining the second-order rate constants of organic compounds (OCs) with e(aq)(-) (k(eaq-)) is crucial in e(aq)(-)-induced advanced reduction processes (ARPs). In this study, we collected 867 k(eaq-) values at different pHs from peer-reviewed publications and applied the machine learning (ML) algorithm XGBoost and the deep learning (DL) algorithm convolutional neural network (CNN) to predict k(eaq-). Our results demonstrated that the CNN model with transfer learning and data augmentation (CNN-TL&DA) greatly improved the prediction results and overcame overfitting. Furthermore, we compared the ML/DL modeling methods and found that CNN-TL&DA, which used molecular images (MI), achieved the best overall performance (R²(test) = 0.896, RMSE(test) = 0.362, MAE(test) = 0.261) when compared with the XGBoost algorithm combined with Mordred descriptors (MD) (R²(test) = 0.692, RMSE(test) = 0.622, MAE(test) = 0.399) and with Morgan fingerprints (MF) (R²(test) = 0.512, RMSE(test) = 0.783, MAE(test) = 0.520). Moreover, interpreting the MD-XGBoost and MF-XGBoost models with the SHAP method revealed the significance of MDs (e.g., molecular size, branching, electron distribution, polarizability, and bond types), MFs (e.g., aromatic carbon, carbonyl oxygen, nitrogen, and halogen), and environmental conditions (e.g., pH) in the k(eaq-) prediction. Interpreting the 2D molecular image-CNN (MI-CNN) models with the Grad-CAM method showed that they correctly identified key functional groups, such as -CN, -NO2, and -X, that can increase k(eaq-) values. Additionally, the MI-CNN model highlighted almost all electron-withdrawing groups and a small fraction of electron-donating groups when estimating k(eaq-). Overall, our results suggest that the CNN approach has smaller errors than the ML algorithms, making it a promising candidate for predicting other rate constants.
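As a rough illustration of the three test-set metrics quoted in the abstract (R², RMSE, MAE), the sketch below computes them for paired observed and predicted values. It is not the authors' code: the function name `regression_metrics` is hypothetical, the sample numbers are invented for demonstration, and treating the targets as log10-scaled rate constants is an assumption that the abstract does not confirm.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed/predicted values.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual sum of squares and total sum of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot                                   # coefficient of determination
    rmse = math.sqrt(ss_res / n)                                 # root-mean-square error
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n    # mean absolute error
    return r2, rmse, mae


# Invented example values (assumed to be on a log10 scale).
observed = [8.0, 9.0, 10.0, 9.5]
predicted = [8.5, 9.0, 9.5, 9.5]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

A higher R² together with lower RMSE and MAE on held-out data is the sense in which the abstract ranks CNN-TL&DA above the two XGBoost variants.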
Pages: 8
Related papers
50 in total
  • [41] Dropout prediction in Moocs using deep learning and machine learning
    Ram B. Basnet
    Clayton Johnson
    Tenzin Doleck
    Education and Information Technologies, 2022, 27 : 11499 - 11513
  • [42] Application of Machine Learning in the Prediction of Hypothyreoidism
    Helac, Hanna
    Kamenjas, Edina
    Hodzic, Nejira
    MEDICON 2023 AND CMBEBIH 2023, VOL 2, 2024, 94 : 756 - 761
  • [43] Machine learning and deep learning methods for wireless network applications
    Abel C. H. Chen
    Wen-Kang Jia
    Feng-Jang Hwang
    Genggeng Liu
    Fangying Song
    Lianrong Pu
    EURASIP Journal on Wireless Communications and Networking, 2022
  • [44] Application of Machine Learning to the Prediction of WBGT
    Lu, Chang
    Yun, Yeboon
    Yoon, Min
    2021 60TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2021, : 3 - 8
  • [45] Machine learning and deep learning methods for wireless network applications
    Chen, Abel C. H.
    Jia, Wen-Kang
    Hwang, Feng-Jang
    Liu, Genggeng
    Song, Fangying
    Pu, Lianrong
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2022, 2022 (01)
  • [46] Machine learning methods applied to drilling rate of penetration prediction and optimization - A review
    Barbosa, Luis Felipe F. M.
    Nascimento, Andreas
    Mathias, Mauro Hugo
    de Carvalho, Joao Andrade, Jr.
    JOURNAL OF PETROLEUM SCIENCE AND ENGINEERING, 2019, 183
  • [47] Machine Learning Methods for Neonatal Heart Rate Prediction using Respiratory Signals
    Yusran, Maharaj Faawwaz A.
    Azzman, Tengku Ahmad Naim Tengku Mohd
    Saw, Shier Nee
    Hasan, Zati Hakim Azizul
    2023 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP, SSP, 2023, : 334 - 338
  • [48] Prediction of Pipe Failure Rate in Heating Networks Using Machine Learning Methods
    Beloev, Hristo Ivanov
    Saitov, Stanislav Radikovich
    Filimonova, Antonina Andreevna
    Chichirova, Natalia Dmitrievna
    Babikov, Oleg Evgenievich
    Iliev, Iliya Krastev
    ENERGIES, 2024, 17 (14)
  • [49] Application of Machine Learning Methods in Bioinformatics
    Yang, Haoyu
    An, Zheng
    Zhou, Haotian
    Hou, Yawen
    6TH INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN, MANUFACTURING, MODELING AND SIMULATION (CDMMS 2018), 2018, 1967
  • [50] APPLICATION OF MACHINE LEARNING METHODS IN BIOINFORMATICS
    Wu, S. F.
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2016, 118 : 35 - 35