Development of models predicting biodegradation rate rating with multiple linear regression and support vector machine algorithms

Cited by: 45
Authors
Tang, Weihao [1]
Li, Yanying [1]
Yu, Yang [2]
Wang, Zhongyu [1]
Xu, Tong [1]
Chen, Jingwen [1]
Lin, Jun [2]
Li, Xuehua [1]
Affiliations
[1] Dalian Univ Technol, Sch Environm Sci & Technol, Key Lab Ind Ecol & Environm Engn MOE, Dalian 116024, Peoples R China
[2] Minist Ecol & Environm MEE, Solid Waste & Chem Management Ctr, Beijing 100029, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Biodegradability; Quantitative structure-activity relationship; Multiple linear regression; Support vector machine; Molecular structure descriptors; AEROBIC BIODEGRADATION; READY BIODEGRADABILITY; BIOACCUMULATIVE ORGANICS; CHEMICALS; PERSISTENT; QSAR; POLLUTANTS;
DOI
10.1016/j.chemosphere.2020.126666
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification codes
08; 0830
Abstract
Biodegradation is a major process for removing organic chemicals from water, soil and sediment, so biodegradability is critical for evaluating the environmental persistence of organic chemicals. In this study, based on a dataset of 171 compounds, four quantitative structure-activity relationship (QSAR) models were developed for predicting primary and ultimate biodegradation rate ratings with multiple linear regression (MLR) and support vector machine (SVM) algorithms. Two MLR models were built on the subset of compounds with carbon atom number ≤ 9, and two SVM models were built on the subset with carbon atom number > 9. In the MLR models, nArX (number of X on aromatic ring) is the most important descriptor governing primary and ultimate biodegradation of organic chemicals. For the SVM models, the determination coefficient (R²), the leave-one-out cross-validated coefficient (Q²_LOO) and the external validation coefficient (Q²_ext) all exceed 0.9, indicating that the SVM models have satisfactory goodness-of-fit, robustness and external predictive ability. The applicability domains of the models were visualized with Williams plots. The developed models can be used as effective tools to predict the biodegradability of organic chemicals. © 2020 Elsevier Ltd. All rights reserved.
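The validation statistics named in the abstract (R², Q²_LOO, Q²_ext) and the leverage values behind a Williams plot can be illustrated with a minimal sketch. It runs on synthetic data, not the study's 171 compounds, and fits only an ordinary least-squares MLR model; the SVM models are not reproduced. The helper names (`fit_mlr`, `q2_loo`, `leverages`), the use of the training-set mean in Q²_ext, and the warning leverage h* = 3(p + 1)/n are common QSAR conventions assumed here, not details confirmed by the abstract.

```python
import numpy as np

def design(X):
    """Prepend an intercept column to the descriptor matrix."""
    return np.column_stack([np.ones(len(X)), X])

def fit_mlr(X, y):
    """Ordinary least-squares MLR; returns coefficients, intercept first."""
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return coef

def predict(coef, X):
    return design(X) @ coef

def r2(y, y_hat, ref_mean):
    """1 - SS_res / SS_tot, with SS_tot taken around ref_mean."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - ref_mean) ** 2)

def q2_loo(X, y):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def leverages(X_train, X_query):
    """h_i = x_i^T (X^T X)^-1 x_i, using the training design matrix."""
    A, B = design(X_train), design(X_query)
    G = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", B, G, B)

# Synthetic stand-in data: 40 "compounds", 2 hypothetical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = 3.0 - 0.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=40)
X_tr, y_tr, X_ex, y_ex = X[:30], y[:30], X[30:], y[30:]

coef = fit_mlr(X_tr, y_tr)
r2_fit = r2(y_tr, predict(coef, X_tr), y_tr.mean())   # goodness-of-fit
q2_cv = q2_loo(X_tr, y_tr)                            # robustness
q2_ext = r2(y_ex, predict(coef, X_ex), y_tr.mean())   # external predictivity

# Williams plot coordinates: leverage (x axis) vs. scaled residual (y axis).
h_tr = leverages(X_tr, X_tr)
h_ex = leverages(X_tr, X_ex)
h_star = 3 * (X_tr.shape[1] + 1) / len(X_tr)          # assumed warning leverage
resid = y_tr - predict(coef, X_tr)
s = np.sqrt(np.sum(resid ** 2) / (len(y_tr) - X_tr.shape[1] - 1))
std_resid = resid / s                                 # residuals scaled by the RMSE
in_domain = h_ex <= h_star                            # external compounds inside the domain

print(f"R2={r2_fit:.3f}  Q2_LOO={q2_cv:.3f}  Q2_ext={q2_ext:.3f}")
print(f"h*={h_star:.2f}; external compounds in domain: {in_domain.sum()}/{len(h_ex)}")
```

Plotting `std_resid` against `h_tr` with a vertical line at `h_star` gives the usual Williams plot layout; compounds to the right of the line are structurally far from the training set, so their predictions are extrapolations.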
Pages: 7