Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity

Cited by: 125
Authors
Polishchuk, Pavel G. [1 ]
Muratov, Eugene N. [1 ,2 ]
Artemenko, Anatoly G. [1 ]
Kolumbin, Oleg G. [3 ]
Muratov, Nail N. [4 ]
Kuz'min, Victor E. [1 ]
Affiliations
[1] AV Bogatsky Phys Chem Inst NAS Ukraine, Lab Theoret Chem, UA-65080 Odessa, Ukraine
[2] Univ N Carolina, Sch Pharm, Lab Mol Modeling, Chapel Hill, NC 27599 USA
[3] Pridnestrovskij State Univ, Dept Chem, MD-3300 Tiraspol, Moldova
[4] Odessa Natl Polytech Univ, Dept Chem Technol, UA-65000 Odessa, Ukraine
Keywords
QUANTITATIVE STRUCTURE; VARIABLE SELECTION; SIMPLEX REPRESENTATION; APPLICABILITY DOMAIN; MODELS; PLS; NITROAROMATICS; DERIVATIVES; TECHNOLOGY; REGRESSION;
DOI
10.1021/ci900203n
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
This work applies the random forest (RF) approach to QSAR analysis of the aquatic toxicity of chemical compounds tested on Tetrahymena pyriformis. The simplex representation of molecular structure approach, implemented in HiT QSAR Software, was used to generate descriptors at the two-dimensional level. Adequate models based on simplex descriptors and the RF statistical approach were obtained on a modeling set of 644 compounds. Model predictivity was validated on two external test sets of 339 and 110 compounds. Lipophilicity and polarizability of the investigated compounds were found to have a high impact on toxicity. The RF models were shown to be tolerant of the insertion of irrelevant descriptors and of the randomization of part of the toxicity values, which represented "noise". A fast procedure for optimizing the number of trees in the random forest is proposed. The discussed RF model had comparable or better statistical characteristics than the corresponding PLS or KNN models.
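The abstract does not say how the number of trees was optimized. One common way to do it, shown here only as an illustrative sketch and not as the authors' procedure, is to track the out-of-bag (OOB) error while trees are added and stop where the curve flattens. Everything in the block is an assumption made for illustration: the synthetic data (two informative and two irrelevant descriptors), the helper names `build_tree`, `oob_error_curve`, and `pick_n_trees`, and the 5% tolerance.

```python
import random

def build_tree(X, y, idx, mtry, rng, depth=0, max_depth=3, min_leaf=3):
    """Grow a regression tree on rows idx, trying mtry random features per split."""
    mean = sum(y[i] for i in idx) / len(idx)
    if depth >= max_depth or len(idx) < 2 * min_leaf:
        return mean  # leaf: predict the mean response
    best = None
    for f in rng.sample(range(len(X[0])), mtry):
        order = sorted(idx, key=lambda i: X[i][f])
        for k in range(min_leaf, len(order) - min_leaf + 1):
            lo, hi = X[order[k - 1]][f], X[order[k]][f]
            if lo == hi:
                continue  # cannot split between equal values
            left, right = order[:k], order[k:]
            ml = sum(y[i] for i in left) / len(left)
            mr = sum(y[i] for i in right) / len(right)
            sse = (sum((y[i] - ml) ** 2 for i in left)
                   + sum((y[i] - mr) ** 2 for i in right))
            if best is None or sse < best[0]:
                best = (sse, f, (lo + hi) / 2, left, right)
    if best is None:
        return mean
    _, f, thr, left, right = best
    return (f, thr,
            build_tree(X, y, left, mtry, rng, depth + 1, max_depth, min_leaf),
            build_tree(X, y, right, mtry, rng, depth + 1, max_depth, min_leaf))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if x[f] <= thr else right
    return tree

def oob_error_curve(X, y, n_trees, mtry, seed=0):
    """OOB mean squared error after 1, 2, ..., n_trees trees."""
    rng = random.Random(seed)
    n = len(X)
    sums, counts, curve = [0.0] * n, [0] * n, []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        tree = build_tree(X, y, boot, mtry, rng)
        in_bag = set(boot)
        for i in range(n):
            if i not in in_bag:  # only out-of-bag rows are scored
                sums[i] += predict(tree, X[i])
                counts[i] += 1
        scored = [i for i in range(n) if counts[i]]
        if scored:
            mse = sum((y[i] - sums[i] / counts[i]) ** 2 for i in scored) / len(scored)
        else:
            mse = float("inf")
        curve.append(mse)
    return curve

def pick_n_trees(curve, tol=0.05):
    """Smallest tree count after which OOB error stays within tol of the final value."""
    target = curve[-1] * (1 + tol)
    k = len(curve)
    while k > 1 and curve[k - 2] <= target:
        k -= 1
    return k

# Synthetic stand-in for a descriptor matrix: columns 0 and 1 drive the
# response, columns 2 and 3 are irrelevant (hypothetical data, not the paper's).
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(4)] for _ in range(80)]
y = [2.0 * row[0] + row[1] + data_rng.gauss(0, 0.1) for row in X]

curve = oob_error_curve(X, y, n_trees=40, mtry=2)
best = pick_n_trees(curve)
print("suggested number of trees:", best, "final OOB MSE:", round(curve[-1], 4))
```

Because each tree's OOB predictions are accumulated as the forest grows, the whole error curve comes from a single training pass rather than one refit per candidate forest size.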
Pages: 2481-2488
Number of pages: 8