Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity

Cited by: 125
Authors
Polishchuk, Pavel G. [1]
Muratov, Eugene N. [1, 2]
Artemenko, Anatoly G. [1]
Kolumbin, Oleg G. [3]
Muratov, Nail N. [4]
Kuz'min, Victor E. [1]
Affiliations
[1] AV Bogatsky Phys Chem Inst NAS Ukraine, Lab Theoret Chem, UA-65080 Odessa, Ukraine
[2] Univ N Carolina, Sch Pharm, Lab Mol Modeling, Chapel Hill, NC 27599 USA
[3] Pridnestrovskij State Univ, Dept Chem, MD-3300 Tiraspol, Moldova
[4] Odessa Natl Polytech Univ, Dept Chem Technol, UA-65000 Odessa, Ukraine
Keywords
QUANTITATIVE STRUCTURE; VARIABLE SELECTION; SIMPLEX REPRESENTATION; APPLICABILITY DOMAIN; MODELS; PLS; NITROAROMATICS; DERIVATIVES; TECHNOLOGY; REGRESSION
DOI
10.1021/ci900203n
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
This work applies the random forest (RF) approach to QSAR analysis of the aquatic toxicity of chemical compounds tested on Tetrahymena pyriformis. The simplex representation of molecular structure, implemented in HiT QSAR software, was used to generate descriptors at the two-dimensional level. Adequate models based on simplex descriptors and the RF statistical approach were obtained on a modeling set of 644 compounds. Model predictivity was validated on two external test sets of 339 and 110 compounds. Lipophilicity and polarizability of the investigated compounds were found to have a high impact on toxicity. The RF models were shown to tolerate both the insertion of irrelevant descriptors and the randomization of part of the toxicity values, which represented "noise". A fast procedure for optimizing the number of trees in the random forest is proposed. The discussed RF model had comparable or better statistical characteristics than the corresponding PLS or KNN models.
Pages: 2481-2488
Number of pages: 8