external validation of QSAR models;
classification and category response variables;
p-values and correct classification rate as alternative criteria of prediction accuracy;
BINARY QSAR;
COMBINATORIAL QSAR;
CLASSIFICATION;
VALIDATION;
CHEMINFORMATICS;
PREDICTION;
DATASETS;
BINDERS;
BINDING;
VERIFY;
DOI:
10.1002/minf.201800152
Chinese Library Classification number:
R914 [Medicinal Chemistry];
Discipline classification code:
100701;
Abstract:
The goal of this manuscript is to discuss important aspects of external validation of classification and category Quantitative Structure-Activity/Property/Toxicity Relationship (QS/A/P/T/R) models that, to the best of the authors' knowledge, have not been addressed in the literature. Statistical significance (in terms of p-value) and accuracy of prediction (in terms of Correct Classification Rate (CCR)) for external validation set compounds are among the most important characteristics of such models. We assert that in most cases models built for a classification or category response variable should be statistically significant and predictive for each class or category. We show that three thresholds for the number of compounds in each class or category of the external validation set should be satisfied. 1) The p-value criterion can never be satisfied if the number of compounds is below the first threshold. 2) If the number of compounds is between the first and second thresholds, the p-value criterion should be used. 3) If the number of compounds is higher than the third threshold, the classification or category accuracy criterion should be used. 4) If the number of compounds is between the second and third thresholds, one criterion or the other should be used, depending on the p-value. 5) When the number of compounds in the class approaches infinity, the maximum relative error of prediction approaches the relative expected error. The results are also of interest in other areas of multidimensional data analysis.
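The two criteria named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the p-value is taken as a one-sided binomial tail probability under random assignment with chance probability `p_chance`, CCR is taken as the mean of per-class accuracies, and the function names and the significance level `alpha = 0.05` are hypothetical. Under these assumptions, `min_class_size` shows why a first threshold exists: below a certain class size even a perfect prediction cannot reach significance.

```python
from math import comb


def binomial_p_value(n_correct, n_total, p_chance=0.5):
    """One-sided P(X >= n_correct) if each compound were assigned at random.

    Assumption: a binomial null model; the paper may define the p-value differently.
    """
    return sum(
        comb(n_total, k) * p_chance**k * (1 - p_chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


def min_class_size(alpha=0.05, p_chance=0.5):
    """Smallest class size for which a perfect prediction gives p < alpha."""
    n = 1
    while p_chance**n >= alpha:
        n += 1
    return n


def correct_classification_rate(per_class_correct, per_class_total):
    """CCR taken here as the mean of per-class accuracies (an assumption)."""
    rates = [c / t for c, t in zip(per_class_correct, per_class_total)]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # With 4 compounds in a class, even 4/4 correct is not significant at 0.05.
    print(binomial_p_value(4, 4))
    print(min_class_size(alpha=0.05, p_chance=0.5))
    # Two classes of 10 compounds each, 8 and 6 predicted correctly.
    print(correct_classification_rate([8, 6], [10, 10]))
```

With a binary response and `p_chance = 0.5`, a class of four compounds gives a p-value of 0.0625 even when all four are predicted correctly, so under this null model at least five compounds per class are needed before the p-value criterion can be met at the 0.05 level.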