Comparing classification models - a practical tutorial

Cited by: 6
Authors
Walters, W. Patrick [1]
Affiliations
[1] Relay Therapeut, 399 Binney St, Cambridge, MA 02141 USA
Keywords
QSAR; Classification model; Statistical validation; Machine learning; Tutorial
DOI
10.1007/s10822-021-00417-2
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
While machine learning models have become a mainstay in cheminformatics, the field has yet to agree on standards for model evaluation and comparison. In many cases, authors compare methods by performing multiple folds of cross-validation and reporting the mean value for an evaluation metric such as the area under the receiver operating characteristic curve. These comparisons of mean values often lack statistical rigor and can lead to inaccurate conclusions. In the interest of encouraging best practices, this tutorial provides an example of how multiple methods can be compared in a statistically rigorous fashion.
Pages: 381-389
Number of pages: 9