Evaluation and Optimization Methods for Applicability Domain Methods and Their Hyperparameters, Considering the Prediction Performance of Machine Learning Models

Cited by: 3
Authors
Kaneko, Hiromasa [1]
Affiliations
[1] Meiji Univ, Sch Sci & Technol, Dept Appl Chem, Kawasaki, Kanagawa 2148571, Japan
Source
ACS OMEGA | 2024 / Vol. 9 / Issue 10
Keywords
CRITICAL-TEMPERATURE; DATA SET; QSAR; POINT;
DOI
10.1021/acsomega.3c08036
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
In molecular, material, and process design and control, the applicability domain (AD) of a mathematical model y = f(x) between properties or activities y and features x is constructed. As there are multiple AD methods, each with its own set of hyperparameters, it is necessary to select an appropriate AD method and hyperparameters for each data set and mathematical model. However, there has been no method for optimizing the AD model. This study proposes a method for evaluating and optimizing the AD model for each data set and mathematical model. Using the predictions of double cross-validation with all samples, the relationship between coverage and root-mean-squared error (RMSE) was calculated for all combinations of AD methods and their hyperparameters, and the area under the coverage and RMSE curve (AUCR) was calculated for each combination. The AD model with the lowest AUCR value was selected as the optimal one for the mathematical model. The proposed method was validated using eight data sets, including molecules, materials, and spectra, demonstrating that it could generate optimal AD models for all data sets. The Python code for the proposed method is available at https://github.com/hkaneko1985/dcekit.
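The coverage and RMSE relationship described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation and does not use dcekit; the function names `coverage_rmse_curve` and `aucr`, the convention that a larger AD score means a sample lies more firmly inside the AD, the trapezoidal integration, and the toy numbers are all assumptions made for illustration. In practice the predictions would come from double cross-validation rather than being typed in by hand.

```python
import numpy as np


def coverage_rmse_curve(y_true, y_pred, ad_score):
    """Return (coverage, rmse) as samples are added from most to least reliable.

    Assumption of this sketch: a larger ad_score means the sample is
    more firmly inside the applicability domain.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Sort samples so the most reliable (highest score) come first.
    order = np.argsort(-np.asarray(ad_score, dtype=float), kind="stable")
    squared_error = (y_true[order] - y_pred[order]) ** 2
    n_included = np.arange(1, len(squared_error) + 1)
    # RMSE over the first k samples, for k = 1 .. N.
    rmse = np.sqrt(np.cumsum(squared_error) / n_included)
    # Coverage is the fraction of samples treated as inside the AD.
    coverage = n_included / len(squared_error)
    return coverage, rmse


def aucr(coverage, rmse):
    """Area under the coverage-RMSE curve (trapezoidal rule, an assumption)."""
    widths = np.diff(coverage)
    heights = (rmse[1:] + rmse[:-1]) / 2.0
    return float(np.sum(widths * heights))


# Toy data (hypothetical): two samples are predicted well, two poorly.
y_true = [0.0, 0.0, 0.0, 0.0]
y_pred = [0.1, 0.2, 1.0, 2.0]

# A useful AD score ranks the well-predicted samples first ...
cov_good, rmse_good = coverage_rmse_curve(y_true, y_pred, [4, 3, 2, 1])
# ... while a misleading AD score ranks them last.
cov_bad, rmse_bad = coverage_rmse_curve(y_true, y_pred, [1, 2, 3, 4])

aucr_good = aucr(cov_good, rmse_good)
aucr_bad = aucr(cov_bad, rmse_bad)
print(aucr_good, aucr_bad)
```

Both curves end at the same RMSE at full coverage, since all samples are included there; the AD score that ranks accurate predictions first keeps RMSE low at small coverage and so yields the smaller AUCR. Repeating this for every candidate AD method and hyperparameter setting and keeping the one with the lowest AUCR mirrors the selection rule stated in the abstract.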
Pages: 11453-11458
Page count: 6