Optimal design of experiments in the context of machine-learning inter-atomic potentials: improving the efficiency and transferability of kernel based methods

Times cited: 0
Authors
Barzdajn, Bartosz [1]
Race, Christopher [2]
Affiliations
[1] Univ Manchester, Oxford Rd, Manchester M13 9PL, England
[2] Univ Sheffield, Western Bank, Sheffield S10 2TN, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
interatomic potentials; machine learning; optimal design; material science; GAP; TOTAL-ENERGY CALCULATIONS
DOI
10.1088/1361-651X/ada050
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Data-driven machine learning (ML) models of atomistic interactions are often based on flexible and non-physical functions that can relate nuanced aspects of atomic arrangements to predictions of energies and forces. As a result, these potentials are only as good as the training data (usually the results of so-called ab initio simulations), and we need to ensure that we have enough information to make a model sufficiently accurate, reliable and transferable. The main challenge stems from the fact that descriptors of chemical environments are often sparse, high-dimensional objects without a well-defined continuous metric. Therefore, it is rather unlikely that any ad hoc method for selecting training examples will be indiscriminate, and it is easy to fall into the trap of confirmation bias, where the same narrow and biased sampling is used to generate training and test sets. We will show that an approach derived from classical concepts of statistical planning of experiments and optimal design can help to mitigate such problems at a relatively low computational cost. The key feature of the method we will investigate is that it allows us to assess the quality of the data without obtaining reference energies and forces, a so-called offline approach. In other words, we are focusing on an approach that is easy to implement and does not require sophisticated frameworks that involve automated access to high-performance computing.
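The offline idea described in the abstract, ranking candidate configurations from their descriptors alone before any reference energies or forces are computed, can be illustrated with a generic D-optimal selection. The sketch below is not the authors' algorithm: it assumes plain descriptor vectors, a squared-exponential kernel standing in for whatever kernel the potential uses, and a greedy log-determinant criterion (equivalent to pivoted Cholesky). The function names `rbf_kernel` and `greedy_d_optimal` and the toy data are hypothetical.

```python
import numpy as np


def rbf_kernel(X, length_scale=1.0):
    """Squared-exponential kernel matrix for descriptor rows X (an assumed stand-in kernel)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)


def greedy_d_optimal(K, n_select, jitter=1e-10):
    """Greedily pick indices that maximise log det K[S, S].

    Each step selects the candidate with the largest variance conditional
    on the points already chosen, which is the pivoted Cholesky pivot rule.
    No labels (energies, forces) are needed, only the kernel matrix.
    """
    n = K.shape[0]
    n_select = min(n_select, n)
    K = K + jitter * np.eye(n)           # small regularisation for stability
    resid = K.diagonal().copy()          # conditional variances
    L = np.zeros((n, n_select))
    selected = []
    for j in range(n_select):
        i = int(np.argmax(resid))
        if resid[i] <= jitter:           # remaining candidates add no information
            break
        selected.append(i)
        L[:, j] = (K[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(resid[i])
        resid -= L[:, j] ** 2
        resid[selected] = -np.inf        # never pick the same point twice
    return selected


# Toy pool: 50 near-duplicate descriptors plus 3 isolated ones (indices 50-52).
rng = np.random.default_rng(0)
cluster = 0.1 * rng.standard_normal((50, 2))
isolated = np.array([[10.0, 0.0], [0.0, 10.0], [-10.0, -10.0]])
descriptors = np.vstack([cluster, isolated])

K = rbf_kernel(descriptors, length_scale=1.0)
selected = greedy_d_optimal(K, n_select=4)
print(selected)
```

In this toy pool, a random draw of four candidates would almost always land in the dense cluster, whereas the greedy criterion spends three of its four picks on the isolated descriptors. That is the kind of sampling bias the abstract warns about, shown here only for a generic kernel and not for a real atomic-environment descriptor.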
Pages: 20