Optimal design of experiments in the context of machine-learning inter-atomic potentials: improving the efficiency and transferability of kernel based methods

Cited by: 0

Authors:

Barzdajn, Bartosz ^{[1]}

Race, Christopher ^{[2]}

Affiliations:

[1] Univ Manchester, Oxford Rd, Manchester M13 9PL, England

[2] Univ Sheffield, Western Bank, Sheffield S10 2TN, England

Source:

MODELLING AND SIMULATION IN MATERIALS SCIENCE AND ENGINEERING | 2025 / Vol. 33 / No. 02

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

interatomic potentials; machine learning; optimal design; material science; GAP; TOTAL-ENERGY CALCULATIONS;

DOI:

10.1088/1361-651X/ada050

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Data-driven machine learning (ML) models of atomistic interactions are often based on flexible and non-physical functions that can relate nuanced aspects of atomic arrangements to predictions of energies and forces. As a result, these potentials are only as good as the training data (usually the results of so-called ab initio simulations), and we need to ensure that we have enough information to make a model sufficiently accurate, reliable and transferable. The main challenge stems from the fact that descriptors of chemical environments are often sparse, high-dimensional objects without a well-defined continuous metric. Therefore, it is rather unlikely that any ad hoc method for selecting training examples will be indiscriminate, and it is easy to fall into the trap of confirmation bias, where the same narrow and biased sampling is used to generate training and test sets. We will show that an approach derived from classical concepts of statistical planning of experiments and optimal design can help to mitigate such problems at a relatively low computational cost. The key feature of the method we will investigate is that it allows us to assess the quality of the data without obtaining reference energies and forces: a so-called offline approach. In other words, we are focusing on an approach that is easy to implement and does not require sophisticated frameworks that involve automated access to high performance computing.


Pages: 20
