Efficient Exploration of Chemical Compound Space Using Active Learning for Prediction of Thermodynamic Properties of Alkane Molecules

Cited by: 1
Authors
Xiang, Yan [1]
Tang, Yu-Hang [2,3]
Gong, Zheng [1]
Liu, Hongyi [1]
Wu, Liang [1]
Lin, Guang [4,5]
Sun, Huai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Chem & Chem Engn, Shanghai 200240, Peoples R China
[2] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA 94720 USA
[3] Nvidia Corp, Santa Clara, CA 95051 USA
[4] Purdue Univ, Dept Math, W Lafayette, IN 47907 USA
[5] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
Funding
National Natural Science Foundation of China;
Keywords
ACCELERATED DISCOVERY; DENSITY;
DOI
10.1021/acs.jcim.3c01430
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
We introduce an exploratory active learning (AL) algorithm using Gaussian process regression and marginalized graph kernel (GPR-MGK) to sample chemical compound space (CCS) at minimal cost. Targeting 251,728 enumerated alkane molecules with 4-19 carbon atoms, we applied the AL algorithm to select a diverse and representative set of molecules and then conducted high-throughput molecular simulations on these selected molecules. To demonstrate the power of the AL algorithm, we built directed message-passing neural networks (D-MPNN) using simulation data as the training set to predict liquid densities, heat capacities, and vaporization enthalpies of the CCS. Validations show that D-MPNN models built on the smallest training set considered in this work, which consists of 313 molecules or 0.124% of the original CCS, predict the properties with R² > 0.99 against the computational data and R² > 0.94 against the experimental data. The advantage of the presented AL algorithm is that the predicted uncertainty of GPR depends only on the molecular structures, which renders it compatible with high-throughput data generation.
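The abstract's closing point, that the GPR predictive uncertainty depends only on the molecular structures, can be illustrated with a minimal sketch. This is not the authors' implementation: an RBF kernel over hypothetical numeric feature vectors stands in for the marginalized graph kernel, and the function names (`rbf_kernel`, `posterior_variance`, `select_by_uncertainty`) and the greedy "pick the most uncertain candidate" rule are assumptions made for illustration. Note that no property labels appear anywhere in the selection loop, so the whole set can be chosen before any simulation is run.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Stand-in similarity kernel (assumption: NOT the marginalized graph kernel)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def posterior_variance(k_pool_sel, k_sel_sel, prior_var, noise=1e-6):
    """GPR predictive variance k(x,x) - k^T (K + noise*I)^-1 k for every pool item.

    Only kernel values enter this formula; no target values are required.
    """
    chol = np.linalg.cholesky(k_sel_sel + noise * np.eye(k_sel_sel.shape[0]))
    v = np.linalg.solve(chol, k_pool_sel.T)  # shape: (n_selected, n_pool)
    return prior_var - np.sum(v**2, axis=0)


def select_by_uncertainty(features, n_select, start=0, length_scale=1.0):
    """Greedily add the pool item with the largest predictive variance."""
    k_full = rbf_kernel(features, features, length_scale)
    selected = [start]
    while len(selected) < n_select:
        var = posterior_variance(
            k_full[:, selected],
            k_full[np.ix_(selected, selected)],
            np.diag(k_full),
        )
        var[selected] = -np.inf  # never pick an item twice
        selected.append(int(np.argmax(var)))
    return selected
```

On a toy pool such as `np.array([[0.0], [0.1], [5.0]])` with `n_select=2`, the sketch starts from item 0 and then picks item 2, the point least similar to what is already selected. The sampling ratio quoted in the abstract follows from simple arithmetic: `100 * 313 / 251728` is about 0.124 percent.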
Pages: 6515-6524
Page count: 10