Learning conditional policies for crystal design using offline reinforcement learning

Cited: 0
Authors
Govindarajan, Prashant [1 ]
Miret, Santiago [2 ]
Rector-Brooks, Jarrid [3 ]
Phielipp, Mariano [2 ]
Rajendran, Janarthanan [3 ]
Chandar, Sarath [1 ]
Affiliations
[1] Mila Quebec AI Inst, Polytech, Montreal, PQ, Canada
[2] Intel Labs, Hillsboro, OR USA
[3] Univ Montreal, Mila Quebec AI Inst, Montreal, PQ, Canada
Source
DIGITAL DISCOVERY | 2024, Vol. 3, Issue 4
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
10.1039/d4dd00024b
Chinese Library Classification
O6 [Chemistry]
Discipline code
0703
Abstract
Navigating the exponentially large chemical space in search of desirable materials is an extremely challenging task in materials discovery. Recent developments in generative and geometric deep learning have shown promising results in molecule and material discovery, but they often lack evaluation with high-accuracy computational methods. This work aims to design novel and stable crystalline materials conditioned on a desired band gap. To achieve conditional generation, we (1) formulate crystal design as a sequential decision-making problem, create relevant trajectories from high-quality materials data, and use conservative Q-learning to learn a conditional policy from these trajectories, with a reward function that incorporates constraints on energetic and electronic properties obtained directly from density functional theory (DFT) calculations; (2) evaluate the materials generated by the policy using DFT calculations of both energy and band gap; and (3) compare our results against relevant baselines, including behavioral cloning and unconditioned policy learning. Our experiments show that the conditioned policies achieve targeted crystal design and demonstrate the capability to perform crystal discovery evaluated with accurate but computationally expensive DFT calculations. In summary: conservative Q-learning for band-gap-conditioned crystal design with DFT evaluations, where the model is trained on trajectories constructed from crystals in the Materials Project; results indicate promising performance for lower band-gap targets.
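The conservative Q-learning (CQL) regularizer mentioned in the abstract can be illustrated in a tabular toy setting. This is a minimal sketch, not the paper's implementation: the state/action sizes, the synthetic offline dataset, and the coefficients `alpha`, `gamma`, and `lr` are all hypothetical placeholders standing in for trajectories built from Materials Project crystals and DFT-derived rewards.

```python
import numpy as np

# Toy tabular sketch of conservative Q-learning (CQL). All quantities
# below are illustrative; the real work uses crystal-design trajectories
# and DFT-based rewards rather than a random synthetic dataset.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
Q = np.zeros((n_states, n_actions))

# Offline dataset of (state, action, reward, next_state) transitions.
dataset = [
    (rng.integers(n_states), rng.integers(n_actions),
     rng.random(), rng.integers(n_states))
    for _ in range(200)
]

alpha, gamma, lr = 1.0, 0.9, 0.1
for _ in range(100):
    for s, a, r, s2 in dataset:
        # Standard TD error toward the greedy bootstrap target.
        td_target = r + gamma * Q[s2].max()
        bellman_grad = Q[s, a] - td_target
        # CQL penalty: alpha * (logsumexp_a Q(s, a) - Q(s, a_data)).
        # Its gradient pushes all Q(s, .) down by the softmax weights
        # and pushes the in-dataset action's Q-value back up.
        pi = np.exp(Q[s] - Q[s].max())
        pi /= pi.sum()
        Q[s] -= lr * alpha * pi
        Q[s, a] += lr * alpha
        Q[s, a] -= lr * bellman_grad
```

The penalty keeps Q-values for actions unseen in the offline data conservative, which is the property that makes purely offline training from a fixed materials dataset feasible.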
Pages: 769-785 (17 pages)