Learning conditional policies for crystal design using offline reinforcement learning

Cited: 0
Authors
Govindarajan, Prashant [1 ]
Miret, Santiago [2 ]
Rector-Brooks, Jarrid [3 ]
Phielipp, Mariano [2 ]
Rajendran, Janarthanan [3 ]
Chandar, Sarath [1 ]
Affiliations
[1] Mila Quebec AI Inst, Polytech, Montreal, PQ, Canada
[2] Intel Labs, Hillsboro, OR USA
[3] Univ Montreal, Mila Quebec AI Inst, Montreal, PQ, Canada
Source
DIGITAL DISCOVERY | 2024, Vol. 3, No. 4
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
10.1039/d4dd00024b
CLC Classification
O6 [Chemistry];
Discipline Code
0703;
Abstract
Navigating the exponentially large chemical space in search of desirable materials is an extremely challenging task in materials discovery. Recent developments in generative and geometric deep learning have shown promising results in molecule and material discovery but often lack evaluation with high-accuracy computational methods. This work aims to design novel, stable crystalline materials conditioned on a desired band gap. To achieve conditional generation, we (1) formulate crystal design as a sequential decision-making problem, create relevant trajectories based on high-quality materials data, and use conservative Q-learning to learn a conditional policy from these trajectories, with a reward function that incorporates constraints on energetic and electronic properties obtained directly from density functional theory (DFT) calculations; (2) evaluate the generated materials with DFT calculations of both energy and band gap; and (3) compare our results against relevant baselines, including behavioral cloning and unconditioned policy learning. Our experiments show that conditioned policies achieve targeted crystal design and demonstrate the capability to perform crystal discovery evaluated with accurate but computationally expensive DFT calculations. Conservative Q-learning for band-gap-conditioned crystal design with DFT evaluations: the model is trained on trajectories constructed from crystals in the Materials Project, and results indicate promising performance for lower band-gap targets.
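The two ingredients the abstract names, a property-conditioned reward and a conservative Q-learning (CQL) objective, can be sketched in miniature. The snippet below is an illustrative NumPy sketch only: the function names, the linear penalty weights, and the discrete-action Q-table are all assumptions for exposition, not the authors' implementation or the exact reward used in the paper.

```python
import numpy as np

def band_gap_reward(band_gap, target_gap, formation_energy,
                    gap_weight=1.0, energy_weight=0.1):
    """Toy reward combining an electronic and an energetic constraint.

    Highest when the computed band gap matches the target and the
    structure is energetically stable (formation energy <= 0).
    """
    gap_term = -gap_weight * abs(band_gap - target_gap)
    stability_term = -energy_weight * max(formation_energy, 0.0)
    return gap_term + stability_term

def cql_loss(q_values, actions, targets, alpha=1.0):
    """Conservative Q-learning loss for a batch of discrete transitions.

    q_values : (batch, n_actions) array, Q(s, .) for each state
    actions  : (batch,) indices of the actions taken in the dataset
    targets  : (batch,) Bellman targets r + gamma * max_a' Q(s', a')
    alpha    : weight of the conservative penalty
    """
    rows = np.arange(len(actions))
    q_taken = q_values[rows, actions]

    # Standard TD error on the dataset (in-distribution) actions.
    bellman_error = np.mean((q_taken - targets) ** 2)

    # Conservative penalty: push down a soft maximum over all actions
    # while pushing up the Q-values of actions seen in the data, so the
    # policy does not overestimate out-of-distribution actions.
    logsumexp = np.log(np.sum(np.exp(q_values), axis=1))
    penalty = np.mean(logsumexp - q_taken)

    return bellman_error + alpha * penalty
```

The conservative penalty is what makes offline training viable here: since every new Q-value query cannot be checked with a fresh DFT calculation, the learned policy is kept close to the distribution of the Materials Project trajectories.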
Pages: 769-785
Page count: 17
Related Papers
50 records total
  • [31] Offline Reinforcement Learning for Visual Navigation
    Shah, Dhruv
    Bhorkar, Arjun
    Leen, Hrish
    Kostrikov, Ilya
    Rhinehart, Nick
    Levine, Sergey
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 44 - 54
  • [32] Hyperparameter Tuning in Offline Reinforcement Learning
    Tittaferrante, Andrew
    Yassine, Abdulsalam
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 585 - 590
  • [33] Offline reinforcement learning with task hierarchies
    Schwab, Devin
    Ray, Soumya
    Machine Learning, 2017, 106 : 1569 - 1598
  • [34] Offline Reinforcement Learning for Mobile Notifications
    Yuan, Yiping
    Muralidharan, Ajith
    Nandy, Preetam
    Cheng, Miao
    Prabhakar, Prakruthi
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 3614 - 3623
  • [35] A DATASET PERSPECTIVE ON OFFLINE REINFORCEMENT LEARNING
    Schweighofer, Kajetan
    Radler, Andreas
    Dinu, Marius-Constantin
    Hofmarcher, Markus
    Patil, Vihang
    Bitto-Nemling, Angela
    Eghbal-zadeh, Hamid
    Hochreiter, Sepp
    CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 199, 2022, 199
  • [36] Warfarin Dose Management Using Offline Deep Reinforcement Learning
    Ji, Hannah
    Gill, Matthew F.
    Draper, Evan W.
    Liedl, David A.
    Hodge, David O.
    Houghton, Damon E.
    Casanegra, Ana I.
    CIRCULATION, 2023, 148
  • [37] Doubly constrained offline reinforcement learning for learning path recommendation
    Yun, Yue
    Dai, Huan
    An, Rui
    Zhang, Yupei
    Shang, Xuequn
    KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [38] Adaptable Conservative Q-Learning for Offline Reinforcement Learning
    Qiu, Lyn
    Li, Xu
    Liang, Lenghan
    Sun, Mingming
    Yan, Junchi
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 200 - 212
  • [40] Learning Personalized Health Recommendations via Offline Reinforcement Learning
    Preuett, Larry
    PROCEEDINGS OF THE EIGHTEENTH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2024, 2024, : 1355 - 1357