Learning conditional policies for crystal design using offline reinforcement learning

Cited: 0
Authors
Govindarajan, Prashant [1 ]
Miret, Santiago [2 ]
Rector-Brooks, Jarrid [3 ]
Phielipp, Mariano [2 ]
Rajendran, Janarthanan [3 ]
Chandar, Sarath [1 ]
Affiliations
[1] Mila Quebec AI Inst, Polytech, Montreal, PQ, Canada
[2] Intel Labs, Hillsboro, OR USA
[3] Univ Montreal, Mila Quebec AI Inst, Montreal, PQ, Canada
Source
DIGITAL DISCOVERY | 2024, Vol. 3, No. 4
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
10.1039/d4dd00024b
CLC Classification
O6 [Chemistry];
Discipline Code
0703;
Abstract
Navigating the exponentially large chemical space in search of desirable materials is an extremely challenging task in materials discovery. Recent developments in generative and geometric deep learning have shown promising results in molecule and material discovery but often lack evaluation with high-accuracy computational methods. This work aims to design novel, stable crystalline materials conditioned on a desired band gap. To achieve conditional generation, we (1) formulate crystal design as a sequential decision-making problem, create relevant trajectories based on high-quality materials data, and use conservative Q-learning to learn a conditional policy from these trajectories, with a reward function that incorporates constraints on energetic and electronic properties obtained directly from density functional theory (DFT) calculations; (2) evaluate the generated materials with DFT calculations of both energy and band gap; and (3) compare our results against relevant baselines, including behavioral cloning and unconditioned policy learning. Our experiments show that conditioned policies achieve targeted crystal design and demonstrate the capability to perform crystal discovery evaluated with accurate but computationally expensive DFT calculations. Conservative Q-learning for band-gap-conditioned crystal design with DFT evaluations: the model is trained on trajectories constructed from crystals in the Materials Project, and results indicate promising performance for lower band-gap targets.
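The two ingredients the abstract names, a property-conditioned reward and a conservative Q-learning (CQL) objective, can be sketched in miniature. The snippet below is an illustrative NumPy sketch only: the function names, the linear penalty weights, and the discrete-action Q-table are all assumptions for exposition, not the authors' implementation or the exact reward used in the paper.

```python
import numpy as np

def band_gap_reward(band_gap, target_gap, formation_energy,
                    gap_weight=1.0, energy_weight=0.1):
    """Toy reward combining an electronic and an energetic constraint.

    Highest when the computed band gap matches the target and the
    structure is energetically stable (formation energy <= 0).
    """
    gap_term = -gap_weight * abs(band_gap - target_gap)
    stability_term = -energy_weight * max(formation_energy, 0.0)
    return gap_term + stability_term

def cql_loss(q_values, actions, targets, alpha=1.0):
    """Conservative Q-learning loss for a batch of discrete transitions.

    q_values : (batch, n_actions) array, Q(s, .) for each state
    actions  : (batch,) indices of the actions taken in the dataset
    targets  : (batch,) Bellman targets r + gamma * max_a' Q(s', a')
    alpha    : weight of the conservative penalty
    """
    rows = np.arange(len(actions))
    q_taken = q_values[rows, actions]

    # Standard TD error on the dataset (in-distribution) actions.
    bellman_error = np.mean((q_taken - targets) ** 2)

    # Conservative penalty: push down a soft maximum over all actions
    # while pushing up the Q-values of actions seen in the data, so the
    # policy does not overestimate out-of-distribution actions.
    logsumexp = np.log(np.sum(np.exp(q_values), axis=1))
    penalty = np.mean(logsumexp - q_taken)

    return bellman_error + alpha * penalty
```

The conservative penalty is what makes offline training viable here: since every new Q-value query cannot be checked with a fresh DFT calculation, the learned policy is kept close to the distribution of the Materials Project trajectories.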
Pages: 769-785
Page count: 17
Related Papers
50 records total
  • [31] Offline Reinforcement Learning for Visual Navigation
    Shah, Dhruv
    Bhorkar, Arjun
    Leen, Hrish
    Kostrikov, Ilya
    Rhinehart, Nick
    Levine, Sergey
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 44 - 54
  • [32] Hyperparameter Tuning in Offline Reinforcement Learning
    Tittaferrante, Andrew
    Yassine, Abdulsalam
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 585 - 590
  • [33] Offline reinforcement learning with task hierarchies
    Schwab, Devin
    Ray, Soumya
    Machine Learning, 2017, 106 : 1569 - 1598
  • [34] Offline Reinforcement Learning for Mobile Notifications
    Yuan, Yiping
    Muralidharan, Ajith
    Nandy, Preetam
    Cheng, Miao
    Prabhakar, Prakruthi
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 3614 - 3623
  • [35] A DATASET PERSPECTIVE ON OFFLINE REINFORCEMENT LEARNING
    Schweighofer, Kajetan
    Radler, Andreas
    Dinu, Marius-Constantin
    Hofmarcher, Markus
    Patil, Vihang
    Bitto-Nemling, Angela
    Eghbal-zadeh, Hamid
    Hochreiter, Sepp
    CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 199, 2022, 199
  • [36] Warfarin Dose Management Using Offline Deep Reinforcement Learning
    Ji, Hannah
    Gill, Matthew F.
    Draper, Evan W.
    Liedl, David A.
    Hodge, David O.
    Houghton, Damon E.
    Casanegra, Ana I.
    CIRCULATION, 2023, 148
  • [37] Doubly constrained offline reinforcement learning for learning path recommendation
    Yun, Yue
    Dai, Huan
    An, Rui
    Zhang, Yupei
    Shang, Xuequn
    KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [38] Adaptable Conservative Q-Learning for Offline Reinforcement Learning
    Qiu, Lyn
    Li, Xu
    Liang, Lenghan
    Sun, Mingming
    Yan, Junchi
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 200 - 212
  • [40] Learning Personalized Health Recommendations via Offline Reinforcement Learning
    Preuett, Larry
    PROCEEDINGS OF THE EIGHTEENTH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2024, 2024, : 1355 - 1357