Material transformers: deep learning language models for generative materials design

Cited by: 21
Authors
Fu, Nihang [1 ]
Wei, Lai [1 ]
Song, Yuqi [1 ]
Li, Qinyang [1 ]
Xin, Rui [1 ]
Omee, Sadman Sadeed [1 ]
Dong, Rongzhi [1 ]
Siriwardane, Edirisuriya M. Dilanga [2 ]
Hu, Jianjun [1 ]
Affiliations
[1] Univ South Carolina, Dept Comp Sci & Engn, Columbia, SC 29201 USA
[2] Univ Colombo, Dept Phys, Colombo 03, Sri Lanka
Funding
US National Science Foundation
Keywords
deep learning; language models; generative design; materials discovery; transformer; TOTAL-ENERGY CALCULATIONS; WAVE;
DOI
10.1088/2632-2153/acadcd
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Pre-trained transformer language models (LMs) trained on large unlabeled corpora have produced state-of-the-art results in natural language processing, organic molecule design, and protein sequence generation. However, no such models have been applied to learn composition patterns for the generative design of material compositions. Here we train a series of seven modern transformer models (GPT, GPT-2, GPT-Neo, GPT-J, BLMM, BART, and RoBERTa) for materials design using the expanded formulas of the ICSD, OQMD, and Materials Project databases. Six different datasets, with or without non-charge-neutral or non-electronegativity-balanced (EB) samples, are used to benchmark generative design performance and to uncover the biases of modern transformer models in the generative design of material compositions. Our experiments show that materials transformers based on causal LMs can generate chemically valid material compositions, with up to 97.61% being charge neutral and 91.22% electronegativity balanced, a more than sixfold enrichment over the baseline pseudo-random sampling algorithm. Our LMs also demonstrate high generation novelty, and their potential for new materials discovery is demonstrated by their ability to recover held-out materials. We further find that the properties of the generated compositions can be tailored by training the models on selected training sets, such as high-band-gap samples. Our experiments also show that the models differ in the properties of the samples they prefer to generate and vary considerably in running time. Finally, we applied our materials transformers to discover a set of new materials, validated by density functional theory calculations.
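The two validity criteria the abstract benchmarks, charge neutrality and electronegativity balance, and the "expanded formula" representation the models are trained on can be sketched in a few lines of plain Python. This is an illustrative approximation, not the paper's actual pipeline: the oxidation-state and electronegativity tables below cover only a handful of elements (a real check would draw on a full table, e.g. from pymatgen), and `expand_formula` assumes simple reduced formulas without parentheses.

```python
import re
from itertools import product

def expand_formula(formula: str) -> str:
    """Expand a reduced formula into a space-separated element sequence,
    e.g. "SrTiO3" -> "Sr Ti O O O" (the token form used to train the LMs)."""
    parts = []
    for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        parts.extend([el] * (int(n) if n else 1))
    return " ".join(parts)

# Tiny illustrative tables (assumed values for a few elements only).
OXIDATION_STATES = {"Sr": [2], "Ti": [2, 3, 4], "O": [-2], "Na": [1], "Cl": [-1]}
ELECTRONEGATIVITY = {"Sr": 0.95, "Ti": 1.54, "O": 3.44, "Na": 0.93, "Cl": 3.16}

def parse_counts(formula: str) -> dict:
    """Parse a reduced formula into {element: count}."""
    counts = {}
    for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[el] = counts.get(el, 0) + (int(n) if n else 1)
    return counts

def is_charge_neutral(formula: str) -> bool:
    """True if some assignment of common oxidation states sums to zero."""
    counts = parse_counts(formula)
    elems = list(counts)
    for states in product(*(OXIDATION_STATES[e] for e in elems)):
        if sum(s * counts[e] for e, s in zip(elems, states)) == 0:
            return True
    return False

def is_en_balanced(formula: str) -> bool:
    """Crude electronegativity-balance check: for some charge-neutral
    oxidation-state assignment, every cation must be less electronegative
    than every anion."""
    counts = parse_counts(formula)
    elems = list(counts)
    for states in product(*(OXIDATION_STATES[e] for e in elems)):
        if sum(s * counts[e] for e, s in zip(elems, states)) == 0:
            cations = [ELECTRONEGATIVITY[e] for e, s in zip(elems, states) if s > 0]
            anions = [ELECTRONEGATIVITY[e] for e, s in zip(elems, states) if s < 0]
            if cations and anions and max(cations) < min(anions):
                return True
    return False

print(expand_formula("SrTiO3"))     # -> Sr Ti O O O
print(is_charge_neutral("SrTiO3"))  # -> True (Sr2+ Ti4+ 3xO2-)
print(is_en_balanced("NaCl"))       # -> True
```

A generated composition that passes both checks counts toward the 97.61% / 91.22% validity rates reported above; the paper's enrichment baseline generates compositions by pseudo-random element sampling and applies the same two filters.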
Pages: 16