Multi-modal molecule structure-text model for text-based retrieval and editing

Cited by: 26
Authors
Liu, Shengchao [1, 2]
Nie, Weili [3]
Wang, Chengpeng [4]
Lu, Jiarui [1, 2]
Qiao, Zhuoran [5]
Liu, Ling [6]
Tang, Jian [1, 7]
Xiao, Chaowei [3, 8]
Anandkumar, Animashree [3, 5]
Affiliations
[1] Mila Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[2] Univ Montreal, Montreal, PQ, Canada
[3] NVIDIA Res, Santa Clara, CA 95051, USA
[4] Univ Illinois, Champaign, IL USA
[5] CALTECH, Pasadena, CA 91125 USA
[6] Princeton Univ, Princeton, NJ USA
[7] HEC Montreal, Montreal, PQ, Canada
[8] Arizona State Univ, Tempe, AZ USA
Keywords
DRUG; SIMILARITY; DISCOVERY; CHEMISTRY; AREA; ZINC
DOI
10.1038/s42256-023-00759-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There is increasing adoption of artificial intelligence in drug discovery. However, existing studies use machine learning to mainly utilize the chemical structures of molecules but ignore the vast textual knowledge available in chemistry. Incorporating textual knowledge enables us to realize new drug design objectives, adapt to text-based instructions and predict complex biological activities. Here we present a multi-modal molecule structure-text model, MoleculeSTM, by jointly learning molecules' chemical structures and textual descriptions via a contrastive learning strategy. To train MoleculeSTM, we construct a large multi-modal dataset, namely, PubChemSTM, with over 280,000 chemical structure-text pairs. To demonstrate the effectiveness and utility of MoleculeSTM, we design two challenging zero-shot tasks based on text instructions, including structure-text retrieval and molecule editing. MoleculeSTM has two main properties: open vocabulary and compositionality via natural language. In experiments, MoleculeSTM obtains the state-of-the-art generalization ability to novel biochemical concepts across various benchmarks.

Machine learning methods in cheminformatics have made great progress in using chemical structures of molecules, but a large portion of textual information remains scarcely explored. Liu and colleagues trained MoleculeSTM, a foundation model that aligns the structure and text modalities through contrastive learning, and show its utility on the downstream tasks of structure-text retrieval, text-guided editing and molecular property prediction.
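
The abstract names contrastive learning over structure-text pairs and zero-shot structure-text retrieval without giving details. The sketch below is one common way such an objective is written (a symmetric InfoNCE-style loss over cosine similarities), followed by nearest-neighbour retrieval. It is an illustration only and is not taken from the MoleculeSTM code: the function names, the toy 2-dimensional embeddings and the temperature value of 0.1 are all assumptions, and real embeddings would come from trained structure and text encoders.

```python
import math

# Illustrative sketch only; names and values are assumptions, not the paper's code.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def _diagonal_cross_entropy(logits):
    """Mean of -log softmax(row)[i] over rows, treating pair (i, i) as the positive."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(struct_embs, text_embs, temperature=0.1):
    """Average of structure-to-text and text-to-structure contrastive losses.

    struct_embs[i] and text_embs[i] are assumed to describe the same molecule.
    """
    logits = [[cosine(s, t) / temperature for t in text_embs] for s in struct_embs]
    logits_t = [list(col) for col in zip(*logits)]  # transpose for the other direction
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t))

def retrieve(query_emb, candidate_embs):
    """Index of the candidate with the highest cosine similarity to the query."""
    scores = [cosine(query_emb, c) for c in candidate_embs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: two molecules, embeddings invented for illustration.
structures = [[1.0, 0.0], [0.0, 1.0]]
texts_aligned = [[1.0, 0.0], [0.0, 1.0]]
texts_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = symmetric_contrastive_loss(structures, texts_aligned)
loss_swapped = symmetric_contrastive_loss(structures, texts_swapped)
best_match = retrieve([0.9, 0.1], texts_swapped)
```

In this toy setting the loss is small when each structure embedding points the same way as its paired text embedding and large when the pairs are swapped, which is the behaviour a contrastive objective is meant to reward. Retrieval then reduces to ranking candidates by similarity to the query embedding.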
Pages: 1447-1457
Number of pages: 11