Learning Invariant Molecular Representation in Latent Discrete Space

Cited: 0
Authors
Zhuang, Xiang [1 ,2 ,3 ]
Zhang, Qiang [1 ,2 ,3 ]
Ding, Keyan [2 ]
Bian, Yatao [4 ]
Wang, Xiao [5 ]
Lv, Jingsong [6 ]
Chen, Hongyang [6 ]
Chen, Huajun [1 ,2 ,3 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] ZJU Hangzhou Global Sci & Technol Innovat Ctr, Hangzhou, Peoples R China
[3] Zhejiang Univ Ant Grp Joint Lab Knowledge Graph, Hangzhou, Peoples R China
[4] Tencent AI Lab, Shenzhen, Peoples R China
[5] Beihang Univ, Sch Software, Beijing, Peoples R China
[6] Zhejiang Lab, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DESIGN;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when training and test data originate from different environments. To address this issue, we propose a new framework for learning molecular representations that are invariant and robust under distribution shifts. Specifically, we propose a "first-encoding-then-separation" strategy that identifies invariant molecular features in the latent space, which deviates from conventional practice. Prior to the separation step, we introduce a residual vector quantization module that mitigates overfitting to the training data distribution while preserving the expressivity of encoders. Furthermore, we design a task-agnostic self-supervised learning objective to encourage precise invariance identification, which makes our method widely applicable to a variety of tasks, such as regression and multi-label classification. Extensive experiments on 18 real-world molecular datasets demonstrate that our model achieves stronger generalization than state-of-the-art baselines under various distribution shifts. Our code is available at https://github.com/HICAI-ZJU/iMoLD.
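The residual vector quantization step described in the abstract can be sketched generically as follows. This is a minimal illustration of the standard RVQ idea (quantize a vector in stages, each stage encoding the residual left by the previous one), not the authors' implementation; the codebook sizes, nearest-neighbour rule, and function names here are assumptions for illustration only.

```python
def nearest(codebook, residual):
    # Index of the codeword closest (squared Euclidean distance) to the residual.
    return min(range(len(codebook)),
               key=lambda k: sum((c - r) ** 2 for c, r in zip(codebook[k], residual)))

def residual_vq(z, codebooks):
    """Quantize vector z with a cascade of codebooks: each stage quantizes
    the residual error left by the stages before it."""
    residual = list(z)
    quantized = [0.0] * len(z)
    indices = []
    for codebook in codebooks:
        k = nearest(codebook, residual)
        indices.append(k)
        quantized = [q + c for q, c in zip(quantized, codebook[k])]
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return quantized, indices

# Toy example: a 2-D latent vector quantized with two small codebooks.
z = [1.3, -0.4]
codebooks = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],      # coarse stage
    [[0.25, 0.0], [0.0, -0.25], [0.25, -0.25], [-0.25, 0.25]] # fine stage
]
quantized, indices = residual_vq(z, codebooks)
# quantized approximates z more closely than a single-stage quantizer would.
```

Each extra stage refines the approximation, which is why RVQ can keep a discrete bottleneck (helpful against overfitting to the training distribution) without sacrificing much of the encoder's expressivity.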
Pages: 18