Learning Invariant Molecular Representation in Latent Discrete Space

Cited by: 0
Authors
Zhuang, Xiang [1, 2, 3]
Zhang, Qiang [1, 2, 3]
Ding, Keyan [2]
Bian, Yatao [4]
Wang, Xiao [5]
Lv, Jingsong [6]
Chen, Hongyang [6]
Chen, Huajun [1, 2, 3]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] ZJU Hangzhou Global Sci & Technol Innovat Ctr, Hangzhou, Peoples R China
[3] Zhejiang Univ Ant Grp Joint Lab Knowledge Graph, Hangzhou, Peoples R China
[4] Tencent AI Lab, Shenzhen, Peoples R China
[5] Beihang Univ, Sch Software, Beijing, Peoples R China
[6] Zhejiang Lab, Hangzhou, Peoples R China
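Funding
National Natural Science Foundation of China
Keywords
DESIGN
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract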
Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when the training and testing data originate from different environments. To address this issue, we propose a new framework for learning molecular representations that are invariant and robust to distribution shifts. Specifically, we propose a strategy called "first-encoding-then-separation", which departs from conventional practice, to identify invariant molecular features in the latent space. Prior to the separation step, we introduce a residual vector quantization module that mitigates overfitting to the training data distribution while preserving the expressivity of the encoder. Furthermore, we design a task-agnostic self-supervised learning objective to encourage precise invariance identification, which makes our method widely applicable to a variety of tasks, such as regression and multi-label classification. Extensive experiments on 18 real-world molecular datasets demonstrate that our model generalizes better than state-of-the-art baselines in the presence of various distribution shifts. Our code is available at https://github.com/HICAI-ZJU/iMoLD.
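The abstract does not spell out how the residual vector quantization module works, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each continuous latent vector is snapped to its nearest entry in a codebook, and that the quantized vector is then added back to the original latent so the encoder's expressivity is retained. The function name `residual_vector_quantize`, the array shapes, and the use of NumPy are illustrative assumptions.

```python
import numpy as np


def residual_vector_quantize(z, codebook):
    """Hypothetical sketch: nearest-code lookup plus a residual connection.

    z:        (n, d) array of continuous latent vectors from an encoder
    codebook: (k, d) array of discrete code vectors
    Returns (z + q, indices), where q[i] is the code nearest to z[i].
    """
    # Squared Euclidean distance between every latent and every code: (n, k)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # Index of the nearest code for each latent vector
    indices = dists.argmin(axis=1)
    # Discrete (quantized) representation
    q = codebook[indices]
    # Assumed residual connection: keep the continuous signal alongside the discrete one
    return z + q, indices


# Toy usage with made-up values
z = np.array([[0.1, 0.0], [0.9, 1.0]])
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
out, idx = residual_vector_quantize(z, codebook)
```

In this sketch the discrete codes act as a bottleneck that limits how closely the representation can follow the training distribution, while the added continuous term keeps fine-grained information. Training the codebook (for example with a commitment loss and a straight-through gradient) is left out.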
Pages: 18