Learning Substructure Invariance for Out-of-Distribution Molecular Representations

Authors
Yang, Nianzu [1]
Zeng, Kaipeng [1]
Wu, Qitian [1]
Jia, Xiaosong [1]
Yan, Junchi [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
HIV-1; INTEGRASE; IDENTIFICATION; DRUGS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Molecule representation learning (MRL) has been extensively studied, and current methods have shown promising power for various tasks, e.g., molecular property prediction and target identification. However, existing methods commonly assume, in both model development and experimental evaluation, that training and testing data are i.i.d. This assumption can be violated in real-world applications where testing molecules come from new environments, causing serious performance degradation or unexpected predictions. We propose a new representation learning framework, MoleOOD, to enhance the robustness of MRL models against such distribution shifts. It is motivated by the observation that the (bio)chemical properties of molecules are usually invariantly associated with certain privileged molecular substructures across different environments (e.g., scaffolds, sizes, etc.). Specifically, we introduce an environment inference model to identify the latent factors that impact data generation from different distributions in a fully data-driven manner. We also propose a new learning objective to guide the molecule encoder to leverage environment-invariant substructures that relate more stably to the labels across environments. Extensive experiments on ten real-world datasets demonstrate that our model has a stronger generalization ability than existing methods under various out-of-distribution (OOD) settings, despite the absence of manual specifications of environments. In particular, our method achieves up to 5.9% and 3.9% improvement over the strongest baselines on the OGB and DrugOOD benchmarks in terms of ROC-AUC, respectively. Our source code is publicly available at https://github.com/yangnianzu0515/MoleOOD.
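To make the idea of an environment-invariant objective concrete, the following is a minimal sketch and not the MoleOOD objective, whose exact form the abstract does not give. It uses a generic stand-in: the mean of the per-environment risks plus a penalty on their variance, so that a model whose loss differs strongly between environments is discouraged. The names `bce`, `invariance_objective`, and `beta` are hypothetical, and the environment ids are assumed to be given, whereas the abstract describes inferring them with a separate model.

```python
import math
from collections import defaultdict


def bce(p, y):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def invariance_objective(probs, labels, envs, beta=1.0):
    """Mean per-environment risk plus beta times the variance of those risks.

    Assumption: `envs` holds one (already inferred) environment id per molecule.
    This is an illustrative penalty, not the objective from the paper.
    """
    losses_by_env = defaultdict(list)
    for p, y, e in zip(probs, labels, envs):
        losses_by_env[e].append(bce(p, y))

    # Average loss inside each environment.
    risks = [sum(v) / len(v) for v in losses_by_env.values()]
    mean_risk = sum(risks) / len(risks)
    # Population variance of the risks across environments.
    var_risk = sum((r - mean_risk) ** 2 for r in risks) / len(risks)
    return mean_risk + beta * var_risk, risks


if __name__ == "__main__":
    # Toy predictions for four molecules split over two environments.
    probs = [0.9, 0.8, 0.6, 0.3]
    labels = [1, 1, 1, 0]
    envs = [0, 0, 1, 1]
    objective, risks = invariance_objective(probs, labels, envs, beta=1.0)
    print("per-environment risks:", risks)
    print("objective:", objective)
```

When every environment has the same risk, the variance term vanishes and the objective reduces to the ordinary average loss; the penalty only matters when performance is uneven across environments.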
Pages: 15