Self-Supervised Pre-Training via Multi-View Graph Information Bottleneck for Molecular Property Prediction

Cited by: 1
Authors
Zang, Xuan [1 ]
Zhang, Junjie [1 ]
Tang, Buzhou [2 ,3 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen 518000, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518000, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Task analysis; Drugs; Graph neural networks; Representation learning; Perturbation methods; Message passing; Data mining; Drug analysis; graph neural networks; molecular property prediction; molecular pre-training;
DOI
10.1109/JBHI.2024.3422488
Chinese Library Classification
TP [Automation & Computer Technology];
Discipline Code
0812;
Abstract
Molecular representation learning has remarkably accelerated drug analysis and discovery. It applies machine learning methods to encode molecular embeddings for diverse downstream drug-related tasks. Because labeled molecular data are scarce, self-supervised molecular pre-training is promising: it can exploit large-scale unlabeled molecular data to promote representation learning. Although many universal graph pre-training methods have been successfully introduced into molecular learning, limitations remain. Many graph augmentations, such as atom deletion and bond perturbation, tend to destroy the intrinsic properties and connectivity of molecules. In addition, identifying the subgraphs that matter for specific chemical properties remains challenging. To address these limitations, we propose the self-supervised Molecular Graph Information Bottleneck (MGIB) model for molecular pre-training. MGIB observes molecular graphs from both the atom view and the motif view, deploys a learnable graph compression process to extract core subgraphs, and extends the graph information bottleneck into a self-supervised molecular pre-training framework. Model analysis validates the contribution of the self-supervised graph information bottleneck and illustrates the interpretability of MGIB through the extracted subgraphs. Extensive experiments on molecular property prediction, covering 7 binary classification tasks and 6 regression tasks, demonstrate the effectiveness and superiority of MGIB.
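
To make the information-bottleneck idea concrete: the classic objective minimizes beta * I(Z; G) - I(Z; Y), i.e., the representation Z is compressed away from the input graph G while staying predictive of the target Y; in a self-supervised setting the label Y is replaced by agreement with the other view of the same molecule. The sketch below is a minimal, hypothetical PyTorch rendition of that scheme, not the authors' implementation: all names (SubgraphCompressor, ib_style_loss, beta) and the dense one-layer encoder are illustrative assumptions standing in for whatever message-passing architecture MGIB actually uses.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SubgraphCompressor(nn.Module):
    """Scores each node and softly keeps a core subgraph (the 'compression')."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.encode = nn.Linear(in_dim, hid_dim)  # stand-in for a GNN encoder
        self.score = nn.Linear(hid_dim, 1)        # per-node keep probability

    def forward(self, x, adj):
        # One dense message-passing step: aggregate neighbor features.
        h = F.relu(self.encode(adj @ x))
        p = torch.sigmoid(self.score(h))          # keep-probability per node
        # Weighted mean over kept nodes -> compressed subgraph embedding.
        z = (p * h).sum(dim=0) / p.sum().clamp_min(1e-6)
        return z, p

def ib_style_loss(z_atom, z_motif, p_atom, p_motif, beta=0.1):
    # Alignment term: the two views of the same molecule should agree
    # (a simple stand-in for maximizing mutual information across views).
    align = 1.0 - F.cosine_similarity(z_atom, z_motif, dim=0)
    # Compression term: penalize keeping too many nodes, pushing the model
    # toward a small, property-relevant core subgraph.
    compress = p_atom.mean() + p_motif.mean()
    return align + beta * compress

# Toy usage with random tensors standing in for atom- and motif-level graphs.
n_atoms, n_motifs, d = 12, 4, 16
atom_x, atom_adj = torch.randn(n_atoms, d), torch.rand(n_atoms, n_atoms)
motif_x, motif_adj = torch.randn(n_motifs, d), torch.rand(n_motifs, n_motifs)
model = SubgraphCompressor(d, 32)
z_a, p_a = model(atom_x, atom_adj)
z_m, p_m = model(motif_x, motif_adj)
loss = ib_style_loss(z_a, z_m, p_a, p_m)
loss.backward()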
Pages: 7659 - 7669
Page count: 11
Related Papers
50 records in total
  • [41] Exploring complementary information of self-supervised pretext tasks for unsupervised video pre-training
    Zhou, Wei
    Hou, Yi
    Ouyang, Kewei
    Zhou, Shilin
    IET COMPUTER VISION, 2022, 16 (03) : 255 - 265
  • [42] Object Adaptive Self-Supervised Dense Visual Pre-Training
    Zhang, Yu
    Zhang, Tao
    Zhu, Hongyuan
    Chen, Zihan
    Mi, Siya
    Peng, Xi
    Geng, Xin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 2228 - 2240
  • [43] UniVIP: A Unified Framework for Self-Supervised Visual Pre-training
    Li, Zhaowen
    Zhu, Yousong
    Yang, Fan
    Li, Wei
    Zhao, Chaoyang
    Chen, Yingying
    Chen, Zhiyang
    Xie, Jiahao
    Wu, Liwei
    Zhao, Rui
    Tang, Ming
    Wang, Jinqiao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 14607 - 14616
  • [44] Representation Recovering for Self-Supervised Pre-training on Medical Images
    Yan, Xiangyi
    Naushad, Junayed
    Sun, Shanlin
    Han, Kun
    Tang, Hao
    Kong, Deying
    Ma, Haoyu
    You, Chenyu
    Xie, Xiaohui
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2684 - 2694
  • [45] Reducing Domain mismatch in Self-supervised speech pre-training
    Baskar, Murali Karthick
    Rosenberg, Andrew
    Ramabhadran, Bhuvana
    Zhang, Yu
    INTERSPEECH 2022, 2022, : 3028 - 3032
  • [46] Dense Contrastive Learning for Self-Supervised Visual Pre-Training
    Wang, Xinlong
    Zhang, Rufeng
    Shen, Chunhua
    Kong, Tao
    Li, Lei
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 3023 - 3032
  • [47] Self-supervised VICReg pre-training for Brugada ECG detection
    Ronan, Robert
    Tarabanis, Constantine
    Chinitz, Larry
    Jankelson, Lior
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [48] A Self-Supervised Pre-Training Method for Chinese Spelling Correction
    Su, J.
    Yu, S.
    Hong, X.
    JOURNAL OF SOUTH CHINA UNIVERSITY OF TECHNOLOGY (NATURAL SCIENCE), 2023, 51 (09): 90 - 98
  • [49] Self-Supervised Deep Multi-View Subspace Clustering
    Sun, Xiukun
    Cheng, Miaomiao
    Min, Chen
    Jing, Liping
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 101, 2019, 101 : 1001 - 1016
  • [50] Self-supervised pre-training on industrial time-series
    Biggio, Luca
    Kastanis, Iason
    2021 8TH SWISS CONFERENCE ON DATA SCIENCE, SDS, 2021, : 56 - 57