Self-Supervised Pre-Training via Multi-View Graph Information Bottleneck for Molecular Property Prediction

Cited by: 1
Authors
Zang, Xuan [1 ]
Zhang, Junjie [1 ]
Tang, Buzhou [2 ,3 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen 518000, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518000, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Task analysis; Drugs; Graph neural networks; Representation learning; Perturbation methods; Message passing; Data mining; Drug analysis; graph neural networks; molecular property prediction; molecular pre-training;
DOI
10.1109/JBHI.2024.3422488
Chinese Library Classification
TP [Automation & Computer Technology];
Discipline Code
0812;
Abstract
Molecular representation learning has remarkably accelerated drug analysis and discovery. It applies machine learning methods to encode molecular embeddings for diverse downstream drug-related tasks. Because labeled molecular data are scarce, self-supervised molecular pre-training is promising: it can exploit large-scale unlabeled molecular data to promote representation learning. Although many universal graph pre-training methods have been successfully introduced into molecular learning, limitations remain. Many graph augmentations, such as atom deletion and bond perturbation, tend to destroy the intrinsic properties and connectivity of molecules. In addition, identifying the subgraphs that matter for specific chemical properties remains challenging. To address these limitations, we propose the self-supervised Molecular Graph Information Bottleneck (MGIB) model for molecular pre-training. MGIB observes molecular graphs from both the atom view and the motif view, deploys a learnable graph compression process to extract core subgraphs, and extends the graph information bottleneck into a self-supervised molecular pre-training framework. Model analysis validates the contribution of the self-supervised graph information bottleneck and illustrates the interpretability of MGIB through the extracted subgraphs. Extensive experiments on molecular property prediction, covering 7 binary classification tasks and 6 regression tasks, demonstrate the effectiveness and superiority of MGIB.
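
To make the information-bottleneck idea concrete: the classic objective minimizes beta * I(Z; G) - I(Z; Y), i.e., the representation Z is compressed away from the input graph G while staying predictive of the target Y; in a self-supervised setting the label Y is replaced by agreement with the other view of the same molecule. The sketch below is a minimal, hypothetical PyTorch rendition of that scheme, not the authors' implementation: all names (SubgraphCompressor, ib_style_loss, beta) and the dense one-layer encoder are illustrative assumptions standing in for whatever message-passing architecture MGIB actually uses.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SubgraphCompressor(nn.Module):
    """Scores each node and softly keeps a core subgraph (the 'compression')."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.encode = nn.Linear(in_dim, hid_dim)  # stand-in for a GNN encoder
        self.score = nn.Linear(hid_dim, 1)        # per-node keep probability

    def forward(self, x, adj):
        # One dense message-passing step: aggregate neighbor features.
        h = F.relu(self.encode(adj @ x))
        p = torch.sigmoid(self.score(h))          # keep-probability per node
        # Weighted mean over kept nodes -> compressed subgraph embedding.
        z = (p * h).sum(dim=0) / p.sum().clamp_min(1e-6)
        return z, p

def ib_style_loss(z_atom, z_motif, p_atom, p_motif, beta=0.1):
    # Alignment term: the two views of the same molecule should agree
    # (a simple stand-in for maximizing mutual information across views).
    align = 1.0 - F.cosine_similarity(z_atom, z_motif, dim=0)
    # Compression term: penalize keeping too many nodes, pushing the model
    # toward a small, property-relevant core subgraph.
    compress = p_atom.mean() + p_motif.mean()
    return align + beta * compress

# Toy usage with random tensors standing in for atom- and motif-level graphs.
n_atoms, n_motifs, d = 12, 4, 16
atom_x, atom_adj = torch.randn(n_atoms, d), torch.rand(n_atoms, n_atoms)
motif_x, motif_adj = torch.randn(n_motifs, d), torch.rand(n_motifs, n_motifs)
model = SubgraphCompressor(d, 32)
z_a, p_a = model(atom_x, atom_adj)
z_m, p_m = model(motif_x, motif_adj)
loss = ib_style_loss(z_a, z_m, p_a, p_m)
loss.backward()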
Pages: 7659 - 7669
Page count: 11
Related Papers
50 records in total
  • [41] Exploring complementary information of self-supervised pretext tasks for unsupervised video pre-training
    Zhou, Wei
    Hou, Yi
    Ouyang, Kewei
    Zhou, Shilin
    IET COMPUTER VISION, 2022, 16 (03) : 255 - 265
  • [42] Object Adaptive Self-Supervised Dense Visual Pre-Training
    Zhang, Yu
    Zhang, Tao
    Zhu, Hongyuan
    Chen, Zihan
    Mi, Siya
    Peng, Xi
    Geng, Xin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 2228 - 2240
  • [43] UniVIP: A Unified Framework for Self-Supervised Visual Pre-training
    Li, Zhaowen
    Zhu, Yousong
    Yang, Fan
    Li, Wei
    Zhao, Chaoyang
    Chen, Yingying
    Chen, Zhiyang
    Xie, Jiahao
    Wu, Liwei
    Zhao, Rui
    Tang, Ming
    Wang, Jinqiao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 14607 - 14616
  • [44] Representation Recovering for Self-Supervised Pre-training on Medical Images
    Yan, Xiangyi
    Naushad, Junayed
    Sun, Shanlin
    Han, Kun
    Tang, Hao
    Kong, Deying
    Ma, Haoyu
    You, Chenyu
    Xie, Xiaohui
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2684 - 2694
  • [45] Reducing Domain mismatch in Self-supervised speech pre-training
    Baskar, Murali Karthick
    Rosenberg, Andrew
    Ramabhadran, Bhuvana
    Zhang, Yu
    INTERSPEECH 2022, 2022, : 3028 - 3032
  • [46] Dense Contrastive Learning for Self-Supervised Visual Pre-Training
    Wang, Xinlong
    Zhang, Rufeng
    Shen, Chunhua
    Kong, Tao
    Li, Lei
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 3023 - 3032
  • [47] Self-supervised VICReg pre-training for Brugada ECG detection
    Ronan, Robert
    Tarabanis, Constantine
    Chinitz, Larry
    Jankelson, Lior
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [48] A Self-Supervised Pre-Training Method for Chinese Spelling Correction
    Su, J.
    Yu, S.
    Hong, X.
    JOURNAL OF SOUTH CHINA UNIVERSITY OF TECHNOLOGY (NATURAL SCIENCE), 2023, 51 (09): 90 - 98
  • [49] Self-Supervised Deep Multi-View Subspace Clustering
    Sun, Xiukun
    Cheng, Miaomiao
    Min, Chen
    Jing, Liping
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 101, 2019, 101 : 1001 - 1016
  • [50] Self-supervised pre-training on industrial time-series
    Biggio, Luca
    Kastanis, Iason
    2021 8TH SWISS CONFERENCE ON DATA SCIENCE, SDS, 2021, : 56 - 57