PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction

Cited by: 0
Authors
Wu, Lirong [1, 2]
Huang, Yufei [1, 2]
Tan, Cheng [1, 2]
Gao, Zhangyang [1, 2]
Hu, Bozhen [1, 2]
Lin, Haitao [1, 2]
Liu, Zicheng [1, 2]
Li, Stan Z. [1]
Affiliations
[1] Westlake Univ, Res Ctr Ind Future, AI Lab, Hangzhou 310030, Peoples R China
[2] Zhejiang Univ, Hangzhou 310058, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
REPRESENTATION; DOCKING;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compound-Protein Interaction (CPI) prediction aims to predict the pattern and strength of compound-protein interactions for rational drug discovery. Existing deep learning-based methods use only a single modality, either protein sequences or protein structures, and do not model the joint distribution of the two modalities, which may lead to significant performance drops in complex real-world scenarios due to factors such as missing modalities and domain shift. More importantly, these methods model protein sequences and structures only at a single fixed scale, neglecting finer-grained multi-scale information, such as that embedded in key protein fragments. In this paper, we propose a novel multi-scale Protein Sequence-structure Contrasting framework for CPI prediction (PSC-CPI), which captures the dependencies between protein sequences and structures through both intra-modality and cross-modality contrasting. We further apply length-variable protein augmentation so that contrasting can be performed at different scales, from the amino acid level to the sequence level. Finally, to evaluate model generalizability more fairly, we split the test data into four settings based on whether compounds and proteins have been observed during the training stage. Extensive experiments show that PSC-CPI generalizes well in all four settings, particularly in the more challenging "Unseen-Both" setting, where neither compounds nor proteins have been observed during training. Furthermore, even when a modality is missing, i.e., when inference uses only single-modality data, PSC-CPI still exhibits comparable or even better performance than previous approaches.
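The four-way test split mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes each sample is a (compound ID, protein ID) pair, and only the name "Unseen-Both" comes from the abstract. The other three setting names, and the function names `assign_setting` and `split_test_pairs`, are hypothetical labels chosen for illustration.

```python
# Illustrative sketch only; not the PSC-CPI implementation.
# Assumption: each sample is a (compound_id, protein_id) pair.
# Only "Unseen-Both" is named in the abstract; the other three
# setting names below are hypothetical.
SETTINGS = ("Seen-Both", "Unseen-Compound", "Unseen-Protein", "Unseen-Both")


def assign_setting(compound, protein, train_compounds, train_proteins):
    """Label one test pair by whether each side appeared in training."""
    seen_compound = compound in train_compounds
    seen_protein = protein in train_proteins
    if seen_compound and seen_protein:
        return "Seen-Both"
    if seen_protein:
        # The protein was seen in training, the compound was not.
        return "Unseen-Compound"
    if seen_compound:
        # The compound was seen in training, the protein was not.
        return "Unseen-Protein"
    return "Unseen-Both"


def split_test_pairs(train_pairs, test_pairs):
    """Group test pairs into the four settings."""
    train_compounds = {c for c, _ in train_pairs}
    train_proteins = {p for _, p in train_pairs}
    groups = {name: [] for name in SETTINGS}
    for compound, protein in test_pairs:
        name = assign_setting(compound, protein, train_compounds, train_proteins)
        groups[name].append((compound, protein))
    return groups


if __name__ == "__main__":
    train = [("c1", "p1"), ("c2", "p2")]
    test = [("c1", "p2"), ("c3", "p1"), ("c1", "p3"), ("c3", "p3")]
    for name, pairs in split_test_pairs(train, test).items():
        print(name, pairs)
```

Under this reading, a test pair counts as "Seen-Both" when its compound and its protein each occur somewhere in the training set, even if that exact pair does not. Whether the paper applies the same convention is not stated in the abstract.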
Pages: 310-319
Page count: 10
Related papers
26 in total
  • [1] CPInformer for Efficient and Robust Compound-Protein Interaction Prediction
    Hua, Yang
    Song, Xiaoning
    Feng, Zhenhua
    Wu, Xiao-Jun
    Kittler, Josef
    Yu, Dong-Jun
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (01) : 285 - 296
  • [2] MCN-CPI: Multiscale Convolutional Network for Compound-Protein Interaction Prediction
    Wang, Shuang
    Jiang, Mingjian
    Zhang, Shugang
    Wang, Xiaofeng
    Yuan, Qing
    Wei, Zhiqiang
    Li, Zhen
    BIOMOLECULES, 2021, 11 (08)
  • [3] MDL-CPI: Multi-view deep learning model for compound-protein interaction prediction
    Wei, Lesong
    Long, Wentao
    Wei, Leyi
    METHODS, 2022, 204 : 418 - 427
  • [4] Perceiver CPI: a nested cross-attention network for compound-protein interaction prediction
    Nguyen, Ngoc-Quang
    Jang, Gwanghoon
    Kim, Hajung
    Kang, Jaewoo
    BIOINFORMATICS, 2023, 39 (01)
  • [5] StackCPA: A stacking model for compound-protein binding affinity prediction based on pocket multi-scale features
    Lei, Chuqi
    Lu, Zhangli
    Wang, Meng
    Li, Min
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 164
  • [6] FOTF-CPI: A compound-protein interaction prediction transformer based on the fusion of optimal transport fragments
    Yin, Zeyu
    Chen, Yu
    Hao, Yajie
    Pandiyan, Sanjeevi
    Shao, Jinsong
    Wang, Li
    ISCIENCE, 2024, 27 (01)
  • [7] Yuel: Improving the Generalizability of Structure-Free Compound-Protein Interaction Prediction
    Wang, Jian
    Dokholyan, Nikolay, V
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2022, 62 (03) : 463 - 471
  • [8] FedKD-CPI: Combining the federated knowledge distillation technique to accomplish synergistic compound-protein interaction prediction
    Wang, Xuetao
    Zhao, Qichang
    Wang, Jianxin
    METHODS, 2025, 234 : 275 - 283
  • [9] MMCL-CPI: A multi-modal compound-protein interaction prediction model incorporating contrastive learning pre-training
    Qian, Ying
    Li, Xinyi
    Wu, Jian
    Zhang, Qian
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2024, 112
  • [10] Bridging chemical structure and conceptual knowledge enables accurate prediction of compound-protein interaction
    Tao, Wen
    Lin, Xuan
    Liu, Yuansheng
    Zeng, Li
    Ma, Tengfei
    Cheng, Ning
    Jiang, Jing
    Zeng, Xiangxiang
    Yuan, Sisi
    BMC BIOLOGY, 2024, 22 (01)