Improving compound-protein interaction prediction by building up highly credible negative samples

Cited by: 214

Authors:

Liu, Hui [1,2]

Sun, Jianjiang [3,4]

Guan, Jihong [5]

Zheng, Jie [2]

Zhou, Shuigeng [3,4]

Affiliations:

[1] Changzhou Univ, Lab Informat Management, Changzhou 213164, Jiangsu, Peoples R China

[2] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore

[3] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China

[4] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China

[5] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China

Source:

BIOINFORMATICS | 2015 / Vol. 31 / Issue 12

Funding:

National Natural Science Foundation of China;

Keywords:

DRUG-TARGET INTERACTIONS; INTERACTION NETWORKS; IDENTIFICATION; INTEGRATION; RESOURCE; KERNELS; MODE;

DOI:

10.1093/bioinformatics/btv256

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Motivation: Computational prediction of compound-protein interactions (CPIs) is of great importance for drug design and development, as genome-scale experimental validation of CPIs is not only time-consuming but also prohibitively expensive. Although an increasing number of validated interactions is available, the performance of computational prediction approaches is severely impeded by the lack of reliable negative CPI samples. A systematic method of screening reliable negative samples is therefore critical to improving the performance of in silico prediction methods. Results: This article aims at building up a set of highly credible negative samples of CPIs via an in silico screening method. Most existing computational models assume that similar compounds are likely to interact with similar target proteins, and they achieve remarkable performance. It is therefore rational to identify potential negative samples based on the converse proposition: proteins dissimilar to every known or predicted target of a compound are unlikely to be targeted by that compound, and vice versa. We integrated various resources, including chemical structures, chemical expression profiles and side effects of compounds, as well as amino acid sequences, the protein-protein interaction network and functional annotations of proteins, into a systematic screening framework. We first tested the screened negative samples on six classical classifiers, and all of these classifiers achieved remarkably higher performance on our negative samples than on randomly generated negative samples for both human and Caenorhabditis elegans. We then verified the negative samples on three existing prediction models, namely the bipartite local model, Gaussian kernel profile and Bayesian matrix factorization, and found that the performance of these models is also significantly improved on the screened negative samples. Moreover, we validated the screened negative samples on a drug bioactivity dataset. Finally, we derived two sets of new interactions by training a support vector machine classifier on the positive interactions annotated in DrugBank and our screened negative interactions. The screened negative samples and the predicted interactions provide the research community with a useful resource for identifying new drug targets and a helpful supplement to the current curated compound-protein databases.
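The screening idea stated in the abstract (a protein dissimilar to every known target of a compound is a candidate negative) can be illustrated with a minimal Python sketch. This is not the authors' framework: it assumes a single precomputed protein-protein similarity matrix rather than the integrated resources the abstract lists, it covers only the protein-side direction of the rule, and the function name `screen_negatives`, the `threshold` value and the toy data are all hypothetical.

```python
import numpy as np

def screen_negatives(protein_sim, known_pairs, n_compounds, threshold=0.2):
    """Return (compound, protein) index pairs whose protein is dissimilar
    to every known target of the compound (hypothetical helper).

    protein_sim : square array of protein-protein similarities (assumed given)
    known_pairs : iterable of (compound, protein) positive interactions
    threshold   : assumed cutoff; max similarity below it counts as dissimilar
    """
    n_proteins = protein_sim.shape[0]
    targets = {c: set() for c in range(n_compounds)}
    for c, p in known_pairs:
        targets[c].add(p)

    negatives = []
    for c in range(n_compounds):
        if not targets[c]:
            continue  # no known targets, so there is nothing to compare against
        idx = sorted(targets[c])
        # Highest similarity of each protein to any known target of compound c
        max_sim = protein_sim[:, idx].max(axis=1)
        for p in range(n_proteins):
            if p not in targets[c] and max_sim[p] < threshold:
                negatives.append((c, p))
    return negatives

# Toy data (made up): proteins 0 and 1 are similar, protein 2 is unlike both.
sim = np.array([
    [1.00, 0.90, 0.10],
    [0.90, 1.00, 0.05],
    [0.10, 0.05, 1.00],
])
negatives = screen_negatives(sim, known_pairs=[(0, 0)], n_compounds=2)
print(negatives)
```

In this toy case compound 0 has protein 0 as its only known target, so protein 1 (similarity 0.90) is not screened out as a negative while protein 2 (similarity 0.10) is. Compound 1 has no known targets and is skipped. The symmetric rule over compound similarities would follow the same pattern.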

Pages: 221 / 229

Page count: 9

