Assigning functions to unknown proteins is one of the most important problems in proteomics. Several approaches have used protein-protein interaction data to predict protein functions. We previously developed a Markov random field (MRF) based method to infer a protein's functions using protein-protein interaction data and the functional annotations of its interaction partners. In the original model, only direct interactions were considered and each function was treated separately. In this study, we developed a new model that extends the analysis from direct interaction partners to all neighboring proteins in the network, and from one function to multiple functions. The goal is to understand a protein's function based on information from all of its neighboring proteins in the interaction network. First, we developed a novel kernel logistic regression (KLR) method based on diffusion kernels for protein interaction networks. The diffusion kernels provide a means to incorporate all neighbors of a protein in the network. Second, we used the chi-square test to identify a set of functions that are highly correlated with the function of interest, referred to as the correlated functions. Third, we incorporated the correlated functions into the new KLR model. Fourth, we extended the model to multiple biological data sources, such as protein domains, protein complexes, and gene expression data, by converting each source into a network. We showed that the KLR approach of incorporating all protein neighbors significantly improved the accuracy of protein function predictions over the MRF model. Incorporating multiple data sets also improved prediction accuracy. The prediction accuracy is comparable to that of another protein function classifier based on the support vector machine (SVM) with a diffusion kernel. The advantages of the KLR model include its simplicity and its ability to explore the contribution of neighbors to the functions of the proteins of interest.
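The idea of letting every protein in the network contribute to a prediction can be illustrated with a minimal sketch. It assumes the common diffusion kernel form K = exp(-tau * L), where L is the graph Laplacian, and it assumes a logistic model whose two features are the kernel-weighted sums over all other proteins that have, and that lack, the function of interest. The function names, the value of `tau`, the six-protein toy network, and the plain gradient-descent fit are all illustrative assumptions, not the model specification or data used in the study.

```python
import numpy as np

def diffusion_kernel(adjacency, tau=1.0):
    """Diffusion kernel K = exp(-tau * L), with L = D - A the graph Laplacian."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so its matrix exponential follows from the eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(L)
    return (eigvecs * np.exp(-tau * eigvals)) @ eigvecs.T

def neighbor_features(K, labels):
    """Kernel-weighted sums over all *other* proteins with and without the function."""
    K_off = K - np.diag(np.diag(K))  # drop each protein's similarity to itself
    labels = np.asarray(labels, dtype=float)
    has_function = K_off @ labels
    lacks_function = K_off @ (1.0 - labels)
    return np.column_stack([has_function, lacks_function])

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Logistic regression with an intercept, fitted by plain gradient descent."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    """Predicted probability that each protein has the function."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

# Hypothetical toy network: two triangles (proteins 0-2 and 3-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Hypothetical annotations: proteins 0-2 have the function, proteins 3-5 do not.
y = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

K = diffusion_kernel(A, tau=1.0)
X = neighbor_features(K, y)
w = fit_logistic(X, y)
probs = predict_proba(w, X)
print(np.round(probs, 3))
```

Because the kernel is nonzero for every pair of connected proteins, protein 0 receives a small contribution even from proteins 4 and 5, which it does not interact with directly. A model restricted to direct interaction partners would ignore those contributions.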