Diffusion kernel-based logistic regression models for protein function prediction

Cited by: 70
Authors
Lee, Hyunju
Tu, Zhidong
Deng, Minghua
Sun, Fengzhu
Chen, Ting
Affiliations
[1] Univ So Calif, Mol & Computat Biol Program, Los Angeles, CA 90089 USA
[2] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[3] Peking Univ, Sch Math Sci, LMAM, Beijing 100871, Peoples R China
[4] Peking Univ, Ctr Theoret Biol, Beijing 100871, Peoples R China
DOI
10.1089/omi.2006.10.40
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Assigning functions to unknown proteins is one of the most important problems in proteomics. Several approaches have used protein-protein interaction data to predict protein functions. We previously developed a Markov random field (MRF) based method to infer a protein's functions using protein-protein interaction data and the functional annotations of its protein interaction partners. In the original model, only direct interactions were considered and each function was considered separately. In this study, we develop a new model which extends direct interactions to all neighboring proteins, and one function to multiple functions. The goal is to understand a protein's function based on information on all the neighboring proteins in the interaction network. We first developed a novel kernel logistic regression (KLR) method based on diffusion kernels for protein interaction networks. The diffusion kernels provide means to incorporate all neighbors of proteins in the network. Second, we identified a set of functions that are highly correlated with the function of interest, referred to as the correlated functions, using the chi-square test. Third, the correlated functions were incorporated into our new KLR model. Fourth, we extended our model by incorporating multiple biological data sources such as protein domains, protein complexes, and gene expressions by converting them into networks. We showed that the KLR approach of incorporating all protein neighbors significantly improved the accuracy of protein function predictions over the MRF model. The incorporation of multiple data sets also improved prediction accuracy. The prediction accuracy is comparable to another protein function classifier based on the support vector machine (SVM), using a diffusion kernel. The advantages of the KLR model include its simplicity as well as its ability to explore the contribution of neighbors to the functions of proteins of interest.
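The modeling idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the diffusion parameter `tau`, the function names, and the coefficients `gamma`, `delta`, and `eta` are all hypothetical, and the logistic form built from two kernel-weighted sums is an assumption made for illustration. The kernel is taken to be exp(-tau * L), with L the graph Laplacian, and is approximated by a truncated Taylor series so the example has no dependencies; a library routine such as `scipy.linalg.expm` would normally be used instead.

```python
import math


def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product, adequate for a toy network."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(adj, tau=1.0, terms=60):
    """Approximate K = exp(-tau * L) by a truncated Taylor series."""
    n = len(adj)
    lap = laplacian(adj)
    m = [[-tau * lap[i][j] for j in range(n)] for i in range(n)]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]          # holds m^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, m)]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel


def kernel_features(kernel, labels, i):
    """Kernel-weighted sums over annotated proteins other than i.

    labels[j] is 1 (has the function), 0 (lacks it), or None (unknown).
    Returns (k0, k1): similarity mass on negatives and on positives.
    """
    k0 = sum(kernel[i][j] for j, y in enumerate(labels) if j != i and y == 0)
    k1 = sum(kernel[i][j] for j, y in enumerate(labels) if j != i and y == 1)
    return k0, k1


def predict(kernel, labels, i, gamma=0.0, delta=-1.0, eta=1.0):
    """Logistic score from the two kernel features.

    The coefficients are illustrative placeholders; in practice they
    would be fitted to annotated proteins.
    """
    k0, k1 = kernel_features(kernel, labels, i)
    z = gamma + delta * k0 + eta * k1
    return 1.0 / (1.0 + math.exp(-z))


# Toy interaction network: a path 0 - 1 - 2 - 3 - 4 (hypothetical data).
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
labels = [1, None, None, None, 0]   # protein 0 annotated, protein 4 not

K = diffusion_kernel(adj, tau=1.0)
for i in (1, 2, 3):
    print(i, round(predict(K, labels, i), 3))
```

Because the kernel assigns nonzero similarity to every pair of connected proteins, protein 1 receives evidence from both annotated proteins even though only protein 0 is a direct interaction partner, which is the sense in which all neighbors contribute.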
Pages: 40-55
Page count: 16