Combing ontologies and dipeptide composition for predicting DNA-binding proteins

Cited by: 29
Authors
Nanni, Loris [1]
Lumini, Alessandra [1]
Affiliation
[1] Univ Bologna, DEIS, CNR, IEIIT, I-40136 Bologna, Italy
Keywords
DNA-binding proteins; gene ontology; dipeptide composition; Chou's pseudo amino acid composition; multi-classifier;
DOI
10.1007/s00726-007-0016-3
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Given a novel protein, it is important to know whether it is a DNA-binding protein, because DNA-binding proteins play a fundamental role in regulating gene expression. In this work, we propose a parallel fusion of a classifier trained on features extracted from the gene ontology (GO) database and a classifier trained on the dipeptide composition of the protein. The support vector machine (SVM) and the 1-nearest neighbour classifier are used. The Matthews correlation coefficient obtained by our fusion method is approximately 0.97 under jackknife cross-validation; this outperforms the best result reported in the literature on the same dataset (0.924), where an SVM is trained using only features based on Chou's pseudo amino acid composition. We also report the area under the ROC curve (AUC), and our results show that the fusion reaches an AUC of 0.995. In particular, our fusion obtains a 5% false negative rate at a 0% false positive rate. The Matthews correlation coefficient obtained using the single best GO number is only 0.7211; hence the gene ontology database cannot be used as a simple lookup table. Finally, we test the complementarity of the two feature extraction methods using the Q-statistic. We obtain a value of 0.58, which means that the features extracted from the gene ontology database and those extracted from the amino acid sequence are partially independent and that their parallel fusion deserves further study.
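The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the dipeptide composition is taken to be the normalized frequency of each of the 400 ordered pairs of the 20 standard amino acids, the Q-statistic is assumed to be Yule's Q computed on the correct/incorrect decisions of two classifiers, and the averaging (sum-rule) fusion is only one possible choice, since the abstract does not state which fusion rule is used.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order (AA, AC, ..., YY).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(sequence, sequence[1:]):
        pair = first + second
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def matthews_cc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def q_statistic(y_true, pred_a, pred_b):
    """Yule's Q between two classifiers (assumed definition).

    Values near 1 mean the classifiers err on the same samples;
    values near 0 mean their errors are close to independent.
    """
    n11 = n00 = n10 = n01 = 0
    for t, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = a == t, b == t
        if a_ok and b_ok:
            n11 += 1
        elif not a_ok and not b_ok:
            n00 += 1
        elif a_ok:
            n10 += 1
        else:
            n01 += 1
    denom = n11 * n00 + n01 * n10
    return (n11 * n00 - n01 * n10) / denom if denom else 0.0


def sum_rule_fusion(scores_a, scores_b):
    """Average two per-sample score lists (an assumed fusion rule)."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]
```

In this sketch, `dipeptide_composition` would supply the input for the sequence-based classifier, while `q_statistic` compares the decisions of the GO-based and sequence-based classifiers on the same samples.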
Pages: 635-641
Number of pages: 7