Combing ontologies and dipeptide composition for predicting DNA-binding proteins

Authors
Loris Nanni
Alessandra Lumini
Affiliation
[1] Università di Bologna, DEIS, IEIIT-CNR
Source
Amino Acids | 2008 | Vol. 34 (4)
Keywords
DNA-binding proteins; Gene ontology; Dipeptide composition; Chou’s pseudo amino acid composition; Multi-classifier;
DOI
Not available
Abstract
Given a novel protein, it is important to know whether it is a DNA-binding protein, because DNA-binding proteins play a fundamental role in regulating gene expression. In this work, we propose a parallel fusion of a classifier trained on features extracted from the Gene Ontology database and a classifier trained on the dipeptide composition of the protein. The support vector machine (SVM) and the 1-nearest neighbour are used as classifiers. The Matthews correlation coefficient obtained by our fusion method is ≈0.97 under jackknife cross-validation; this outperforms the best result reported in the literature on the same dataset (0.924), where an SVM is trained using only Chou’s pseudo amino acid based features. We also report the area under the ROC curve (AUC): the fusion achieves an AUC of 0.995. In particular, the fusion yields a 5% false negative rate at a 0% false positive rate. The Matthews correlation coefficient obtained using the single best GO number is only 0.7211, so the Gene Ontology database cannot be used as a simple lookup table. Finally, we test the complementarity of the two feature extraction methods using the Q-statistic. The resulting value of 0.58 indicates that the features extracted from the Gene Ontology database and those extracted from the amino acid sequence are partially independent, and that their parallel fusion deserves further study.
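The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the dipeptide composition is assumed to be the usual 400-dimensional vector of adjacent-residue pair frequencies over the 20 standard amino acids, and the Q-statistic is assumed to be Yule's Q computed from the per-sample correctness of two classifiers. The abstract does not specify the fusion rule, so it is not sketched here.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400 dipeptide frequencies of a protein sequence (assumed definition)."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def q_statistic(correct_a, correct_b):
    """Yule's Q for two classifiers, given per-sample correctness flags.

    Values near 1 mean the classifiers tend to err on the same samples;
    values near 0 mean their errors are close to independent.
    """
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1  # both correct
        elif not a and not b:
            n00 += 1  # both wrong
        elif a:
            n10 += 1  # only the first classifier correct
        else:
            n01 += 1  # only the second classifier correct
    denom = n11 * n00 + n01 * n10
    return (n11 * n00 - n01 * n10) / denom if denom else 0.0
```

For example, `dipeptide_composition("ACAC")` assigns frequency 2/3 to the pair `AC` and 1/3 to `CA`. Read against this definition, the reported Q-statistic of 0.58 lies between fully correlated errors (1) and independent errors (0), which is what the abstract describes as partial independence.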
Pages: 635-641
Number of pages: 6