Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines

Cited by: 41
Authors
Tian, Jian [1]
Wu, Ningfeng [1]
Guo, Xuexia [2]
Guo, Jun [1]
Zhang, Juhua [3]
Fan, Yunliu [1]
Affiliations
[1] Chinese Acad Agr Sci, Biotechnol Res Inst, Beijing 100081, Peoples R China
[2] Acad Planning & Designing, Minist Agr, Agr Byprod Proc Res Inst, Beijing 100026, Peoples R China
[3] Beijing Inst Technol, Dept Biomed Engn, Beijing 100081, Peoples R China
Source
BMC BIOINFORMATICS | 2007 / Volume 8
Keywords
MULTIPLE SEQUENCE ALIGNMENT; PROTEIN STABILITY CHANGES; GENE MUTATION DATABASE; ACID INDEX DATABASE; EVOLUTIONARY INFORMATION; MISSENSE MUTATIONS; EXPRESSION DATA; CLASSIFICATION; SNPS; IDENTIFICATION;
DOI
10.1186/1471-2105-8-450
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Human genetic variations primarily result from single nucleotide polymorphisms (SNPs), which occur approximately every 1000 bases in the overall human population. The non-synonymous SNPs (nsSNPs) that lead to amino acid changes in the protein product may account for nearly half of the known genetic variations linked to inherited human diseases. One of the key problems of medical genetics today is to identify the nsSNPs that underlie disease-related phenotypes in humans. The development of computational tools that can identify such nsSNPs would therefore enhance our understanding of genetic diseases and help predict disease.
Results: We propose a method, named Parepro (Predicting the amino acid replacement probability), to identify nsSNPs having either deleterious or neutral effects on the resulting protein function. Two independent datasets, HumVar and NewHumVar, taken from the PhD-SNP server, were used to train the model and to test the robustness of Parepro. In a 20-fold cross-validation test on the HumVar dataset, Parepro achieved a Matthews correlation coefficient (MCC) of 50% and an overall accuracy (Q2) of 76%, both higher than those obtained with methods such as PolyPhen, SIFT, and HybridMeth. Further analysis of an additional dataset (NewHumVar) using Parepro yielded similar results.
Conclusion: The performance of Parepro indicates that it is a powerful tool for predicting the effect of nsSNPs on protein function and would be useful for large-scale analysis of genomic nsSNP data.
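The two evaluation metrics named in the abstract, overall accuracy (Q2) and the Matthews correlation coefficient (MCC), are standard quantities for binary classifiers and can be computed from a confusion matrix. The following is a minimal sketch, not code from the paper: the function name `q2_and_mcc` and the example counts are hypothetical and were chosen only to illustrate the arithmetic, with "deleterious" treated as the positive class.

```python
import math


def q2_and_mcc(tp, tn, fp, fn):
    """Return (Q2, MCC) for a binary deleterious/neutral confusion matrix.

    tp, tn, fp, fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    # Q2: fraction of all predictions that are correct.
    q2 = (tp + tn) / total
    # MCC: correlation between predicted and observed classes, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q2, mcc


# Hypothetical counts for illustration only (not data from the paper).
q2, mcc = q2_and_mcc(tp=40, tn=36, fp=14, fn=10)
print(f"Q2 = {q2:.2f}, MCC = {mcc:.2f}")
```

MCC is often reported alongside Q2 because it stays informative when the deleterious and neutral classes are unbalanced, whereas accuracy alone can look high for a classifier that favors the majority class.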
Pages: 9