SNAP: predict effect of non-synonymous polymorphisms on function

被引:701
作者
Bromberg, Yana
Rost, Burkhard
机构
[1] Columbia Univ, Dept Biochem & Mol Biophys, New York, NY 10032 USA
[2] Columbia Univ, Ctr Computat Biol & Bioinformat, New York, NY 10032 USA
[3] NESG Consortium, New York, NY 10032 USA
[4] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
关键词
D O I
10.1093/nar/gkm238
中图分类号
Q5 [生物化学]; Q7 [分子生物学];
学科分类号
071010 ; 081704 ;
摘要
Many genetic variations are single nucleotide polymorphisms (SNPs). Non-synonymous SNPs are 'neutral' if the resulting point-mutated protein is not functionally discernible from the wild type and 'non-neutral' otherwise. The ability to identify non-neutral substitutions could significantly aid targeting disease causing detrimental mutations, as well as SNPs that increase the fitness of particular phenotypes. Here, we introduced comprehensive data sets to assess the performance of methods that predict SNP effects. Along we introduced SNAP (screening for non-acceptable polymorphisms), a neural network-based method for the prediction of the functional effects of non-synonymous SNPs. SNAP needs only sequence information as input, but benefits from functional and structural annotations, if available. In a cross-validation test on over 80 000 mutants, SNAP identified 80% of the non-neutral substitutions at 77% accuracy and 76% of the neutral substitutions at 80% accuracy. This constituted an important improvement over other methods; the improvement rose to over ten percentage points for mutants for which existing methods disagreed. Possibly even more importantly SNAP introduced a well-calibrated measure for the reliability of each prediction. This measure will allow users to focus on the most accurate predictions and/or the most severe effects.
引用
收藏
页码:3823 / 3835
页数:13
相关论文
共 58 条
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]  
[Anonymous], 2005, The proteomics protocols handbook. Totowa (New Jersey)
[3]  
Apweiler R, 2004, NUCLEIC ACIDS RES, V32, pD115, DOI [10.1093/nar/gkh131, 10.1093/nar/gkw1099]
[4]   The universal protein resource (UniProt) [J].
Bairoch, A ;
Apweiler, R ;
Wu, CH ;
Barker, WC ;
Boeckmann, B ;
Ferro, S ;
Gasteiger, E ;
Huang, HZ ;
Lopez, R ;
Magrane, M ;
Martin, MJ ;
Natale, DA ;
O'Donovan, C ;
Redaschi, N ;
Yeh, LSL .
NUCLEIC ACIDS RESEARCH, 2005, 33 :D154-D159
[5]   The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :45-48
[6]   Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information [J].
Bao, L ;
Cui, Y .
BIOINFORMATICS, 2005, 21 (10) :2185-2190
[7]  
Bateman A, 2002, NUCLEIC ACIDS RES, V30, P276, DOI [10.1093/nar/gkr1065, 10.1093/nar/gkp985, 10.1093/nar/gkh121]
[8]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[9]  
CAPRIOTTI E, 2004, BIOINFORMATICS, V1, P1
[10]   The transcriptional landscape of the mammalian genome [J].
Carninci, P ;
Kasukawa, T ;
Katayama, S ;
Gough, J ;
Frith, MC ;
Maeda, N ;
Oyama, R ;
Ravasi, T ;
Lenhard, B ;
Wells, C ;
Kodzius, R ;
Shimokawa, K ;
Bajic, VB ;
Brenner, SE ;
Batalov, S ;
Forrest, ARR ;
Zavolan, M ;
Davis, MJ ;
Wilming, LG ;
Aidinis, V ;
Allen, JE ;
Ambesi-Impiombato, X ;
Apweiler, R ;
Aturaliya, RN ;
Bailey, TL ;
Bansal, M ;
Baxter, L ;
Beisel, KW ;
Bersano, T ;
Bono, H ;
Chalk, AM ;
Chiu, KP ;
Choudhary, V ;
Christoffels, A ;
Clutterbuck, DR ;
Crowe, ML ;
Dalla, E ;
Dalrymple, BP ;
de Bono, B ;
Della Gatta, G ;
di Bernardo, D ;
Down, T ;
Engstrom, P ;
Fagiolini, M ;
Faulkner, G ;
Fletcher, CF ;
Fukushima, T ;
Furuno, M ;
Futaki, S ;
Gariboldi, M .
SCIENCE, 2005, 309 (5740) :1559-1563