Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information

Cited by: 80
Authors
Pollastri, Gianluca [1]
Martin, Alberto J. M. [1]
Mooney, Catherine [1]
Vullo, Alessandro [1]
Affiliations
[1] Univ Coll Dublin, Sch Informat & Comp Sci, Complex & Adapt Syst Lab, Dublin 4, Ireland
DOI
10.1186/1471-2105-8-201
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio.
Results: Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and to a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility prediction quality is roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available.
Conclusion: The predictive systems are publicly available at http://distill.ucd.ie.
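Illustrative sketch (not from the paper): the abstract mentions "simple structural frequency profiles extracted from sets of PDB templates" and baselines that read the structure directly off the templates. The Python below shows one plausible reading of those two ideas for three-class secondary structure. The function names, the H/E/C alphabet, the gap character `-`, and the unweighted counting are all assumptions made for illustration; the abstract does not specify how the authors build or weight their profiles.

```python
from collections import Counter

CLASSES = "HEC"  # assumed three-class alphabet: helix, strand, coil


def frequency_profile(template_ss, length):
    """Per-position class frequencies from template secondary structures.

    template_ss: strings already aligned to the query, one per template,
    using '-' where a template does not cover the query position.
    Returns a list with one dict per position, or None if no template
    covers that position. Counting is unweighted (an assumption).
    """
    profile = []
    for i in range(length):
        counts = Counter(t[i] for t in template_ss if t[i] in CLASSES)
        total = sum(counts.values())
        if total == 0:
            profile.append(None)
        else:
            profile.append({c: counts[c] / total for c in CLASSES})
    return profile


def template_baseline(profile, fallback="C"):
    """Hypothetical baseline: take the most frequent template class per position."""
    return "".join(
        fallback if p is None else max(CLASSES, key=lambda c: p[c])
        for p in profile
    )


def q3(predicted, observed):
    """Fraction of positions where the predicted class matches the observed one."""
    return sum(a == b for a, b in zip(predicted, observed)) / len(observed)


templates = ["HHHEE-", "HHCEEC", "-HHEEC"]  # toy aligned templates
profile = frequency_profile(templates, 6)
baseline = template_baseline(profile)
score = q3(baseline, "HHCEEC")
```

In a system of the kind the abstract describes, a profile like this would be given to the predictor as an extra input alongside sequence information, rather than used directly as the prediction the way the baseline does here.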
Pages: 12