A sequence-based computational method for prediction of MoRFs

Cited by: 6
Authors
Wang, Yu [1]
Guo, Yanzhi [1]
Pu, Xuemei [1]
Li, Menglong [1]
Affiliation
[1] Sichuan Univ, Coll Chem, Chengdu 610064, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSICALLY DISORDERED PROTEINS; SECONDARY STRUCTURE; WEB SERVER; BINDING; REGIONS; KNN;
DOI
10.1039/c6ra27161h
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Molecular recognition features (MoRFs) are relatively short segments (10-70 residues) within intrinsically disordered regions (IDRs) that can undergo disorder-to-order transitions upon binding to partner proteins. Because MoRFs play key roles in important biological processes such as signaling and regulation, identifying them is crucial for a full understanding of the functional aspects of IDRs. However, given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical use, which leaves significant room for improvement. In this work, we developed a novel sequence-based predictor for MoRFs using a support vector machine (SVM) algorithm. First, we constructed a comprehensive dataset of annotated MoRFs spanning a wide range of lengths, from 10 to 70 residues. The flanking regions of the MoRFs were used to define the negative samples. Then, amino acid composition (AAC) and two features not previously explored for this task, namely composition, transition and distribution (CTD) and a k-nearest neighbors (KNN) score, were used to characterize the sequence information of MoRFs. Finally, after feature evaluation and optimization, an overall accuracy of 75.75% was achieved in five-fold cross-validation. When applied to an independent test set of 110 proteins, the method yielded a promising accuracy of 64.98%. Additionally, in an external validation on negative samples, our method shows performance comparable to that of other existing methods. We believe that this study will be useful in elucidating the mechanism of MoRFs and in facilitating hypothesis-driven experimental design and validation.
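The three feature types named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify which physicochemical properties were used for CTD or which distance underlies the KNN score, so the three-class hydrophobicity grouping, the quantile rule in the distribution step, the Euclidean distance, and all function names below are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: a three-class hydrophobicity grouping often used for CTD
# descriptors; the abstract does not say which properties the paper used.
HYDROPHOBICITY_GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}


def aac(seq):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def ctd(seq, groups=HYDROPHOBICITY_GROUPS):
    """Simplified composition, transition and distribution descriptors."""
    lookup = {res: label for label, members in groups.items() for res in members}
    # Residues outside the grouping (e.g. 'X') are skipped.
    enc = "".join(lookup[r] for r in seq if r in lookup)
    if not enc:
        raise ValueError("no encodable residues")
    n = len(enc)
    labels = sorted(groups)

    # Composition: fraction of residues in each class.
    comp = [enc.count(label) / n for label in labels]

    # Transition: fraction of adjacent pairs that switch between two classes.
    pairs = list(zip(enc, enc[1:]))
    trans = []
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            switches = sum(1 for x, y in pairs if {x, y} == {a, b})
            trans.append(switches / len(pairs) if pairs else 0.0)

    # Distribution: relative position of the first, 25%, 50%, 75% and last
    # occurrence of each class (zeros if the class is absent).
    dist = []
    for label in labels:
        pos = [i + 1 for i, c in enumerate(enc) if c == label]
        if not pos:
            dist.extend([0.0] * 5)
            continue
        for q in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(0, math.ceil(q * len(pos)) - 1)
            dist.append(pos[idx] / n)
    return comp + trans + dist


def knn_score(query, train_vectors, train_labels, k=5):
    """Fraction of positives (label 1) among the k nearest training vectors.

    ASSUMPTION: Euclidean distance on feature vectors; the paper may use a
    sequence-similarity-based distance instead.
    """
    if not train_vectors:
        raise ValueError("empty training set")
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(query, train_vectors[i]))
    nearest = order[:k]
    return sum(train_labels[i] for i in nearest) / len(nearest)
```

Under these assumptions, a candidate segment would be represented by concatenating `aac(segment)`, `ctd(segment)` and a `knn_score` computed against labeled training segments, and that vector would then be passed to an SVM classifier evaluated with five-fold cross-validation.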
Pages: 18937-18945
Page count: 9