A sequence-based computational method for prediction of MoRFs

Cited by: 6
Authors
Wang, Yu [1]
Guo, Yanzhi [1]
Pu, Xuemei [1]
Li, Menglong [1]
Affiliations
[1] Sichuan Univ, Coll Chem, Chengdu 610064, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSICALLY DISORDERED PROTEINS; SECONDARY STRUCTURE; WEB SERVER; BINDING; REGIONS; KNN;
DOI
10.1039/c6ra27161h
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Molecular recognition features (MoRFs) are relatively short segments (10-70 residues) within intrinsically disordered regions (IDRs) that can undergo disorder-to-order transitions upon binding to partner proteins. Because MoRFs play key roles in important biological processes such as signaling and regulation, identifying them is crucial for a full understanding of the functional aspects of IDRs. However, given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical use, which leaves significant room for improvement. In this work, we developed a novel sequence-based predictor for MoRFs using a support vector machine (SVM) algorithm. First, we constructed a comprehensive dataset of annotated MoRFs covering the wide length range of 10 to 70 residues. The flanking regions were then used to define the negative samples. Next, amino acid composition (AAC) and two features not previously explored for this task, namely composition, transition and distribution (CTD) and the K nearest neighbors (KNN) score, were used to characterize the sequence information of MoRFs. Finally, after feature evaluation and optimization, an overall accuracy of 75.75% was achieved in five-fold cross-validation. When evaluated on an independent test set of 110 proteins, the method also yielded a promising accuracy of 64.98%. Additionally, in external validation on the negative samples, our method still shows performance comparable to that of other existing methods. We believe that this study will be useful in elucidating the mechanism of MoRFs and in facilitating hypothesis-driven experimental design and validation.
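To make two of the steps named in the abstract concrete, the following is a minimal sketch of amino acid composition (AAC) and of taking flanking regions as negative samples. It is not the authors' implementation: the function names `aac` and `flanking_negatives` are hypothetical, and the flank size (equal to the MoRF length, immediately adjacent on each side) is an assumption, since the abstract does not say how the flanks were sized.

```python
from collections import Counter
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(segment: str) -> List[float]:
    """Amino acid composition: fraction of each standard residue in a segment."""
    counts = Counter(segment.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def flanking_negatives(sequence: str, start: int, end: int) -> List[str]:
    """Return the non-empty flanks around the MoRF sequence[start:end].

    ASSUMPTION: each flank is as long as the MoRF and directly adjacent to it;
    flanks are truncated at the sequence ends.
    """
    length = end - start
    left = sequence[max(0, start - length):start]
    right = sequence[end:end + length]
    return [flank for flank in (left, right) if flank]


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVK"  # toy sequence, not from the paper's dataset
    morf_start, morf_end = 5, 9   # hypothetical MoRF annotation (0-based, end-exclusive)
    positive = aac(protein[morf_start:morf_end])
    negatives = [aac(f) for f in flanking_negatives(protein, morf_start, morf_end)]
    print(len(positive), len(negatives))
```

Each resulting 20-dimensional vector could then be combined with other features and passed to a classifier such as an SVM; the CTD and KNN-score features mentioned in the abstract are not shown here.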
Pages: 18937-18945
Page count: 9