A computational approach for prediction of donor splice sites with improved accuracy

Cited by: 7
Authors
Meher, Prabina Kumar [1]
Sahu, Tanmaya Kumar [1]
Rao, A. R. [1]
Wahi, S. D. [1]
Affiliations
[1] ICAR Indian Agr Stat Res Inst, New Delhi 110012, India
Keywords
Machine learning; PreDOSS; Sequence encoding; Di-nucleotide dependency; Conditional error; FEATURES;
DOI
10.1016/j.jtbi.2016.06.013
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
Identification of splice sites is important because of their key role in predicting the exon-intron structure of protein-coding genes. Although several approaches have been developed for splice site prediction, further improvement in prediction accuracy will help predict gene structure more accurately. This paper presents a computational approach for predicting donor splice sites with higher accuracy. In this approach, true and false splice sites were first encoded into numeric vectors and then used as input to an artificial neural network (ANN), a support vector machine (SVM) and a random forest (RF) for prediction. ANN and SVM performed equally well, and better than RF, when tested on the HS3D and NN269 datasets. Further, the performance of ANN, SVM and RF was analyzed using an independent test set of 50 genes, and the prediction accuracy of ANN was found to be higher than that of SVM and RF. On the independent test set, all three predictors achieved higher accuracy than existing methods such as NNsplice, MEM, MDD, WMM, MM1, FSPLICE, GeneID and ASSP. We have also developed an online prediction server (PreDOSS), available at http://cabgrid.res.in:8080/predoss, for prediction of donor splice sites using the proposed approach. © 2016 Elsevier Ltd. All rights reserved.
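The abstract's first step, encoding candidate splice sites into numeric vectors before classification, can be sketched as follows. This is only an illustration under stated assumptions: it uses plain one-hot encoding, not the di-nucleotide dependency encoding the paper proposes, and the function names (`one_hot_encode`, `candidate_donor_windows`) and the window sizes are hypothetical choices, not taken from the paper.

```python
# Illustrative sketch only: plain one-hot encoding of candidate donor sites.
# This is NOT the encoding proposed in the paper; names and window sizes
# are assumptions made for this example.
NUCLEOTIDES = "ACGT"


def one_hot_encode(window):
    """Map a DNA window to a flat 0/1 vector, four entries per base."""
    vector = []
    for base in window.upper():
        # Ambiguous symbols such as N become an all-zero block.
        vector.extend(1 if base == nt else 0 for nt in NUCLEOTIDES)
    return vector


def candidate_donor_windows(sequence, upstream=3, downstream=6):
    """Yield (position, window) for each GT dinucleotide with full flanks.

    `upstream` counts bases before the GT; `downstream` counts bases
    starting at the G of the GT.
    """
    sequence = sequence.upper()
    for i in range(upstream, len(sequence) - downstream + 1):
        if sequence[i:i + 2] == "GT":
            yield i, sequence[i - upstream:i + downstream]


if __name__ == "__main__":
    for position, window in candidate_donor_windows("CAGGTAAGTCC"):
        print(position, window, one_hot_encode(window))
```

Each resulting vector could then be passed, together with a true/false label, to any standard classifier such as an ANN, SVM or RF implementation.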
Pages: 285-294
Number of pages: 10
Related papers
50 in total
  • [41] Improved recognition of splice sites in A. thaliana by incorporating secondary structure information into sequence-derived features: a computational study
    Meher, Prabina Kumar
    Satpathy, Subhrajit
    3 BIOTECH, 2021, 11 (11)
  • [42] A high-performance approach for predicting donor splice sites based on short window size and imbalanced large samples
    Ying Zeng
    Hongjie Yuan
    Zheming Yuan
    Yuan Chen
    Biology Direct, 14
  • [43] Genome-wide activation of latent donor splice sites in stress and disease
    Nevo, Yuval
    Kamhi, Eyal
    Jacob-Hirsch, Jasmine
    Amariglio, Ninette
    Rechavi, Gideon
    Sperling, Joseph
    Sperling, Ruth
    NUCLEIC ACIDS RESEARCH, 2012, 40 (21) : 10980 - 10994
  • [44] Systematic Computational Identification of Variants That Activate Exonic and Intronic Cryptic Splice Sites
    Lee, Melissa
    Roos, Patrick
    Sharma, Neeraj
    Atalar, Melis
    Evans, Taylor A.
    Pellicore, Matthew J.
    Davis, Emily
    Lam, Anh-Thu N.
    Stanley, Susan E.
    Khalil, Sara E.
    Solomon, George M.
    Walker, Doug
    Raraigh, Karen S.
    Vecchio-Pagan, Briana
    Armanios, Mary
    Cutting, Garry R.
    AMERICAN JOURNAL OF HUMAN GENETICS, 2017, 100 (05) : 751 - 765
  • [45] EDeepSSP: Explainable deep neural networks for exact splice sites prediction
    Amilpur, Santhosh
    Bhukya, Raju
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2020, 18 (04)
  • [46] Boosted Categorical Restricted Boltzmann Machine for Computational Prediction of Splice Junctions
    Lee, Taehoon
    Yoon, Sungroh
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 2483 - 2492
  • [47] Evaluating the Accuracy of the QCEIMS Approach for Computational Prediction of Electron Ionization Mass Spectra of Purines and Pyrimidines
    Lee, Jesi
    Kind, Tobias
    Tantillo, Dean Joseph
    Wang, Lee-Ping
    Fiehn, Oliver
    METABOLITES, 2022, 12 (01)
  • [48] Ds transposon is biased towards providing splice donor sites for exonization in transgenic tobacco
    Huang, Kuo-Chan
    Yang, Hsiu-Chun
    Li, Kuan-Te
    Liu, Li-yu Daisy
    Charng, Yuh-Chyang
    PLANT MOLECULAR BIOLOGY, 2012, 79 (4-5) : 509 - 519
  • [49] Base substitution at different alternative splice donor sites of the tyrosinase gene in murine albinism
    LeFur, N
    Kelsall, SR
    Mintz, B
    GENOMICS, 1996, 37 (02) : 245 - 248
  • [50] Ds transposon is biased towards providing splice donor sites for exonization in transgenic tobacco
    Kuo-Chan Huang
    Hsiu-Chun Yang
    Kuan-Te Li
    Li-yu Daisy Liu
    Yuh-Chyang Charng
    Plant Molecular Biology, 2012, 79 : 509 - 519