A computational approach for prediction of donor splice sites with improved accuracy

Cited by: 7
Authors
Meher, Prabina Kumar [1]
Sahu, Tanmaya Kumar [1]
Rao, A. R. [1]
Wahi, S. D. [1]
Affiliations
[1] ICAR Indian Agr Stat Res Inst, New Delhi 110012, India
Keywords
Machine learning; PreDOSS; Sequence encoding; Di-nucleotide dependency; Conditional error; FEATURES;
DOI
10.1016/j.jtbi.2016.06.013
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Identification of splice sites is important because of their key role in predicting the exon-intron structure of protein-coding genes. Although several approaches have been developed for the prediction of splice sites, further improvement in prediction accuracy will help predict gene structure more accurately. This paper presents a computational approach for prediction of donor splice sites with higher accuracy. In this approach, true and false splice sites were first encoded into numeric vectors and then used as input to an artificial neural network (ANN), a support vector machine (SVM) and a random forest (RF) for prediction. ANN and SVM performed equally well, and better than RF, when tested on the HS3D and NN269 datasets. Further, the performance of ANN, SVM and RF was analyzed using an independent test set of 50 genes, and the prediction accuracy of ANN was found to be higher than that of SVM and RF. All the predictors achieved higher accuracy than existing methods such as NNsplice, MEM, MDD, WMM, MM1, FSPLICE, GeneID and ASSP on the independent test set. We have also developed an online prediction server (PreDOSS), available at http://cabgrid.res.in:8080/predoss, for prediction of donor splice sites using the proposed approach. (C) 2016 Elsevier Ltd. All rights reserved.
Pages: 285-294
Page count: 10
Related papers
50 in total
  • [21] Performance evaluation of neural network, support vector machine and random forest for prediction of donor splice sites in rice
    Meher, Prabina Kumar
    Sahu, Tanmaya Kumar
    Rao, A. R.
    INDIAN JOURNAL OF GENETICS AND PLANT BREEDING, 2016, 76 (02) : 173 - 180
  • [22] Determination of window size and identification of suitable method for prediction of donor splice sites in rice (Oryza sativa) genome
    Meher, Prabina Kumar
    Sahu, Tanmaya Kumar
    Rao, A. R.
    Wahi, S. D.
    JOURNAL OF PLANT BIOCHEMISTRY AND BIOTECHNOLOGY, 2015, 24 (04) : 385 - 392
  • [23] A novel computational method for the identification of plant alternative splice sites
    Cui, Ying
    Han, Jiuqiang
    Zhong, Dexing
    Liu, Ruiling
    BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2013, 431 (02) : 221 - 224
  • [24] Splice sites prediction of Human genome using AdaBoost
    Pashaei, Elham
    Ozen, Mustafa
    Aydin, Nizamettin
    2016 3RD IEEE EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL AND HEALTH INFORMATICS, 2016 : 300 - 303
  • [25] GeneSplicer: a new computational method for splice site prediction
    Pertea, M
    Lin, XY
    Salzberg, SL
    NUCLEIC ACIDS RESEARCH, 2001, 29 (05) : 1185 - 1190
  • [26] The Prediction for Alternative and Constitutive Splice Sites in Human Genome
    Zhang Li-Rong
    Luo Liao-Fu
    Xing Yong-Qiang
    Jin Hong-Ying
    PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS, 2008, 35 (10) : 1188 - 1194
  • [27] Hypophosphatemic rickets caused by a novel splice donor site mutation and activation of two cryptic splice donor sites in the PHEX gene
    Zou, Minjing
    Bulus, Derya
    Al-Rijjal, Roua A.
    Andiran, Nesibe
    BinEssa, Huda
    Kattan, Walaa E.
    Meyer, Brian
    Shi, Yufei
    JOURNAL OF PEDIATRIC ENDOCRINOLOGY & METABOLISM, 2015, 28 (1-2) : 211 - 216
  • [29] Predicting sulfotyrosine sites using the random forest algorithm with significantly improved prediction accuracy
    Yang, Zheng Rong
    BMC BIOINFORMATICS, 2009, 10 : 361
  • [30] Collaborative Filtering with Improved Item Prediction Approach for Enhancing the Accuracy of Recommendation
    Duan Long-zhen
    Wang Gui-fen
    Ren Yan
    2012 FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION NETWORKING AND SECURITY (MINES 2012), 2012 : 349 - 352