MAPS: An integrated system for protein sequence annotation using support vector machine

Authors
Wang, Jung-Ying [1, 3]
Liu, Cheng-Kang [1]
Lee, Hahn-Ming [1, 2]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei 106, Taiwan
[2] Acad Sinica, Inst Informat Sci, Taipei 115, Taiwan
[3] Lunghwa Univ Sci & Technol, Dept Multimedia & Game Sci, Tao Yuan 333, Taiwan
Keywords
protein annotation; support vector machine; sequence similarity; gene ontology (GO)
DOI
10.1080/02533839.2008.9671432
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
An integrated environment for biological data is valuable for the functional annotation of protein sequences. In this paper, we present a protein sequence annotation system named MAPS (Multiple Annotation for Protein Sequences), which provides a mechanism for extracting multiple annotations from various types of biological data, including SwissProt keywords, InterPro signatures, and GO terms. Furthermore, MAPS can automatically eliminate the annotation errors generated by a pre-trained SVM classifier. It assigns an annotation to the protein sequence in question by considering not just a single similar protein but all similar proteins that carry the annotation. In other words, we take into account the evolutionary information of the protein of interest to reduce the erroneous annotations inferred from weak sequence similarities and from sequence identities in non-functional segments. The experimental results show that erroneous annotations can be eliminated effectively while high accuracy is maintained for the different types of annotations.
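The idea of weighing all similar proteins that carry an annotation, rather than trusting the single best hit, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which features MAPS uses or how its SVM is configured, so the three features, the names `candidate_features`, `accept`, and `annotate`, and the hard-coded `WEIGHTS` and `BIAS` (a stand-in for a pre-trained linear SVM decision function) are hypothetical, as are the example hits.

```python
from collections import defaultdict

def candidate_features(hits):
    """Summarise the evidence for each candidate annotation term.

    hits: list of (identity, terms) pairs, one per similar protein, where
    identity is a sequence identity in [0, 1] and terms is a set of
    annotation terms attached to that protein.
    Returns {term: (support_fraction, max_identity, mean_identity)}.
    """
    identities_by_term = defaultdict(list)
    for identity, terms in hits:
        for term in terms:
            identities_by_term[term].append(identity)
    total = len(hits)
    return {
        term: (len(ids) / total, max(ids), sum(ids) / len(ids))
        for term, ids in identities_by_term.items()
    }

# Hypothetical weights and bias standing in for a pre-trained linear SVM;
# a real system would learn these from labelled annotations.
WEIGHTS = (3.0, 1.0, 1.0)
BIAS = -3.0

def accept(features):
    """Linear decision function: keep the term if the score is positive."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return score > 0

def annotate(hits):
    """Return the sorted candidate terms that pass the decision function."""
    return sorted(
        term for term, feats in candidate_features(hits).items() if accept(feats)
    )

# Illustrative hits only; identities and term assignments are made up.
hits = [
    (0.90, {"GO:0005524", "GO:0016301"}),
    (0.85, {"GO:0005524"}),
    (0.80, {"GO:0005524"}),
    (0.30, {"GO:0003677"}),
]
print(annotate(hits))
```

With these illustrative values, only the term shared by three of the four hits is kept. The term carried by a single high-identity hit and the term carried by a single weak hit are both rejected, which is the behaviour the abstract attributes to using all similar proteins instead of one.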
Pages: 781-790
Number of pages: 10