Prediction of Lysine Ubiquitylation with Ensemble Classifier and Feature Selection

Cited by: 47
Authors
Zhao, Xiaowei [1, 2]
Li, Xiangtao [1, 2]
Ma, Zhiqiang [1, 2]
Yin, Minghao [2 ]
Affiliations
[1] NE Normal Univ, Coll Life Sci, Changchun 130024, Peoples R China
[2] NE Normal Univ, Coll Comp Sci, Changchun 130117, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
ubiquitylation; ensemble classifier; support vector machine; lysine ubiquitylation sites; UBIQUITIN-LIKE PROTEINS; PROTEOMICS APPROACH; INTRINSIC DISORDER; IDENTIFICATION; RELEVANCE; LOCATION;
DOI
10.3390/ijms12128347
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Ubiquitylation is an important post-translational modification. Correct identification of protein lysine ubiquitylation sites is of fundamental importance for understanding the molecular mechanism of lysine ubiquitylation in biological systems. This paper develops a novel computational method, based on an ensemble approach, to identify lysine ubiquitylation sites effectively. In the proposed method, 468 ubiquitylation sites from 323 proteins retrieved from the Swiss-Prot database were encoded into feature vectors using four kinds of protein sequence information. An effective feature selection method was then applied to extract informative feature subsets. Different feature subsets were obtained by setting different starting points in the search procedure; these subsets were used to train multiple random forest classifiers, which were then aggregated into a consensus classifier by majority voting. Evaluated by jackknife tests and independent tests, the accuracy of the proposed predictor reached 76.82% on the training dataset and 79.16% on the test dataset, respectively, indicating that this predictor is a useful tool for predicting lysine ubiquitylation sites. Furthermore, site-specific feature analysis showed that ubiquitylation is closely correlated with the features of the surrounding sites in addition to features derived from the lysine site itself. The feature selection method is available upon request.
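The aggregation step described in the abstract (several classifiers, each trained on its own feature subset, combined by majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `restrict`, `make_member`, and `majority_vote` are hypothetical, and simple threshold rules stand in for the trained random forest classifiers.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Assumption: a trained model maps a (subset) feature vector to a label,
# 1 = ubiquitylated lysine, 0 = non-ubiquitylated lysine.
Model = Callable[[Sequence[float]], int]


def restrict(features: Sequence[float], subset: Sequence[int]) -> List[float]:
    """Keep only the features whose indices are in the selected subset."""
    return [features[i] for i in subset]


def make_member(model: Model, subset: Sequence[int]) -> Model:
    """Wrap a model trained on a feature subset so it accepts the full vector."""
    def member(features: Sequence[float]) -> int:
        return model(restrict(features, subset))
    return member


def majority_vote(members: Sequence[Model], features: Sequence[float]) -> int:
    """Return the label predicted by most ensemble members.

    With an odd number of members and two classes, ties cannot occur.
    """
    votes = Counter(member(features) for member in members)
    return votes.most_common(1)[0][0]


# Toy stand-ins for random forests trained on three different feature subsets.
members = [
    make_member(lambda x: int(x[0] > 0.5), [0]),
    make_member(lambda x: int(x[0] > 0.5), [1]),
    make_member(lambda x: int(sum(x) > 1.0), [0, 2]),
]

print(majority_vote(members, [0.9, 0.2, 0.4]))
```

In this toy run, two of the three members vote for class 1, so the consensus label is 1. Using an odd number of members is a design choice that avoids ties in the two-class case.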
Pages: 8347 - 8361
Page count: 15
Related papers
50 in total
  • [41] Ensemble Meta Classifier with Sampling and Feature Selection for Data with Multiclass Imbalance Problem
    Sainin, Mohd Shamrie
    Alfred, Rayner
    Ahmad, Faudziah
    JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA, 2021, 20 (02): : 103 - 133
  • [42] Multiobjective optimization for classifier ensemble and feature selection: an application to named entity recognition
    Asif Ekbal
    Sriparna Saha
    International Journal on Document Analysis and Recognition (IJDAR), 2012, 15 : 143 - 166
  • [43] Binary Coded Genetic Algorithm with Ensemble Classifier for feature selection in JPEG Steganalysis
    Sachnev, Vasily
    Kim, Hyoung Joong
    2014 IEEE NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT SENSORS, SENSOR NETWORKS AND INFORMATION PROCESSING (IEEE ISSNIP 2014), 2014,
  • [44] A Risk Prediction Model for Type 2 Diabetes Based on Weighted Feature Selection of Random Forest and XGBoost Ensemble Classifier
    Xu, Zhongxian
    Wang, Zhiliang
    2019 ELEVENTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI 2019), 2019, : 278 - 283
  • [45] Binary Feature Selection Classifier Ensemble for Fault Diagnosis of Submersible Motor Pump
    Boldt, Francisco de Assis
    Rauber, Thomas Walter
    Oliveira-Santos, Thiago
    Rodrigues, Alexandre
    Varejao, Flavio M.
    Ribeiro, Marcos Pellegrini
    2017 IEEE 26TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2017, : 1807 - 1812
  • [46] Classification of Protein Sequences by Means of an Ensemble Classifier with an Improved Feature Selection Strategy
    Sriram, Aditya
    Sanapala, Mounica
    Patel, Ronak
    Patil, Nagamma
    RECENT FINDINGS IN INTELLIGENT COMPUTING TECHNIQUES, VOL 2, 2018, 708 : 167 - 174
  • [47] Combining feature selection, feature learning and ensemble learning for software fault prediction
    Hung Duy Tran
    Le Thi My Hanh
    Nguyen Thanh Binh
    PROCEEDINGS OF 2019 11TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2019), 2019, : 78 - 85
  • [48] Efficient prediction of evaporation using ensemble feature selection techniques
    Sharma, Rakhee
    Singh, Archana
    Mittal, Mamta
    MAUSAM, 2023, 74 (04): : 951 - 962
  • [49] Feature Selection and Software Defect Prediction by Different Ensemble Classifiers
    Shakhovska, Natalya
    Yakovyna, Vitaliy
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2021, PT I, 2021, 12923 : 307 - 313
  • [50] Improved Ensemble Feature Selection Based on DT for KPI Prediction
    Gao, Fulin
    Tan, Shuai
    Shi, Hongbo
    Tao, Yang
    Song, Bing
    IEEE ACCESS, 2021, 9 : 136861 - 136871