Accurate Prediction of Human Essential Proteins Using Ensemble Deep Learning

Cited by: 10
Authors
Li, Yiming [1 ]
Zeng, Min [1 ]
Wu, Yifan [1 ]
Li, Yaohang [2 ]
Li, Min [1 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Hunan, Peoples R China
[2] Old Dominion Univ, Dept Comp Sci, Norfolk, VA 23529 USA
Funding
National Natural Science Foundation of China
Keywords
Proteins; Feature extraction; Protein sequence; Biological information theory; Deep learning; Amino acids; Predictive models; essential protein prediction; ensemble learning; evolutionary information; PSSM; ESSENTIAL GENES; SUBCELLULAR-LOCALIZATION; IDENTIFICATION; LETHALITY; DATABASE;
DOI
10.1109/TCBB.2021.3122294
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Essential proteins are considered the foundation of life because they are indispensable for the survival of living organisms. Computational methods offer a fast way to identify essential proteins, but most of them rely heavily on various kinds of biological information, especially protein-protein interaction networks, which limits their practical application. With the rapid development of high-throughput sequencing technology, sequencing data has become the most accessible biological data. Yet predicting essential proteins from protein sequence information alone has so far given limited accuracy. In this paper, we propose EP-EDL, an ensemble deep learning model that uses only protein sequence information to predict human essential proteins. EP-EDL integrates multiple classifiers to alleviate the class imbalance problem and to improve prediction accuracy and robustness. In each base classifier, we employ multi-scale text convolutional neural networks to extract useful features from protein sequence feature matrices with evolutionary information. Our computational results show that EP-EDL outperforms the state-of-the-art sequence-based methods. Furthermore, EP-EDL provides a more practical and flexible way for biologists to accurately predict essential proteins. The source code and datasets can be downloaded from https://github.com/CSUBioGroup/EP-EDL.
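The abstract does not spell out how EP-EDL combines its classifiers, so the following is only a minimal sketch of one common way an ensemble can alleviate class imbalance: the majority class is split into chunks, each chunk is paired with all minority samples, one base classifier is trained per balanced subset, and the predicted probabilities are averaged. Everything here is an assumption made for illustration: the function names are hypothetical, label 1 is assumed to be the minority (essential) class, and a small logistic regression stands in for the multi-scale text CNN base classifier described in the abstract.

```python
import numpy as np


def balanced_subsets(y, rng):
    """Pair every minority sample with one chunk of the shuffled majority class.

    Assumption: label 1 is the minority class and label 0 the majority class.
    """
    pos = np.flatnonzero(y == 1)
    neg = rng.permutation(np.flatnonzero(y == 0))
    n_chunks = max(1, len(neg) // len(pos))
    return [np.concatenate([pos, chunk]) for chunk in np.array_split(neg, n_chunks)]


def fit_logreg(X, y, lr=0.1, steps=500):
    """Stand-in base classifier (NOT the paper's CNN): logistic regression
    trained with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict_proba(model, X):
    """Probability of the positive class under one base classifier."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


def fit_ensemble(X, y, seed=0):
    """Train one base classifier per balanced subset."""
    rng = np.random.default_rng(seed)
    return [fit_logreg(X[idx], y[idx]) for idx in balanced_subsets(y, rng)]


def ensemble_proba(models, X):
    """Average the base classifiers' probabilities (soft voting)."""
    return np.mean([predict_proba(m, X) for m in models], axis=0)
```

In this sketch each base classifier sees a balanced training set, while averaging over all of them still uses every majority-class sample once. Whether EP-EDL partitions its data this way should be checked against the paper and the linked repository.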
Pages: 3263-3271
Page count: 9