Accurate Prediction of Human Essential Proteins Using Ensemble Deep Learning

Cited by: 10
Authors
Li, Yiming [1]
Zeng, Min [1]
Wu, Yifan [1]
Li, Yaohang [2]
Li, Min [1]
Affiliations
[1] Central South University, School of Computer Science and Engineering, Changsha 410083, Hunan, People's Republic of China
[2] Old Dominion University, Department of Computer Science, Norfolk, VA 23529, USA
Funding
National Natural Science Foundation of China
Keywords
Proteins; Feature extraction; Protein sequence; Biological information theory; Deep learning; Amino acids; Predictive models; essential protein prediction; ensemble learning; evolutionary information; PSSM; ESSENTIAL GENES; SUBCELLULAR-LOCALIZATION; IDENTIFICATION; LETHALITY; DATABASE;
DOI
10.1109/TCBB.2021.3122294
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Essential proteins are considered the foundation of life because they are indispensable for the survival of living organisms. Computational methods provide a fast way to identify essential proteins, but most of them rely heavily on several kinds of biological information, especially protein-protein interaction networks, which limits their practical application. With the rapid development of high-throughput sequencing technology, sequencing data have become the most accessible biological data. However, predicting essential proteins from protein sequence information alone has so far yielded limited accuracy. In this paper, we propose EP-EDL, an ensemble deep learning model that uses only protein sequence information to predict human essential proteins. EP-EDL integrates multiple classifiers to alleviate the class imbalance problem and to improve prediction accuracy and robustness. In each base classifier, we employ multi-scale text convolutional neural networks to extract useful features from protein sequence feature matrices that carry evolutionary information. Our computational results show that EP-EDL outperforms state-of-the-art sequence-based methods. Furthermore, EP-EDL gives biologists a more practical and flexible way to predict essential proteins accurately. The source code and datasets can be downloaded from https://github.com/CSUBioGroup/EP-EDL.
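The abstract names three ingredients: an ensemble of base classifiers to counter class imbalance, multi-scale convolutions over a per-residue feature matrix, and a combined prediction. The sketch below is a minimal illustration of those ideas in plain Python, not the EP-EDL implementation. The balancing scheme (chunking the majority class), the averaging rule, and every function name here are assumptions made for illustration; the abstract does not specify them.

```python
import random


def balanced_subsets(minority, majority, seed=0):
    """Pair the whole minority class with disjoint majority-class chunks.

    Assumed scheme: one balanced training set per base classifier.
    Leftover majority samples that do not fill a chunk are dropped.
    """
    if not minority:
        raise ValueError("minority class must not be empty")
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    size = len(minority)
    n_subsets = max(1, len(shuffled) // size)
    return [
        (list(minority), shuffled[i * size:(i + 1) * size])
        for i in range(n_subsets)
    ]


def multi_scale_features(matrix, kernels):
    """Slide each kernel along the sequence axis and keep its maximum response.

    `matrix` is a list of per-residue rows (for example PSSM rows);
    each kernel is a list of rows whose count sets the window width.
    """
    features = []
    for kernel in kernels:
        width = len(kernel)
        responses = []
        for start in range(len(matrix) - width + 1):
            window = matrix[start:start + width]
            responses.append(sum(
                weight * value
                for kernel_row, matrix_row in zip(kernel, window)
                for weight, value in zip(kernel_row, matrix_row)
            ))
        if not responses:
            raise ValueError("sequence is shorter than the kernel width")
        features.append(max(responses))  # global max pooling
    return features


def ensemble_probability(base_probabilities):
    """Average base-classifier outputs (an assumed combination rule)."""
    return sum(base_probabilities) / len(base_probabilities)


if __name__ == "__main__":
    subsets = balanced_subsets(["p1", "p2"], ["n1", "n2", "n3", "n4", "n5", "n6"])
    print(len(subsets), "balanced training sets")
    toy_matrix = [[1, 0], [0, 1], [1, 1]]  # 3 residues x 2 feature columns
    toy_kernels = [[[1, 1]], [[1, 0], [0, 1]]]  # window widths 1 and 2
    print(multi_scale_features(toy_matrix, toy_kernels))
    print(ensemble_probability([0.2, 0.4, 0.9]))
```

In a real model the kernel weights would be learned and the pooled features fed to a classifier; here they are fixed toy values so the windowing and pooling logic is easy to follow.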
Pages: 3263-3271
Number of pages: 9