MulStack: An ensemble learning prediction model of multilabel mRNA subcellular localization

Cited by: 2
Authors
Liu Z. [1]
Bai T. [2, 3, 4]
Liu B. [3, 4]
Yu L. [1]
Affiliations
[1] School of Computer Science and Technology, Xidian University, Xi'an
[2] School of Mathematics & Computer Science, Yan'an University, Shaanxi
[3] School of Computer Science and Technology, Beijing Institute of Technology, Beijing
[4] Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Ensemble learning predictor; mRNA features at two levels; Multilabel mRNA subcellular localization; Position encoding
DOI
10.1016/j.compbiomed.2024.108289
Abstract
Subcellular localization of mRNA is related to protein synthesis, cell polarity, cell movement, and other biological regulatory mechanisms. The distribution of mRNAs across subcellular compartments is similar to that of proteins, and most mRNAs are distributed in multiple compartments. Recently, several computational methods have been designed to predict the subcellular localization of mRNA. However, these methods employed only a single level of mRNA features and did not use position encoding of the nucleotides in mRNA. This paper proposes MulStack, an ensemble learning prediction model for multilabel mRNA subcellular localization based on random forest and deep learning. The method employs two levels of mRNA features, sequence-level and residue-level, and applies position encoding for the first time in the field of mRNA subcellular localization. Random forest learns the mRNA sequence-level features, while the deep learning model learns the sequence-level features together with the residue-level features combined with position encoding. The outputs of the random forest and the deep learning model are combined by a weighted sum to give the prediction probability. Compared with existing methods, MulStack performs best in localization to the nucleus, cytosol, and exosome. In addition, position weight matrices (PWMs) extracted by convolutional neural networks (CNNs) can be matched with known RNA-binding protein motifs. Gene ontology (GO) enrichment analysis shows the biological processes, molecular functions, and cellular components of the mRNA genes. The MulStack prediction web server is freely accessible at http://bliulab.net/MulStack. © 2024 Elsevier Ltd
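The weighted-sum step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weight of 0.5, the decision threshold of 0.5, and the example probabilities are all hypothetical and chosen only to show how two per-label probability vectors might be merged for a multilabel prediction.

```python
from typing import List, Sequence


def weighted_ensemble(rf_probs: Sequence[float],
                      dl_probs: Sequence[float],
                      rf_weight: float = 0.5) -> List[float]:
    """Combine per-label probabilities from two models by a weighted sum.

    rf_weight is a hypothetical mixing weight; the abstract does not
    state the value used in MulStack.
    """
    if len(rf_probs) != len(dl_probs):
        raise ValueError("both models must score the same set of labels")
    if not 0.0 <= rf_weight <= 1.0:
        raise ValueError("rf_weight must lie in [0, 1]")
    return [rf_weight * r + (1.0 - rf_weight) * d
            for r, d in zip(rf_probs, dl_probs)]


def predict_labels(probs: Sequence[float],
                   labels: Sequence[str],
                   threshold: float = 0.5) -> List[str]:
    """Multilabel decision: keep every label whose probability meets the
    threshold (0.5 here is an assumed value)."""
    return [label for label, p in zip(labels, probs) if p >= threshold]


# Three of the compartments named in the abstract; the scores are made up.
LABELS = ["nucleus", "cytosol", "exosome"]
rf = [0.75, 0.25, 0.5]   # hypothetical random forest outputs
dl = [0.75, 0.0, 1.0]    # hypothetical deep learning outputs

combined = weighted_ensemble(rf, dl, rf_weight=0.5)
print(combined)                          # [0.75, 0.125, 0.75]
print(predict_labels(combined, LABELS))  # ['nucleus', 'exosome']
```

Because each label is thresholded independently, an mRNA can be assigned to several compartments at once, which matches the multilabel setting the abstract describes.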