MulStack: An ensemble learning prediction model of multilabel mRNA subcellular localization

Cited by: 2
Authors
Liu Z. [1]
Bai T. [2, 3, 4]
Liu B. [3, 4]
Yu L. [1]
Affiliations
[1] School of Computer Science and Technology, Xidian University, Xi'an
[2] School of Mathematics & Computer Science, Yan'an University, Shaanxi
[3] School of Computer Science and Technology, Beijing Institute of Technology, Beijing
[4] Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Ensemble learning predictor; mRNA features at two levels; Multilabel mRNA subcellular localization; Position encoding;
DOI
10.1016/j.compbiomed.2024.108289
Abstract
Subcellular localization of mRNA is related to protein synthesis, cell polarity, cell movement and other biological regulation mechanisms. The distribution of mRNAs across subcellular compartments is similar to that of proteins, and most mRNAs are found in multiple compartments. Recently, several computational methods have been designed to predict the subcellular localization of mRNA. However, these methods employ only a single level of mRNA features and do not use the position encoding of nucleotides in mRNA. In this paper, an ensemble learning prediction model named MulStack is proposed, which combines random forest and deep learning for multilabel mRNA subcellular localization. The proposed method employs two levels of mRNA features, sequence-level and residue-level, and applies position encoding for the first time in the field of mRNA subcellular localization. Random forest is used to learn mRNA sequence-level features, while deep learning is used to learn mRNA sequence-level features together with residue-level features combined with position encoding. The outputs of the random forest and the deep learning model are then combined by a weighted sum to give the prediction probability. Compared with existing methods, MulStack performs best for localization to the nucleus, cytosol and exosome. In addition, position weight matrices (PWMs) extracted by convolutional neural networks (CNNs) can be matched with known RNA-binding protein motifs. Gene ontology (GO) enrichment analysis shows the biological processes, molecular functions and cellular components associated with the mRNA genes. The MulStack prediction web server is freely accessible at http://bliulab.net/MulStack. © 2024 Elsevier Ltd
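The weighted-sum step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `weighted_ensemble` and `predict_labels`, the weight `w_rf`, the 0.5 decision threshold and the example probabilities are all assumptions made for illustration, since the abstract does not state how the weights or thresholds are chosen.

```python
import numpy as np

def weighted_ensemble(p_rf, p_dl, w_rf=0.5):
    """Combine per-label probabilities from two models by a weighted sum.

    p_rf, p_dl: arrays of shape (n_samples, n_labels) holding the
    probabilities from a random forest and a deep learning model.
    w_rf: weight given to the random forest (hypothetical value; the
    abstract does not give the actual weights).
    """
    p_rf = np.asarray(p_rf, dtype=float)
    p_dl = np.asarray(p_dl, dtype=float)
    if p_rf.shape != p_dl.shape:
        raise ValueError("both models must score the same samples and labels")
    if not 0.0 <= w_rf <= 1.0:
        raise ValueError("w_rf must lie in [0, 1]")
    # A convex combination keeps every output inside [0, 1].
    return w_rf * p_rf + (1.0 - w_rf) * p_dl

def predict_labels(probs, threshold=0.5):
    """Multilabel decision: each label is thresholded independently."""
    return (np.asarray(probs) >= threshold).astype(int)

# Illustrative values only: one mRNA scored for three compartments
# (for example nucleus, cytosol, exosome).
p_rf = [[0.8, 0.2, 0.6]]
p_dl = [[0.6, 0.4, 0.2]]
combined = weighted_ensemble(p_rf, p_dl, w_rf=0.5)
labels = predict_labels(combined)
```

Because each label is thresholded on its own, a single mRNA can be assigned to several compartments at once, which is what makes the prediction multilabel rather than multiclass.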