An unsupervised deep learning framework for predicting human essential genes from population and functional genomic data

Cited by: 4
Authors
Lapolice, Troy M. [1, 2, 3]
Huang, Yi-Fei [1, 3]
Affiliations
[1] Penn State Univ, Dept Biol, University Pk, PA 16802 USA
[2] Penn State Univ, Bioinformat & Genom Grad Program, University Pk, PA 16802 USA
[3] Penn State Univ, Huck Inst Life Sci, University Pk, PA 16802 USA
Keywords
Deep Learning; Unsupervised; Essential Genes; Loss of Function Intolerance; Population Genomics; Functional Genomics; Variants; Etiology
DOI
10.1186/s12859-023-05481-z
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The ability to accurately predict essential genes intolerant to loss-of-function (LOF) mutations can dramatically improve the identification of disease-associated genes. Recently, there have been numerous computational methods developed to predict human essential genes from population genomic data. While the existing methods are highly predictive of essential genes of long length, they have limited power in pinpointing short essential genes due to the sparsity of polymorphisms in the human genome.

Results: Motivated by the premise that population and functional genomic data may provide complementary evidence for gene essentiality, here we present an evolution-based deep learning model, DeepLOF, to predict essential genes in an unsupervised manner. Unlike previous population genetic methods, DeepLOF utilizes a novel deep learning framework to integrate both population and functional genomic data, allowing us to pinpoint short essential genes that can hardly be predicted from population genomic data alone. Compared with previous methods, DeepLOF shows unmatched performance in predicting ClinGen haploinsufficient genes, mouse essential genes, and essential genes in human cell lines. Notably, at a false positive rate of 5%, DeepLOF detects 50% more ClinGen haploinsufficient genes than previous methods. Furthermore, DeepLOF discovers 109 novel essential genes that are too short to be identified by previous methods.

Conclusion: The predictive power of DeepLOF shows that it is a compelling computational method to aid in the discovery of essential genes.
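
The comparison "at a false positive rate of 5%" can be made concrete with a small sketch. The code below is not taken from the paper. It assumes that each method assigns every gene a score (higher meaning more likely essential) and that a labeled set of positive genes (for example, ClinGen haploinsufficient genes) and control genes is available. The function name `tpr_at_fpr` and the toy scores are hypothetical.

```python
import numpy as np


def tpr_at_fpr(scores, labels, max_fpr=0.05):
    """Return the true positive rate at the most permissive score threshold
    whose false positive rate does not exceed max_fpr.

    Assumption: a higher score means the gene is more likely essential.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = np.sort(scores[~labels])[::-1]  # control genes, highest score first
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative gene")
    # Number of control genes that may be called essential within max_fpr.
    k = int(np.floor(max_fpr * neg.size + 1e-9))
    if k >= neg.size:
        return 1.0  # every gene is called, so every positive is detected
    # Calling genes with score > neg[k] yields at most k false positives.
    threshold = neg[k]
    return float(np.mean(pos > threshold))


# Toy example (hypothetical scores): 4 positive genes, then 10 control genes.
labels = [1, 1, 1, 1] + [0] * 10
method_a = [9.5, 8.5, 3.0, 10.0] + list(range(10))
method_b = [9.5, 4.0, 3.0, 2.0] + list(range(10))

tpr_a = tpr_at_fpr(method_a, labels, max_fpr=0.1)
tpr_b = tpr_at_fpr(method_b, labels, max_fpr=0.1)
# Relative gain in detected positives of method A over method B.
print(f"TPR A={tpr_a:.2f}, TPR B={tpr_b:.2f}, gain={tpr_a / tpr_b - 1:.0%}")
```

Under this reading, a statement such as "50% more genes detected" corresponds to a relative gain of 0.5 between the two true positive rates at the same false positive rate. How the paper handles ties and which control set it uses are not stated in the abstract.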
Pages: 21