An unsupervised deep learning framework for predicting human essential genes from population and functional genomic data

Cited by: 4
Authors
Lapolice, Troy M. [1, 2, 3]
Huang, Yi-Fei [1, 3]
Affiliations
[1] Penn State Univ, Dept Biol, University Pk, PA 16802 USA
[2] Penn State Univ, Bioinformat & Genom Grad Program, University Pk, PA 16802 USA
[3] Penn State Univ, Huck Inst Life Sci, University Pk, PA 16802 USA
Keywords
Deep Learning; Unsupervised; Essential Genes; Loss of Function Intolerance; Population Genomics; Functional Genomics; Variants; Etiology
DOI
10.1186/s12859-023-05481-z
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The ability to accurately predict essential genes intolerant to loss-of-function (LOF) mutations can dramatically improve the identification of disease-associated genes. Numerous computational methods have recently been developed to predict human essential genes from population genomic data. While the existing methods are highly predictive of long essential genes, they have limited power to pinpoint short essential genes because polymorphisms are sparse in the human genome.

Results: Motivated by the premise that population and functional genomic data may provide complementary evidence for gene essentiality, we present an evolution-based deep learning model, DeepLOF, to predict essential genes in an unsupervised manner. Unlike previous population genetic methods, DeepLOF uses a novel deep learning framework to integrate both population and functional genomic data, allowing it to pinpoint short essential genes that can hardly be predicted from population genomic data alone. Compared with previous methods, DeepLOF shows unmatched performance in predicting ClinGen haploinsufficient genes, mouse essential genes, and essential genes in human cell lines. Notably, at a false positive rate of 5%, DeepLOF detects 50% more ClinGen haploinsufficient genes than previous methods. Furthermore, DeepLOF discovers 109 novel essential genes that are too short to be identified by previous methods.

Conclusion: The predictive power of DeepLOF shows that it is a compelling computational method to aid in the discovery of essential genes.
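The comparison "at a false positive rate of 5%" in the abstract refers to sensitivity measured at a fixed score threshold. The following is a minimal sketch of how such a figure could be computed for any gene-level score; it is not the authors' evaluation code. The function name `sensitivity_at_fpr` and the toy data are hypothetical, and it assumes that higher scores mean stronger predicted essentiality.

```python
import numpy as np

def sensitivity_at_fpr(scores, labels, max_fpr=0.05):
    """Fraction of true positives recovered while keeping the FPR <= max_fpr.

    scores: one score per gene (assumed: higher = more likely essential).
    labels: True for known positives (e.g. a curated gene set), False otherwise.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative gene")

    # Number of negatives allowed to score above the threshold.
    k = int(np.floor(max_fpr * neg.size))
    # Sort negatives from highest to lowest; calling genes strictly above the
    # k-th value lets at most k negatives through.
    neg_desc = np.sort(neg)[::-1]
    threshold = neg_desc[k] if k < neg.size else -np.inf
    return float(np.mean(pos > threshold))

# Toy usage with synthetic scores (not real data).
rng = np.random.default_rng(0)
toy_labels = np.array([True] * 50 + [False] * 950)
toy_scores = np.where(toy_labels, rng.normal(1.5, 1.0, 1000), rng.normal(0.0, 1.0, 1000))
print(sensitivity_at_fpr(toy_scores, toy_labels, max_fpr=0.05))
```

Applying the same function to two methods' scores on the same labelled genes gives the two sensitivities whose ratio underlies a statement such as "detects 50% more".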
Pages: 21