Combining Sequence and Epigenomic Data to Predict Transcription Factor Binding Sites Using Deep Learning

Cited by: 1
Authors
Jing, Fang [1]
Zhang, Shao-Wu [1]
Cao, Zhen [2]
Zhang, Shihua [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Coll Automat, Minist Educ, Key Lab Informat Fusion Technol, Xian 710072, Shaanxi, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, NCMIS, CEMS,RCSDS, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; Machine learning; Transcription factor binding sites; Convolutional neural networks; DNA accessibility; Histone modification; Chromatin accessibility prediction; Networks
DOI
10.1007/978-3-319-94968-0_23
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Knowing the locations of transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and the downstream cellular functions. Convolutional neural networks (CNNs) have outperformed conventional methods in predicting TFBSs from the primary DNA sequence. Beyond DNA sequence, histone modifications and chromatin accessibility are also important factors influencing the activity of transcription factors, and both have recently been used to predict TFBSs. However, current methods rarely combine histone modifications and chromatin accessibility with sequence in an integrative CNN framework. To this end, we developed a general CNN model that integrates these data to predict TFBSs. We systematically benchmarked a series of architecture variants that differ in network width and depth, and explored the effect of the length of the flanking regions included in each sample. We evaluated the three data types and their combinations on 256 ChIP-seq experiments and compared the model with competing machine learning methods. We find that the contributions of the three data types are complementary. Moreover, the integrative CNN framework significantly outperforms traditional machine learning methods.
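As a rough illustration of what integrating the three data types could look like at the input stage, the sketch below stacks a one-hot encoded DNA window with per-position chromatin accessibility and histone modification signals. This is not the authors' code: the function name `encode_window`, the channel order, and the toy values are all assumptions made for illustration, and the abstract does not specify how the model actually combines its inputs.

```python
# Hypothetical sketch: the channel layout and all names are assumptions,
# not taken from the paper.
ONE_HOT = {
    "A": [1.0, 0.0, 0.0, 0.0],
    "C": [0.0, 1.0, 0.0, 0.0],
    "G": [0.0, 0.0, 1.0, 0.0],
    "T": [0.0, 0.0, 0.0, 1.0],
}


def encode_window(sequence, accessibility, histone_marks):
    """Build one feature row per base: 4 one-hot values, 1 accessibility
    value, then one value per histone modification track."""
    length = len(sequence)
    if len(accessibility) != length:
        raise ValueError("accessibility track must match the sequence length")
    if any(len(track) != length for track in histone_marks):
        raise ValueError("histone tracks must match the sequence length")

    rows = []
    for i, base in enumerate(sequence.upper()):
        # Unknown bases such as N are encoded as all zeros.
        row = list(ONE_HOT.get(base, [0.0, 0.0, 0.0, 0.0]))
        row.append(float(accessibility[i]))
        row.extend(float(track[i]) for track in histone_marks)
        rows.append(row)
    return rows


# Toy window of 5 bases with one accessibility track and one histone track.
features = encode_window(
    "ACGTN",
    [0.1, 0.5, 0.9, 0.4, 0.0],
    [[1, 0, 2, 3, 0]],
)
print(len(features), len(features[0]))  # 5 6
```

A matrix of this shape (positions by channels) is the kind of input a one-dimensional convolution can scan, which is one plausible way to let a single network see sequence and epigenomic signals together.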
Pages: 241-252
Page count: 12