Combining Sequence and Epigenomic Data to Predict Transcription Factor Binding Sites Using Deep Learning

Cited by: 1
Authors
Jing, Fang [1]
Zhang, Shao-Wu [1]
Cao, Zhen [2]
Zhang, Shihua [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Coll Automat, Minist Educ, Key Lab Informat Fusion Technol, Xian 710072, Shaanxi, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, NCMIS, CEMS,RCSDS, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; Machine learning; Transcription factors binding sites; Convolutional neural networks; DNA accessibility; Histone modification; CHROMATIN ACCESSIBILITY PREDICTION; NETWORKS;
DOI
10.1007/978-3-319-94968-0_23
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Knowing the transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and follow-up cellular functions. Convolutional neural networks (CNNs) have outperformed other methods in predicting TFBSs from the primary DNA sequence. In addition to DNA sequence, histone modifications and chromatin accessibility are also important factors influencing binding activity, and they have recently been explored for predicting TFBSs. However, current methods rarely combine histone modifications and chromatin accessibility with sequence in an integrative CNN framework. To this end, we developed a general CNN model that integrates these data for predicting TFBSs. We systematically benchmarked a series of architecture variants by changing the network width and depth, and explored the effect of sample length at flanking regions. We evaluated the performance of the three types of data and their combinations using 256 ChIP-seq experiments, and also compared the model with competing machine learning methods. We find that the contributions of these three types of data are complementary. Moreover, the integrative CNN framework significantly outperforms traditional machine learning methods.
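As a rough illustration of what "integrating" the three data types can mean at the input level, the sketch below stacks a one-hot encoded DNA sequence with per-base chromatin accessibility and histone modification signals as extra channels, then applies a single convolutional filter with ReLU and global max pooling. This is not the authors' architecture: the function names, the histone mark name `H3K4me3`, the signal values, and the hand-set kernel are all hypothetical, and a real model would learn many filters from data.

```python
from typing import Dict, List, Sequence

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Return 4 rows (A, C, G, T), each len(seq) long; unknown bases stay all-zero."""
    return [[1.0 if base == b else 0.0 for base in seq.upper()] for b in BASES]


def build_input(
    seq: str,
    accessibility: Sequence[float],
    histone_marks: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Stack sequence, accessibility, and histone tracks into a channels x length matrix."""
    tracks = [list(accessibility)]
    # Sort mark names so the channel order is deterministic.
    tracks += [list(histone_marks[name]) for name in sorted(histone_marks)]
    for track in tracks:
        if len(track) != len(seq):
            raise ValueError("every signal track must have one value per base")
    return one_hot(seq) + tracks


def conv_relu_maxpool(x: List[List[float]], kernel: List[List[float]]) -> float:
    """Slide one filter along the sequence, apply ReLU, and keep the maximum activation."""
    width = len(kernel[0])
    best = 0.0  # starting at 0 is equivalent to ReLU followed by global max pooling
    for start in range(len(x[0]) - width + 1):
        activation = sum(
            kernel[c][k] * x[c][start + k]
            for c in range(len(x))
            for k in range(width)
        )
        best = max(best, activation)
    return best


# Hypothetical 6 bp window with made-up signal values.
seq = "ACGTAC"
accessibility = [0.1, 0.9, 0.8, 0.2, 0.0, 0.0]
histone_marks = {"H3K4me3": [0.0, 1.0, 1.0, 0.0, 0.0, 0.0]}
x = build_input(seq, accessibility, histone_marks)  # 6 channels x 6 positions

# Hand-set filter (assumption): rewards the dinucleotide "CG" in accessible chromatin.
kernel = [
    [0.0, 0.0],  # A
    [1.0, 0.0],  # C at the first position
    [0.0, 1.0],  # G at the second position
    [0.0, 0.0],  # T
    [0.5, 0.5],  # accessibility
    [0.0, 0.0],  # H3K4me3 (ignored by this filter)
]
score = conv_relu_maxpool(x, kernel)
```

Treating the epigenomic signals as additional input channels is only one possible design; separate sub-networks per data type that are merged later would be another.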
Pages: 241-252
Number of pages: 12
Related papers
50 in total
  • [1] An Integrative Framework for Combining Sequence and Epigenomic Data to Predict Transcription Factor Binding Sites Using Deep Learning
    Jing, Fang
    Zhang, Shao-Wu
    Cao, Zhen
    Zhang, Shihua
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2021, 18 (01) : 355 - 364
  • [2] Using Deep Learning to Predict Transcription Factor Binding Sites Based on Multiple-omics Data
    Xu, Youhong
    Yuan, Changan
    Wu, Hongjie
    Zhao, Xingming
    INTELLIGENT COMPUTING THEORIES AND APPLICATION (ICIC 2022), PT I, 2022, 13393 : 799 - 810
  • [3] Integrating genome sequence and structural data for statistical learning to predict transcription factor binding sites
    Long, Pengpeng
    Zhang, Lu
    Huang, Bin
    Chen, Quan
    Liu, Haiyan
    NUCLEIC ACIDS RESEARCH, 2020, 48 (22) : 12604 - 12617
  • [4] Combining frequency and positional information to predict transcription factor binding sites
    Kielbasa, SM
    Korbel, JO
    Beule, D
    Schuchhardt, J
    Herzel, H
    BIOINFORMATICS, 2001, 17 (11) : 1019 - 1026
  • [5] Deep learning for inferring transcription factor binding sites
    Koo, Peter K.
    Ploenzke, Matt
    CURRENT OPINION IN SYSTEMS BIOLOGY, 2020, 19 : 16 - 23
  • [6] Predicting Transcription Factor Binding Sites with Deep Learning
    Ghosh, Nimisha
    Santoni, Daniele
    Saha, Indrajit
    Felici, Giovanni
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (09)
  • [7] DeepSTF: predicting transcription factor binding sites by interpretable deep neural networks combining sequence and shape
    Ding, Pengju
    Wang, Yifei
    Zhang, Xinyu
    Gao, Xin
    Liu, Guozhu
    Yu, Bin
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (04)
  • [8] Prediction of Transcription Factor Binding Sites Using a Combined Deep Learning Approach
    Cao, Linan
    Liu, Pei
    Chen, Jialong
    Deng, Lei
    FRONTIERS IN ONCOLOGY, 2022, 12
  • [9] Using Sequence-Specific Chemical and Structural Properties of DNA to Predict Transcription Factor Binding Sites
    Bauer, Amy L.
    Hlavacek, William S.
    Unkefer, Pat J.
    Mu, Fangping
    PLOS COMPUTATIONAL BIOLOGY, 2010, 6 (11)
  • [10] MLSNet: a deep learning model for predicting transcription factor binding sites
    Zhang, Yuchuan
    Wang, Zhikang
    Ge, Fang
    Wang, Xiaoyu
    Zhang, Yiwen
    Li, Shanshan
    Guo, Yuming
    Song, Jiangning
    Yu, Dong-Jun
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (06)