High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method

Cited by: 19
Authors
Zhang, Yongqing [1]
Wang, Zixuan [2]
Zeng, Yuanqi [2]
Zhou, Jiliu [1]
Zou, Quan [3]
Affiliations
[1] Chengdu Univ Informat Technol, Sch Comp Sci, Chengdu 610225, Peoples R China
[2] Chengdu Univ Informat Technol, Comp Sci, Chengdu, Peoples R China
[3] Univ Elect Sci & Technol China, Chengdu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
transcription factor binding sites; Attention Gate; interpretability; motif discovery; DNA;
DOI
10.1093/bib/bbab273
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Transcription factors (TFs) are proteins essential to regulating the spatiotemporal expression of genes. Inferring potential transcription factor binding sites (TFBSs) at high resolution is crucial for advancing biology and realizing precision medicine. Recently, deep learning-based models have shown exemplary performance in predicting TFBSs at the base-pair level. However, previous models fail to integrate nucleotide position information with semantic information without introducing noisy responses, so there is still room for improvement. Moreover, both the inner mechanisms and the prediction results of these models are difficult to interpret. To this end, the Deep Attentive Encoder-Decoder Neural Network (D-AEDNet) is developed to identify the locations of TF-DNA binding sites in DNA sequences. In particular, the model adopts a Skip Architecture to leverage the nucleotide position information in the encoder and uses an Attention Gate to remove noisy responses during information fusion. In addition, Transcription Factor Motif Discovery based on Sliding Window (TF-MoDSW), an approach that discovers TF-DNA binding motifs from the output of neural networks, is proposed to help interpret the biological meaning of the predicted results. On ChIP-exo datasets, experimental results show that D-AEDNet performs better than competing methods. Visualization analysis further shows that the Attention Gate improves the interpretability of the model. Finally, experiments on ChIP-seq datasets confirm that D-AEDNet learns TF-DNA binding motifs better than state-of-the-art methods and that TF-MoDSW can discover biological sequence motifs in TF-DNA interactions.
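The abstract describes motif discovery as a sliding-window pass over the per-base output of a neural network. The sketch below illustrates that general idea only; it is not the authors' TF-MoDSW procedure, whose details are not given here. The function name `best_window`, the example sequence, the per-base scores, and the window width of 6 are all hypothetical.

```python
from typing import List, Tuple


def best_window(scores: List[float], width: int) -> Tuple[int, float]:
    """Return (start, mean score) of the highest-scoring window of `width` bases."""
    if width <= 0 or width > len(scores):
        raise ValueError("width must be between 1 and len(scores)")
    window_sum = sum(scores[:width])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(scores) - width + 1):
        # Slide by one base: drop the leftmost score, add the new rightmost one.
        window_sum += scores[start + width - 1] - scores[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / width


# Hypothetical DNA sequence and per-base binding probabilities (one per nucleotide),
# standing in for the base-pair-level output of a trained model.
sequence = "ACGTTGACGTCAAC"
per_base = [0.1, 0.1, 0.2, 0.1, 0.8, 0.9, 0.9, 0.7, 0.8, 0.9, 0.2, 0.1, 0.1, 0.1]

start, mean_score = best_window(per_base, width=6)
candidate_site = sequence[start:start + 6]
print(start, candidate_site, round(mean_score, 3))
```

In practice, candidate sites collected this way from many sequences would be aligned and summarized as a position frequency matrix before comparison with known motifs; that step is omitted here.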
Pages: 12