An efficient method to transcription factor binding sites imputation via simultaneous completion of multiple matrices with positional consistency

Cited by: 17
Authors
Guo, Wei-Li [1]
Huang, De-Shuang [1]
Affiliations
[1] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
CHIP-SEQ; DNA-BINDING; ENCODE; DISCOVERY; NETWORKS; MOTIFS;
DOI
10.1039/c7mb00155j
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Transcription factors (TFs) are DNA-binding proteins that play a central role in regulating gene expression. Identifying the DNA-binding sites of TFs is a key task in understanding transcriptional regulation, cellular processes and disease. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) enables genome-wide identification of in vivo TF binding sites. However, it is still difficult to map every TF in every cell line owing to cost and the limited availability of biological material, which poses an enormous obstacle to integrated analysis of gene regulation. To address this problem, we propose a novel computational approach, TFBSImpute, for predicting additional TF binding profiles by leveraging information from available ChIP-seq TF binding data. TFBSImpute fuses the dataset into a 3-mode tensor and imputes missing TF binding signals via simultaneous completion of multiple TF binding matrices with positional consistency. By assessing the imputation methods against observed ChIP-seq TF binding profiles, we show that signals predicted by our method achieve overall similarity with experimental data and that TFBSImpute significantly outperforms baseline approaches. In addition, motif analysis shows that TFBSImpute performs better than the baselines at capturing binding motifs enriched in the observed data, indicating that the higher performance of TFBSImpute is not simply due to averaging related samples. We anticipate that our approach will constitute a useful complement to experimental mapping of TF binding, which is beneficial for further study of regulation mechanisms and disease.
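The abstract does not spell out the TFBSImpute algorithm, so the following is only a minimal sketch of the general idea of completing a 3-mode tensor, not the authors' method. It assumes the tensor is indexed as TF x cell line x genomic position, and it swaps in a plain low-rank CP factorization fitted by masked alternating least squares. Sharing one positional factor `C` across all TFs is an assumed stand-in for "positional consistency". The names `impute_tensor` and `_update_factor` and the parameters `rank` and `reg` are hypothetical.

```python
import numpy as np


def _update_factor(X, mask, F1, F2, reg):
    """Least-squares update of the mode-0 factor, given the other two factors."""
    n0, r = X.shape[0], F1.shape[1]
    # Row (j, k) of K is the elementwise product F1[j] * F2[k].
    K = (F1[:, None, :] * F2[None, :, :]).reshape(-1, r)
    out = np.zeros((n0, r))
    for i in range(n0):
        m = mask[i].reshape(-1)  # observed entries of slice i
        Ki = K[m]
        gram = Ki.T @ Ki + reg * np.eye(r)  # ridge term keeps the solve well posed
        out[i] = np.linalg.solve(gram, Ki.T @ X[i].reshape(-1)[m])
    return out


def impute_tensor(X, mask, rank=2, n_iter=50, reg=1e-3, seed=0):
    """Fill unobserved entries of X (TF x cell line x position; assumed layout).

    mask is a boolean array of the same shape, True where a signal was measured.
    """
    rng = np.random.default_rng(seed)
    n_tf, n_cell, n_pos = X.shape
    A = rng.uniform(0.1, 1.0, (n_tf, rank))    # TF factor
    B = rng.uniform(0.1, 1.0, (n_cell, rank))  # cell-line factor
    C = rng.uniform(0.1, 1.0, (n_pos, rank))   # positional factor shared by all TFs
    X = np.where(mask, X, 0.0)  # ignore whatever is stored at missing entries
    for _ in range(n_iter):
        A = _update_factor(X, mask, B, C, reg)
        B = _update_factor(X.transpose(1, 0, 2), mask.transpose(1, 0, 2), A, C, reg)
        C = _update_factor(X.transpose(2, 0, 1), mask.transpose(2, 0, 1), A, B, reg)
    recon = np.einsum("tr,cr,pr->tcp", A, B, C)
    # Keep measured signals as they are; use the model only where data is missing.
    return np.where(mask, X, recon)


if __name__ == "__main__":
    # Synthetic demo: hide one (TF, cell line) profile and impute it.
    rng = np.random.default_rng(1)
    truth = np.einsum("t,c,p->tcp", rng.uniform(0.5, 1.5, 4),
                      rng.uniform(0.5, 1.5, 5), rng.uniform(0.5, 1.5, 30))
    mask = np.ones(truth.shape, dtype=bool)
    mask[1, 2, :] = False
    filled = impute_tensor(np.where(mask, truth, np.nan), mask, rank=1)
    err = np.linalg.norm(filled[~mask] - truth[~mask]) / np.linalg.norm(truth[~mask])
    print("relative error on the hidden profile:", err)
```

A tensor model is used here rather than independent per-TF matrix factorizations because a missing experiment removes an entire TF/cell-line profile. Its TF factor can still be learned from other cell lines and its cell-line factor from other TFs, which a per-matrix model cannot do for a fully unobserved row.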
Pages: 1827-1837
Number of pages: 11
Related papers
50 in total
  • [1] Positional distribution of human transcription factor binding sites
    Koudritsky, Mark
    Domany, Eytan
    NUCLEIC ACIDS RESEARCH, 2008, 36 (21) : 6795 - 6805
  • [2] Positional distribution of transcription factor binding sites in Arabidopsis thaliana
    Yu, Chun-Ping
    Lin, Jinn-Jy
    Li, Wen-Hsiung
    SCIENTIFIC REPORTS, 2016, 6
  • [3] Positional distribution of transcription factor binding sites in Arabidopsis thaliana
    Chun-Ping Yu
    Jinn-Jy Lin
    Wen-Hsiung Li
    Scientific Reports, 6
  • [4] An efficient method for statistical significance calculation of transcription factor binding sites
    Qian, Ziliang
    Lu, Lingyi
    Qi, Liu
    Li, Yixue
    BIOINFORMATION, 2007, 2 (05) : 169 - 174
  • [5] Transcription factor binding sites recognition by the regularities matrices based on the natural classification method
    Vityaev, E. E.
    Lapardin, K. A.
    Khomicheva, I. V.
    Levitsky, V. G.
    PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON BIOINFORMATICS OF GENOME REGULATION AND STRUCTURE, VOL 1, 2006, : 199 - +
  • [6] Combining frequency and positional information to predict transcription factor binding sites
    Kielbasa, SM
    Korbel, JO
    Beule, D
    Schuchhardt, J
    Herzel, H
    BIOINFORMATICS, 2001, 17 (11) : 1019 - 1026
  • [7] Similarity of position frequency matrices for transcription factor binding sites
    Schones, DE
    Sumazin, P
    Zhang, MQ
    BIOINFORMATICS, 2005, 21 (03) : 307 - 313
  • [8] Effect of positional dependence and alignment strategy on modeling transcription factor binding sites
    Quader S.
    Huang C.-H.
    BMC Research Notes, 5 (1)
  • [9] Mutex: a method for simultaneous footprinting and determination of base pair specificity for transcription factor binding sites
    Denkinger, DJ
    Kawahara, RS
    ANALYTICAL BIOCHEMISTRY, 2003, 321 (01) : 142 - 145
  • [10] Recognition of transcription factor binding sites by the SiteGA method
    Levitskii V.G.
    Ignat'eva E.V.
    Anan'ko E.A.
    Merkulova T.I.
    Kolchanov N.A.
    Hodgman C.
    Biophysics, 2006, 51 (4) : 565 - 570