Application of experimentally verified transcription factor binding sites models for computational analysis of ChIP-Seq data

Cited by: 28
Authors
Levitsky, Victor G. [1 ,2 ]
Kulakovskiy, Ivan V. [3 ,4 ]
Ershov, Nikita I. [1 ]
Oshchepkov, Dmitry Yu [1 ]
Makeev, Vsevolod J. [3 ,4 ]
Hodgman, T. C. [5 ]
Merkulova, Tatyana I. [1 ,2 ]
Affiliations
[1] Russian Acad Sci, Inst Cytol & Genet, Siberian Div, Lavrentieva Prospect 10, Novosibirsk 630090, Russia
[2] Novosibirsk State Univ, Novosibirsk 630090, Russia
[3] Russian Acad Sci, Engelhardt Inst Mol Biol, Moscow 119991, Russia
[4] Russian Acad Sci, Vavilov Inst Gen Genet, Dept Computat Syst Biol, Moscow 119991, Russia
[5] Univ Nottingham, Sch Biosci, Multidisciplinary Ctr Integrat Biol, Sutton LE12 5RD, Surrey, England
Source
BMC GENOMICS | 2014 / Vol. 15
Funding
Russian Foundation for Basic Research;
Keywords
ChIP-Seq; EMSA; Transcription factor binding sites; FoxA; SiteGA; PWM; Transcription factor binding model; Dinucleotide frequencies; GLUCOCORTICOID-RECEPTOR; MOTIF DISCOVERY; PROTEIN; GENE; ELEMENTS; IDENTIFICATION; FAMILY; C/EBP; HNF3; TRRD;
DOI
10.1186/1471-2164-15-80
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: ChIP-Seq is widely used to detect genomic segments bound by transcription factors (TFs), either directly at DNA binding sites (BSs) or indirectly via other proteins. Many software tools now implement different approaches to identifying TF binding sites (TFBSs) within ChIP-Seq peaks. However, their use for interpreting ChIP-Seq data is usually complicated by the absence of direct experimental verification, which makes it difficult both to set a threshold that avoids too many false-positive BSs and to compare the actual performance of different models.
Results: Using ChIP-Seq data for FoxA2 binding loci in mouse adult liver and human HepG2 cells, we compared FoxA binding-site predictions for four computational models of two fundamental classes: pattern matching based on an existing training set of experimentally confirmed TFBSs (oPWM and SiteGA) and de novo motif discovery (ChIPMunk and diChIPMunk). To select prediction thresholds for the models, we experimentally evaluated the affinity of 64 predicted FoxA BSs using EMSA, which reliably distinguishes sequences able to bind the TF. As a result, we identified thousands of reliable FoxA BSs within ChIP-Seq loci from mouse liver and human HepG2 cells. Conventional position weight matrix (PWM) models performed worst, with the highest false-positive rate. In contrast, the best recognition efficiency was achieved by combining SiteGA with diChIPMunk/ChIPMunk models, which identified FoxA BSs in up to 90% of loci for both the mouse and human ChIP-Seq datasets.
Conclusions: Experimental study of TF binding to oligonucleotides corresponding to predicted sites increases the reliability of computational methods for TFBS recognition in ChIP-Seq data analysis. For ChIP-Seq data interpretation, basic PWMs have inferior TFBS recognition quality compared with the more sophisticated SiteGA and de novo motif discovery methods. Combining models built on different principles allowed proper TFBSs to be identified.
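The abstract turns on how a score threshold decides which sequence windows a PWM reports as binding sites. The following is a minimal sketch of that general idea only, not the authors' oPWM, SiteGA, or ChIPMunk implementation. The count matrix, pseudocount, uniform background, and threshold values are all invented for illustration and do not represent a FoxA model; the function names `log_odds_matrix` and `scan` are hypothetical.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per position).
# These counts are invented for illustration and are NOT a FoxA model.
COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 16, "C": 1, "G": 2, "T": 1},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 2, "G": 1, "T": 16},
]


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert counts to log2 odds against a uniform background (assumed)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append(
            {
                base: math.log2((column[base] + pseudocount) / total / background)
                for base in "ACGT"
            }
        )
    return matrix


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PWM = log_odds_matrix(COUNTS)

if __name__ == "__main__":
    # A lower threshold reports more candidate sites; a higher one reports fewer.
    for threshold in (4.0, 7.0):
        print(threshold, scan("GGTAATCC", PWM, threshold))
```

The sketch scans only the forward strand; a practical scan would also score the reverse complement. In the study summarised above, thresholds were chosen against EMSA measurements rather than picked arbitrarily as they are here.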
Pages: 12