Dinucleotide Weight Matrices for Predicting Transcription Factor Binding Sites: Generalizing the Position Weight Matrix

Cited by: 63
Author
Siddharthan, Rahul [1]
Affiliation
[1] Inst Math Sci, Madras 600113, Tamil Nadu, India
Source
PLOS ONE | 2010 / Volume 5 / Issue 03
Keywords
NONADDITIVITY; SEQUENCES; DATABASE; CODE;
DOI
10.1371/journal.pone.0009722
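Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code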
07 ; 0710 ; 09 ;
Abstract
Background: Identifying transcription factor binding sites (TFBS) in silico is key in understanding gene regulation. TFBS are string patterns that exhibit some variability, commonly modelled as "position weight matrices" (PWMs). Though convenient, the PWM has significant limitations, in particular the assumed independence of positions within the binding motif; and predictions based on PWMs are usually not very specific to known functional sites. Analysis here on binding sites in yeast suggests that correlation of dinucleotides is not limited to near-neighbours, but can extend over considerable gaps.

Methodology/Principal Findings: I describe a straightforward generalization of the PWM model, that considers frequencies of dinucleotides instead of individual nucleotides. Unlike previous efforts, this method considers all dinucleotides within an extended binding region, and does not make an attempt to determine a priori the significance of particular dinucleotide correlations. I describe how to use a "dinucleotide weight matrix" (DWM) to predict binding sites, dealing in particular with the complication that its entries are not independent probabilities. Benchmarks show, for many factors, a dramatic improvement over PWMs in precision of predicting known targets. In most cases, significant further improvement arises by extending the commonly defined "core motifs" by about 10 bp on either side. Though this flanking sequence shows no strong motif at the nucleotide level, the predictive power of the dinucleotide model suggests that the "signature" in DNA sequence of protein-binding affinity extends beyond the core protein-DNA contact region.

Conclusion/Significance: While computationally more demanding and slower than PWM-based approaches, this dinucleotide method is straightforward, both conceptually and in implementation, and can serve as a basis for future improvements.
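
Illustrative sketch (not from the paper): the abstract contrasts per-position nucleotide frequencies (PWM) with dinucleotide frequencies taken over all position pairs (DWM). The Python below is one minimal way to picture that difference. All function names, the pseudocount, the uniform background, and especially the averaged pairwise score are assumptions made for illustration; the abstract does not specify how the paper handles the non-independence of DWM entries, and this sketch does not attempt to reproduce it.

```python
import math
from collections import Counter
from itertools import combinations

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Per-position nucleotide probabilities from equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm

def build_dwm(sites, pseudocount=1.0):
    """Dinucleotide probabilities for every position pair i < j (gaps included)."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET) ** 2
    dwm = {}
    for i, j in combinations(range(length), 2):
        counts = Counter((site[i], site[j]) for site in sites)
        dwm[(i, j)] = {
            (a, b): (counts[(a, b)] + pseudocount) / total
            for a in ALPHABET
            for b in ALPHABET
        }
    return dwm

def pwm_log_odds(pwm, seq, background=0.25):
    """Standard PWM log-odds score against a uniform background."""
    return sum(math.log(pwm[i][base] / background) for i, base in enumerate(seq))

def naive_dwm_log_odds(dwm, seq, background=0.25):
    """Assumed, simplified score: sum pairwise log-odds and divide by (L - 1).

    Each position takes part in L - 1 pairs, so the division makes this
    reduce to the PWM score when positions are truly independent. This is
    a stand-in, not the scoring scheme described in the paper.
    """
    length = len(seq)
    total = sum(
        math.log(dwm[(i, j)][(seq[i], seq[j])] / background ** 2)
        for i, j in combinations(range(length), 2)
    )
    return total / (length - 1)

# Toy aligned sites (made up): positions 1 and 2 are correlated (CG or GC only).
toy_sites = ["ACGT", "ACGT", "AGCT", "AGCT"]
toy_pwm = build_pwm(toy_sites)
toy_dwm = build_dwm(toy_sites)
for candidate in ("ACGT", "ACCT"):
    print(candidate,
          round(pwm_log_odds(toy_pwm, candidate), 3),
          round(naive_dwm_log_odds(toy_dwm, candidate), 3))
```

In this toy data the PWM cannot distinguish "ACGT" from "ACCT", because C and G are equally frequent at both middle positions, while the pairwise table records that "CC" never occurs there and scores it lower.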
Pages: 10
Related papers
50 in total
  • [31] Parallel Position Weight Matrices Algorithms
    Giraud, Mathieu
    Varre, Jean-Stephane
    EIGHTH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING, PROCEEDINGS, 2009, : 65 - 72
  • [32] Position specific variation in the rate of evolution in transcription factor binding sites
    Alan M Moses
    Derek Y Chiang
    Manolis Kellis
    Eric S Lander
    Michael B Eisen
    BMC Evolutionary Biology, 3
  • [33] Position specific variation in the rate of evolution in transcription factor binding sites
    Moses, AM
    Chiang, DY
    Kellis, M
    Lander, ES
    Eisen, MB
    BMC EVOLUTIONARY BIOLOGY, 2003, 3 (1)
  • [34] Structure-based prediction of transcription factor specificity: Comparison to position weight matrix and in vitro prediction methods
    Torella, Rubben F.
    Bray, Sarah J.
    Adryan, Boris
    Glen, Robert C.
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2012, 243
  • [35] MLSNet: a deep learning model for predicting transcription factor binding sites
    Zhang, Yuchuan
    Wang, Zhikang
    Ge, Fang
    Wang, Xiaoyu
    Zhang, Yiwen
    Li, Shanshan
    Guo, Yuming
    Song, Jiangning
    Yu, Dong-Jun
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (06)
  • [36] Toward an atomistic model for predicting transcription-factor binding sites
    Endres, RG
    Schulthess, TC
    Wingreen, NS
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2004, 57 (02) : 262 - 268
  • [37] Evolving Spiking Neural Networks for Predicting Transcription Factor Binding Sites
    Sichtig, Heike
    Schaffer, J. David
    Riva, Alberto
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [38] Assessing phylogenetic motif models for predicting transcription factor binding sites
    Hawkins, John
    Grant, Charles
    Noble, William Stafford
    Bailey, Timothy L.
    BIOINFORMATICS, 2009, 25 (12) : I339 - I347
  • [39] Large scale matching for position weight matrices
    Liefooghe, Aude
    Touzet, Helene
    Varre, Jean-Stephane
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2006, 4009 : 401 - 412
  • [40] TFinder: A Python Web Tool for Predicting Transcription Factor Binding Sites
    Minniti, Julien
    Checler, Frederic
    Duplan, Eric
    da Costa, Cristine Alves
    JOURNAL OF MOLECULAR BIOLOGY, 2025, 437 (03)