A penalized Bayesian approach to predicting sparse protein-DNA binding landscapes

Cited by: 4
Authors
Levinson, Matthew [1]
Zhou, Qing [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
Funding
U.S. National Science Foundation;
Keywords
TRANSCRIPTION-FACTOR-BINDING; NETWORK; PLURIPOTENCY; GENE; DISCOVERY; PATTERNS; MODELS;
DOI
10.1093/bioinformatics/btt585
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Cellular processes are controlled, directly or indirectly, by the binding of hundreds of different DNA binding factors (DBFs) to the genome. One key to a deeper understanding of the cell is discovering where, when and how strongly these DBFs bind to the DNA sequence. Direct measurement of DBF binding sites (BSs; e.g. through ChIP-Chip or ChIP-Seq experiments) is expensive, noisy and not available for every DBF in every cell type. Naive motif scanning and most existing computational approaches to detecting which DBFs bind in a set of genomic regions of interest often perform poorly, owing to high false discovery rates and restrictive requirements for prior knowledge.

Results: We develop SparScape, a penalized Bayesian method for identifying DBFs active in the considered regions and predicting a joint probabilistic binding landscape. Using a sparsity-inducing penalization, SparScape is able to select, from a much larger candidate set, a small subset of DBFs with enriched BSs in a set of DNA sequences. This substantially reduces the false positives in the prediction of BSs. Analyses of ChIP-Seq data in mouse embryonic stem cells and of simulated data show that SparScape dramatically outperforms the naive motif scanning method and comparable computational approaches in terms of DBF identification and BS prediction.
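
For orientation, the kind of naive motif scanning that the abstract uses as a baseline can be sketched as follows. This is not SparScape and not the authors' code: the position weight matrix, the uniform background frequency, the threshold and the function names are all hypothetical values chosen for illustration. The sketch scores each window of a sequence by its log-odds under the motif versus the background and reports windows above a threshold.

```python
import math

# Hypothetical 4-position motif: per-position base probabilities
# (illustrative values only, not taken from the paper).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds(window, pwm=PWM, background=BACKGROUND):
    """Log-odds score of one window under the motif versus the background."""
    return sum(math.log(col[base] / background) for col, base in zip(pwm, window))


def scan(sequence, threshold, pwm=PWM):
    """Return (start, score) for every window whose score meets the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = log_odds(sequence[start:start + width], pwm)
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    # Each DBF is scanned independently; with hundreds of candidate motifs,
    # chance matches accumulate, which is the false-positive problem
    # the abstract describes.
    for start, score in scan("TTACGTTT", threshold=3.0):
        print(start, round(score, 3))
```

Because every candidate motif is thresholded on its own, nothing in this baseline discourages weakly matching DBFs from being reported. According to the abstract, that is the gap the sparsity-inducing penalty is meant to close, by selecting a small subset of DBFs from the larger candidate set.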
Pages: 636 - 643
Number of pages: 8
Related papers
50 in total
  • [31] PiDNA: predicting protein-DNA interactions with structural models
    Lin, Chih-Kang
    Chen, Chien-Yu
    NUCLEIC ACIDS RESEARCH, 2013, 41 (W1) : W523 - W530
  • [32] Predicting Binding Free Energies for DPS Protein-DNA Complexes and Crystals Using Molecular Dynamics
    Tereshkin E.V.
    Tereshkina K.B.
    Krupyanskii Y.F.
    Supercomputing Frontiers and Innovations, 2022, 9 (02) : 33 - 45
  • [33] Structural predictions of protein-DNA binding: MELD-DNA
    Esmaeeli, Reza
    Bauza, Antonio
    Perez, Alberto
    NUCLEIC ACIDS RESEARCH, 2023, 51 (04) : 1625 - 1636
  • [34] Protein-DNA binding: complexities and multi-protein codes
    Siggers, Trevor
    Gordan, Raluca
    NUCLEIC ACIDS RESEARCH, 2014, 42 (04) : 2099 - 2111
  • [35] Parametric modeling of protein-DNA binding kinetics: A discrete event based simulation approach
    Ghosh, Preetam
    Ghosh, Samik
    Basu, Kalyan
    Das, Sajal
    DISCRETE APPLIED MATHEMATICS, 2009, 157 (10) : 2395 - 2415
  • [36] Computationally identifying hot spots in protein-DNA binding interfaces using an ensemble approach
    Yuliang Pan
    Shuigeng Zhou
    Jihong Guan
    BMC Bioinformatics, 21
  • [37] Computationally identifying hot spots in protein-DNA binding interfaces using an ensemble approach
    Pan, Yuliang
    Zhou, Shuigeng
    Guan, Jihong
    BMC BIOINFORMATICS, 2020, 21 (Suppl 13)
  • [38] A feature-based approach to predict hot spots in protein-DNA binding interfaces
    Zhang, Sijia
    Zhao, Le
    Zheng, Chun-Hou
    Xia, Junfeng
    BRIEFINGS IN BIOINFORMATICS, 2020, 21 (03) : 1038 - 1046
  • [39] A PERMUTATIONAL APPROACH TOWARD PROTEIN-DNA RECOGNITION
    HUANG, LX
    SERA, T
    SCHULTZ, PG
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1994, 91 (09) : 3969 - 3973
  • [40] Comment on "Deficiencies in Molecular Dynamics Simulation-Based Prediction of Protein-DNA Binding Free Energy Landscapes"
    Gapsys, Vytautas
    Khabiri, Morteza
    de Groot, Bert L.
    Freddolino, Peter L.
    JOURNAL OF PHYSICAL CHEMISTRY B, 2020, 124 (06) : 1115 - 1123