Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2

Cited by: 117
Authors
Kaul, Arya [1, 4]
Bhattacharyya, Sourya [2]
Ay, Ferhat [2, 3]
Affiliations
[1] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[2] La Jolla Inst Immunol, Div Vaccine Discovery, La Jolla, CA 92037 USA
[3] Univ Calif San Diego, Sch Med, La Jolla, CA 92093 USA
[4] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
Keywords
REVEALS; GENOME; ORGANIZATION; PRINCIPLES; MODEL; MAP;
DOI
10.1038/s41596-019-0273-0
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Fit-Hi-C is a programming application to compute statistical confidence estimates for Hi-C contact maps to identify significant chromatin contacts. By fitting a monotonically non-increasing spline, Fit-Hi-C captures the relationship between genomic distance and contact probability without any parametric assumption. The spline fit, together with the correction of contact probabilities with respect to bin- or locus-specific biases, accounts for previously characterized covariates impacting Hi-C contact counts. Fit-Hi-C is best applied to the study of mid-range (e.g., 20 kb to 2 Mb for the human genome) intra-chromosomal contacts; however, with the latest reimplementation, named FitHiC2, it is possible to perform genome-wide analysis for high-resolution Hi-C data, including all intra-chromosomal distances and inter-chromosomal contacts. FitHiC2 also offers a merging filter module, which eliminates indirect/bystander interactions, leading to a significant reduction in the number of reported contacts without sacrificing recovery of key loops such as those between convergent CTCF binding sites. Here, we describe how to apply the FitHiC2 protocol to three use cases: (i) 5-kb resolution Hi-C data of chromosome 5 from GM12878 (a human lymphoblastoid cell line), (ii) 40-kb resolution whole-genome Hi-C data from IMR90 (human lung fibroblast), and (iii) budding yeast whole-genome Hi-C data at a single restriction cut site (EcoRI) resolution. The procedure takes 12 h with preprocessing when all use cases are run sequentially (4 h when run in parallel). With the recent improvements in its implementation, FitHiC2 (8 processors and 16 GB memory) is also scalable to genome-wide analysis of the highest resolution (1 kb) Hi-C data available to date (48 h with 32 GB peak memory). FitHiC2 is available through Bioconda, GitHub and the Python Package Index.

Fit-Hi-C is a computational tool for identifying statistically significant contacts from Hi-C data. This protocol describes how to apply the new version, called FitHiC2, on high-resolution Hi-C data, demonstrating the added functionalities.
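The idea summarized in the abstract (a monotonically non-increasing distance-to-probability fit, combined with bias correction, used to score each contact) can be illustrated with a small sketch. This is not the FitHiC2 implementation: the function names are hypothetical, a pool-adjacent-violators step fit stands in for the spline, and the multiplicative bias correction and binomial tail test are assumptions made for illustration.

```python
import math


def nonincreasing_fit(values):
    """Least-squares monotonically non-increasing fit (pool adjacent violators).

    Stand-in for the spline described in the abstract: it only enforces that
    contact probability does not rise as genomic distance grows.
    """
    blocks = []  # each block is [mean, number of pooled points]
    for v in values:
        blocks.append([float(v), 1])
        # Pool neighbouring blocks while a later mean exceeds an earlier one.
        while len(blocks) > 1 and blocks[-1][0] > blocks[-2][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    fitted = []
    for mean, n in blocks:
        fitted.extend([mean] * n)
    return fitted


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def contact_p_value(count, total_contacts, prior_prob, bias_i=1.0, bias_j=1.0):
    """Score one locus pair (assumed model: prior scaled by both locus biases)."""
    p = min(prior_prob * bias_i * bias_j, 1.0)
    return binomial_sf(count, total_contacts, p)


# Toy per-distance-bin probabilities, ordered from short to long distance.
raw_probs = [0.010, 0.006, 0.007, 0.003]
fitted_probs = nonincreasing_fit(raw_probs)

# Toy contact: 12 reads out of 1000 at the second distance bin.
example_p = contact_p_value(12, 1000, fitted_probs[1], bias_i=1.1, bias_j=0.9)
```

In this toy input the third bin (0.007) exceeds the second (0.006), so the fit pools them to a shared value and the resulting curve no longer increases with distance. A real analysis would use the FitHiC2 package itself, which also handles binning, multiple-testing correction and the merging filter.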
Pages: 991-1012
Page count: 22