Hi-C data

License: MIT PR's Welcome

A (continuously updated) collection of references to Hi-C data and papers. Predominantly human/mouse Hi-C data, with replicates. Please, contribute and get in touch! See MDmisc notes for other programming and genomics-related notes.

Large collections

  • HiChIPdb - database of H3K27ac HiChIP experiments, human. 200 high-throughput HiChIP samples across 108 cell types. Interactions annotated with regulatory genes and GWAS catalog SNPs. Uniform processing with HiC-Pro, hg19, loop calling with FitHiChIP and Hichipper. Download of subsets and all data.
    Paper Zeng, Wanwen, Qiao Liu, Qijin Yin, Rui Jiang, and Wing Hung Wong. “HiChIPdb: A Comprehensive Database of HiChIP Regulatory Interactions.” Nucleic Acids Research, October 10, 2022, gkac859. https://doi.org/10.1093/nar/gkac859.
  • 3DIV - database of uniformly processed 315 Hi-C datasets, 80 human cell/tissue types. Bait-centric (SNP rsID, gene name, hg19 coordinates) visualization of long-range interactions in context of epigenomic (histone, enhancers) signals, numerical results. Custom BWA-MEM pipeline; bias and distance effect removed. Coordinates of significant interactions, with annotations, are available for (FTP) download, http://kobic.kr/3div/download

    • Yang, Dongchan, Insu Jang, Jinhyuk Choi, Min-Seo Kim, Andrew J Lee, Hyunwoong Kim, Junghyun Eom, Dongsup Kim, Inkyung Jung, and Byungwook Lee. “3DIV: A 3D-Genome Interaction Viewer and Database.” Nucleic Acids Research 46, no. D1 (January 4, 2018)
  • Chorogenome resource: Processed data (Hi-C, ChIP-seq) for Drosophila, Mouse, Human, http://chorogenome.ie-freiburg.mpg.de:5003/

  • GITAR: An Open Source Tool for Analysis and Visualization of Hi-C Data - Includes a large collection of standardized processed data from 4D Nucleome. 20 hg38 and 2 mm10 datasets normalized by Yaffe-Tanay method, downloadable, include directionality index, HMM states, TAD analysis results. Text and HDF5 formats. https://www.genomegitar.org/processed-data.html

  • 4DGenome - 3D significant interactions, from different literature sources

  • A catalog of TADs, TAD boundaries (18,972 total, 2,293 novel, Arrowhead and Insulation score), and loops (21,838, HiCCUPS, cooltools call-dots) in human lymphoblastoid cell lines (LCL). Hi-C data on the 1000 Genomes individuals (44 different individuals from five super populations), including data from the Human Genome Structural Variation Consortium (HGSVC) and 4DNucleome. The impact of SVs overlapping TAD boundaries on gene expression and splicing. Introduction about 3D genome rewiring events in disease. Juicer, FAN-C, cooltools. GitHub. No data/supplementary until published.

    Paper Li, Chong, Marc Jan Bonder, Sabriya Syed, Human Genome Structural Variation Consortium (HGSVC), HGSVC Functional Analysis Working Group, Michael C. Zody, Mark J.P. Chaisson, et al. “A Comprehensive Catalog of 3D Genome Organization in Diverse Human Genomes Facilitates Understanding of the Impact of Structural Variation on Chromatin Structure.” Preprint. Genomics, May 15, 2023. https://doi.org/10.1101/2023.05.15.540856.
  • TADKB - TAD database for 11 cell types, human (GM12878, HMEC, NHEK, IMR90, KBM7, K562, and HUVEC) and mouse (CH12-LX, ES, NPC, and CN). Information about genes and lncRNAs in each TAD. 3D structures for each TAD, classification of TADs by structural similarity. Browsing by coordinate, family, across cell types, search for gene. Download of TAD coordinates (hg19, space-separated).
    Paper Liu, Tong, Jacob Porter, Chenguang Zhao, Hao Zhu, Nan Wang, Zheng Sun, Yin-Yuan Mo, and Zheng Wang. “TADKB: Family Classification and a Knowledge Base of Topologically Associating Domains.” BMC Genomics 20, no. 1 (December 2019): 217. https://doi.org/10.1186/s12864-019-5551-2.
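
Downloaded TAD coordinates, such as the space-separated hg19 files from TADKB, can be summarized with a few lines of Python. The sketch below is only illustrative: the three-column chrom/start/end layout and the example records are assumptions, not the documented TADKB format, so check the actual file before reusing it.

```python
from statistics import median

def parse_tads(lines):
    """Parse whitespace-separated 'chrom start end' records into tuples.

    The column layout is an assumption for illustration.
    """
    tads = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and rows whose coordinates are not integers
        if len(fields) < 3 or line.startswith("#"):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if end > start:
            tads.append((chrom, start, end))
    return tads

def tad_sizes(tads):
    """Return the size in base pairs of every TAD."""
    return [end - start for _, start, end in tads]

# Hypothetical records, not real TADKB data
example = [
    "chrom start end",
    "chr1 1000000 1800000",
    "chr1 1800000 3000000",
    "chr2 500000 900000",
]
tads = parse_tads(example)
sizes = tad_sizes(tads)
print(len(tads), median(sizes))
```

The same parser works for tab-separated files, because `str.split()` without arguments splits on any run of whitespace.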

4D Nucleome

  • 4D Nucleome Data Portal - 3D genomics and microscopy data, uniformly processed, integrative visualization in HiGlass, comparative functionality. Browse by type (sequencing, microscopy) or publication. Data are in three tiers: Tier 1 (H1-ESC, GM12878, IMR90, HFF-hTERT (clone 6), and WTC-11), Tier 2 and untiered. Overview of first and second phases of the 4DN project. Other repositories that host Hi-C and similar datasets include the ENCODE portal, NCBI's GEO and EMBL-EBI’s ArrayExpress. 4D Nucleome Browser for integrative and multimodal data navigation.
    • Table 1 - Genomic assay types in the 4D Nucleome Data Portal. Chromatin conformation data (In situ, dilution Hi-C, Micro-C, DNase Hi-C, Hi-C 3.0, Capture Hi-C, TCC, single-cell variants, SPRITE, GAM), and related sequencing data (ChIA-PET, ChIA-Drop, PLAC-seq, ChIP-seq, CUT&RUN, Repli-seq, MARGI (RNA-chromatin interactions), others).
    • High-resolution Hi-C datasets, over 1 billion read pairs. cooltools processing, .cool and .mcool formats, A/B compartments and TAD boundaries (insulation score) detected using domain calling pipelines.
    • Microscopy datasets - standard FISH (DNA or RNA), multi-loci FISH, high-throughput FISH, dynamic single particle tracking, ChromEMT, OptoDroplet.
    • Table 2 - All 4D Nucleome analysis pipelines, in CWL, WDL, available on Docker Hub. Alignment with BWA MEM with the -SP5M option. PairsQC - QC report for Hi-C pairs files. Hi-C processing pipeline.
    • 4DN Visualization Workspace Paper Reiff, S.B., Schroeder, A.J., Kırlı, K. et al. The 4D Nucleome Data Portal as a resource for searching and visualizing curated nucleomics data. Nat Commun 13, 2365 (02 May 2022). https://doi.org/10.1038/s41467-022-29697-4
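
The `-SP5M` alignment option mentioned above bundles four `bwa mem` flags that matter for Hi-C reads, where the two mates of a pair legitimately map far apart. The sketch below only assembles the command line; the index prefix, FASTQ names, and thread count are hypothetical placeholders, and the helper function is not part of the 4DN pipeline.

```python
import shlex

def bwa_mem_hic_command(index_prefix, fastq_r1, fastq_r2, threads=4):
    """Build a bwa mem command line with Hi-C friendly flags.

    -S  skip mate rescue
    -P  skip pairing (mates are aligned independently)
    -5  for split alignments, take the one with the smallest coordinate as primary
    -M  mark shorter split hits as secondary
    """
    return ["bwa", "mem", "-SP5M", "-t", str(threads),
            index_prefix, fastq_r1, fastq_r2]

# Hypothetical file names for illustration only
cmd = bwa_mem_hic_command("hg38.fa", "sample_R1.fastq.gz",
                          "sample_R2.fastq.gz", threads=8)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, ...)` when bwa is installed; the printed string is convenient for logging.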

Lieberman-Aiden lab

All Hi-C data released by the Lieberman-Aiden group. Links to Amazon storage and GEO studies. http://aidenlab.org/data.html

  • A/B compartment investigation at ultra-high resolution (500bp, GM12878 and other lymphoblastoid cells, MboI, MseI, and NlaIII restriction enzymes, 75bp cutting on average, 33bn contacts). POSSUMM algorithm for performing principal component analysis on sparse, super-massive matrices (Lanczos-like method, much faster than CscoreTool), can calculate genome-wide eigenvectors. Active promoters and enhancers tend to localize in the A compartment. TSS and TTS sequences of paused genes often segregate into separate compartments. B compartments are mostly quiescent chromatin, overlap with lamina-associated domains. MiChroM simulation. Loops can be separated into punctate (non-loop-extrusion mechanism) and diffuse, the latter correspond to strong proximal enhancer-promoter interactions. A model unifying different levels of genome organization (A/B compartments, TAD boundaries, loops), orientation of TSSs to the A compartment and TTSs to the B compartment. Interactive maps at https://tinyurl.com/2ew48yof, https://tinyurl.com/2mthqtjk, https://tinyurl.com/2f2sfp3a. HiCSampler - subsample .hic file using straw and juicer_tools, HiCNoiseMeasurer - autocorrelation function noise measurements, NuChroM - nucleosome resolution chromatin model, Eigenvector - POSSUMM algorithm, R and C/C++ functions to compute a few leading eigenvectors of the correlation matrices of large sparse matrices. Data links.
    Paper Harris, Hannah L., Huiya Gu, Moshe Olshansky, Ailun Wang, Irene Farabella, Yossi Eliaz, Achyuth Kalluchi, et al. “Chromatin Alternates between A and B Compartments at Kilobase Scale for Subgenic Organization.” Nature Communications 14, no. 1 (June 6, 2023): 3303. https://doi.org/10.1038/s41467-023-38429-1.
  • Vian, Laura, Aleksandra Pękowska, Suhas S.P. Rao, Kyong-Rim Kieffer-Kwon, Seolkyoung Jung, Laura Baranello, Su-Chen Huang, et al. “The Energetics and Physiological Impact of Cohesin Extrusion.” Cell 173, no. 5 (May 2018) - Architectural stripes, created by extensive loading of cohesin near CTCF anchors, with Nipbl and Rad21 help. Little overlap between B cells and ESCs. Architectural stripes are sites for tumor-inducing TOP2beta DNA breaks. ATP is required for loop extrusion and cohesin translocation, but not for maintenance. Replication or transcription is not important for loop extrusion. Zebra algorithm for detecting architectural stripes, image analysis, math in Methods. Human lymphoblastoid cells, mouse ESCs, mouse B-cells activated with LPS, CH12 B lymphoma cells, wild-type, treated with hydroxyurea (blocks DNA replication), flavopiridol (blocks transcription, PolII elongation), oligomycin (blocks ATP). Many other data types (e.g., ChIP-seq, ATAC-seq) GSE82144, GSE98119

  • Lieberman-Aiden, Erez, Nynke L. van Berkum, Louise Williams, Maxim Imakaev, Tobias Ragoczy, Agnes Telling, Ido Amit, et al. “Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome.” Science (New York, N.Y.) 326, no. 5950 (October 9, 2009) - GM12878, K562 cells. HindIII, NcoI enzymes. Two-three replicates. GSE18199

  • Rao, Suhas S. P., Miriam H. Huntley, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov, James T. Robinson, Adrian L. Sanborn, et al. “A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping.” Cell 159, no. 7 (December 18, 2014) - Human GM12878, K562, IMR90, NHEK, HeLa cells, Mouse CH12 cells. Different digestion enzymes (HindIII, NcoI, MboI, DpnII), different dilutions. Up to 35 biological replicates for GM12878. GSE63525, Supplementary Table S1. Hi-C meta-data

  • Sanborn, Adrian L., Suhas S. P. Rao, Su-Chen Huang, Neva C. Durand, Miriam H. Huntley, Andrew I. Jewett, Ivan D. Bochkov, et al. “Chromatin Extrusion Explains Key Features of Loop and Domain Formation in Wild-Type and Engineered Genomes.” Proceedings of the National Academy of Sciences of the United States of America 112, no. 47 (November 24, 2015). HAP1, derived from chronic myelogenous leukemia cell line. Replicates. GSE74072

  • Rao, Suhas S.P., Su-Chen Huang, Brian Glenn St Hilaire, Jesse M. Engreitz, Elizabeth M. Perez, Kyong-Rim Kieffer-Kwon, Adrian L. Sanborn, et al. “Cohesin Loss Eliminates All Loop Domains.” Cell 171, no. 2 (2017) - HCT-116 human colorectal carcinoma cells. Timecourse, replicates under different conditions. GSE104334
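
A/B compartments, as discussed for the ultra-high-resolution maps above, are conventionally read from the sign of the leading eigenvector of the correlation matrix of an observed/expected contact map. The toy sketch below uses plain power iteration on a small dense matrix to show the idea; it is not the POSSUMM algorithm (which targets sparse, super-massive matrices), and the 6-bin matrix is invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def correlation_matrix(oe):
    """Row-by-row Pearson correlation of an observed/expected matrix."""
    return [[pearson(r1, r2) for r2 in oe] for r1 in oe]

def leading_eigenvector(m, iterations=100):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration."""
    n = len(m)
    v = [1.0 + 0.1 * i for i in range(n)]  # deterministic, non-degenerate start
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

# Invented toy map: bins 0, 1, 4, 5 share one compartment, bins 2, 3 the other
labels = [0, 0, 1, 1, 0, 0]
toy_oe = [[2.0 if a == b else 0.5 for b in labels] for a in labels]
ev = leading_eigenvector(correlation_matrix(toy_oe))
groups = [x > 0 for x in ev]
print(groups)
```

The sign of an eigenvector is arbitrary, so in practice the orientation is fixed with an external track such as GC content or gene density before calling a bin "A" or "B".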

Leonid Mirny lab

http://mirnylab.mit.edu/

Bing Ren lab

http://chromosome.sdsc.edu/mouse/hi-c/download.html

Raw and normalized chromatin interaction matrices and TADs defined with DomainCaller. Mouse ES, cortex, Human ES, IMR90 fibroblasts. Two replicates per condition. GEO accession: GSE35156, GSE43070
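
DomainCaller defines TADs from the directionality index (DI), which measures whether a bin interacts more with upstream or downstream neighbors, followed by a hidden Markov model. The sketch below computes only the DI on an invented toy matrix; the window is given in bins rather than base pairs, and the HMM step is omitted.

```python
def directionality_index(matrix, i, window):
    """Directionality index of bin i from contacts within `window` bins.

    A = upstream contacts, B = downstream contacts, E = (A + B) / 2,
    DI = sign(B - A) * ((A - E)^2 / E + (B - E)^2 / E).
    """
    n = len(matrix)
    a = sum(matrix[i][j] for j in range(max(0, i - window), i))
    b = sum(matrix[i][j] for j in range(i + 1, min(n, i + window + 1)))
    e = (a + b) / 2
    if e == 0 or a == b:
        return 0.0
    sign = 1.0 if b > a else -1.0
    return sign * ((a - e) ** 2 / e + (b - e) ** 2 / e)

# Invented toy map with two 3-bin domains: strong contacts inside, weak between
domain = [0, 0, 0, 1, 1, 1]
toy = [[10.0 if a == b else 1.0 for b in domain] for a in domain]
di = [directionality_index(toy, i, window=2) for i in range(len(toy))]
print([round(x, 2) for x in di])
```

A switch from negative to positive DI between adjacent bins (here between bins 2 and 3) marks a candidate domain boundary.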

Feng Yue lab

Cancer

  • Doxorubicin effect on 3D chromatin (also, additional Top2 inhibitors, ICRF193). The human retinal pigment epithelial RPE-1 cells (WT and 18h post-treatment), Hi-C (Juicer, cooltools, TopDom), ChIP-seq (CTCF, RAD21, H3K27ac), RNA-seq. Reduction of local interactions at active promoters, increase in CTCF binding and redistribution of RAD21 around H3K27ac. Differential region analysis (10kb sliding window, log2 fold change, hg19 only) Python script, no enrichment of differential Hi-C regions in differential genes, many differential Hi-C regions do not overlap differential CTCF, RAD21 redistributes around H3K27ac. Hi-C, ChIP-seq and RNA-seq data are at GSE215325.
    Paper Stefanova, Maria E., Elizabeth Ing-Simmons, Stefan Stefanov, Ilya Flyamer, Heathcliff Dorado Garcia, Robert Schöpflin, Anton G. Henssen, Juan M. Vaquerizas, and Stefan Mundlos. “Doxorubicin Changes the Spatial Organization of the Genome around Active Promoters.” Cells 12, no. 15 (August 4, 2023): 2001. https://doi.org/10.3390/cells12152001.
  • 3D genomics of MYC overexpression. MYC overexpression leads to increased binding at active enhancers, amplified gene expression, increased chromatin interactions, promoter-enhancers, weakened TAD boundaries. U2OS osteosarcoma human cell line with tetracycline-inducible MYC, ChIP-seq (H3K27ac, superenhancer detection), RNA-seq (more downregulated genes, activation of ribosome, translation, mitochondrial biogenesis), 4D-seq, and SIQHiC (Spike-in Quantitative Hi-C, mixing in crosslinked mouse 3T3 cells at a ratio 1:4). Replicate data at GSE164777.
    Paper See, Yi Xiang, Kaijing Chen, and Melissa J Fullwood. “MYC Overexpression Leads to Increased Chromatin Interactions at Superenhancers and MYC Binding Sites.” Genome Research, February 3, 2022, gr.276313.121. https://doi.org/10.1101/gr.276313.121.
  • Changes in 3D genome are associated with CNVs in multiple myeloma cells (RPMI-8226 tri- and tetraploid, U266 nearly diploid). The number of TADs increases by ~25%, they become smaller, ~20% switch compartment. ICE normalization better accounts for CNVs than HiCNorm. CNV breakpoints overlap with TAD boundaries. 40kb resolution, replicates. Code, Hi-C, WGS, RNA-seq data GSE87585

  • Curaxin drugs affect the 3D genome by DNA intercalation but without inducing DNA damage: they compromise enhancer-promoter interactions, suppress oncogene expression (including MYC family genes), downregulate survival genes, partially disrupt TAD borders, decrease short-range interactions and the level of spatial segregation of the A/B compartments, and deplete CTCF but not other factors. Hi-C in HT1080 fibrosarcoma cells. Data: Hi-C and CTCF ChIP-seq in duplicates GSE122463, gene expression in MM1.S and HeLa S3 cells GSE117611, H3K27ac GSE117409, nascent RNA transcription GSE107633

  • 3D genomics of glioblastoma. Replicate samples from three patients. Sub-5kb-resolution Hi-C data, integration with ChIP- and RNA-seq. Data: Six Hi-C replicates, EGAS00001003493, ChIP-seq GSE121601, RNA-seq data EGAS00001003700. Processed data

  • Ten non-replicated Hi-C datasets. Two human lymphoblastoid cell lines with known chromosomal translocations (FY1199 and DD1618), transformed mouse cell line (EKLF), six human brain tumours: five glioblastomas ( GB176, GB180, GB182, GB183 and GB238) and one anaplastic astrocytoma (AA86), a normal human cell line control (GM07017). GSE81879

  • Harewood, Louise, Kamal Kishore, Matthew D. Eldridge, Steven Wingett, Danita Pearson, Stefan Schoenfelder, V. Peter Collins, and Peter Fraser. “Hi-C as a Tool for Precise Detection and Characterisation of Chromosomal Rearrangements and Copy Number Variation in Human Tumours.” Genome Biology 18, no. 1 (December 2017).

  • Prostate cancer, normal. RWPE1 prostate epithelial cells transfected with GFP or ERG oncogene. Two biological and up to four technical replicates. GSE37752

    • Rickman, David S., T. David Soong, Benjamin Moss, Juan Miguel Mosquera, Jan Dlabal, Stéphane Terry, Theresa Y. MacDonald, et al. “Oncogene-Mediated Alterations in Chromatin Conformation.” Proceedings of the National Academy of Sciences of the United States of America 109, no. 23 (June 5, 2012)
  • Taberlay, Phillippa C., Joanna Achinger-Kawecka, Aaron T. L. Lun, Fabian A. Buske, Kenneth Sabir, Cathryn M. Gould, Elena Zotenko, et al. “Three-Dimensional Disorganization of the Cancer Genome Occurs Coincident with Long-Range Genetic and Epigenetic Alterations.” Genome Research 26, no. 6 (June 2016)

  • Cancer, normal Hi-C. Prostate epithelial cells, PC3, LNCaP. Two-three replicates. GSE73785

  • Haplotype-resolved Hi-C of GM12878, integrated with RNA-seq and Bru-seq (nascent mRNA). Investigation of Monoallelic expression (MAE) and Allele-Biased expression (ABE). GEO GSE159813
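
Several studies in this section compare Hi-C signal between conditions with a sliding window and a log2 fold change. The sketch below shows that idea in its simplest form on per-bin contact totals; it is not the published script from any of the papers above, and the window size (in bins), the pseudocount, and the example values are assumptions for illustration.

```python
import math

def sliding_log2fc(control, treated, window=3, pseudocount=1.0):
    """log2(treated / control) of summed signal in each sliding window.

    `control` and `treated` are per-bin contact totals at the same resolution
    (for example, 10 kb bins). The pseudocount avoids division by zero.
    """
    if len(control) != len(treated):
        raise ValueError("both tracks must cover the same bins")
    result = []
    for start in range(len(control) - window + 1):
        c = sum(control[start:start + window])
        t = sum(treated[start:start + window])
        result.append(math.log2((t + pseudocount) / (c + pseudocount)))
    return result

# Invented per-bin totals, assumed to be depth-normalized beforehand
control = [10, 10, 10, 10]
treated = [20, 20, 5, 5]
lfc = sliding_log2fc(control, treated, window=2)
print([round(x, 2) for x in lfc])
```

Both tracks need to be normalized to comparable sequencing depth first; otherwise the fold change mostly reflects library size.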

BRCA

  • 3D genome reorganization during breast cancer progression from a nonmalignant state (MCF10A cells, "10A") to premalignant (MCF10AT1, "T1") and malignant (MCF10Ca1a, "C1") states. Hi-C combined with lamina-associated domains, epigenomic marks, gene expression. TADs are stable, but compartments and subcompartments switch and these are associated with gene expression. MYC is amplified and inserted into a highly active subcompartment on chromosome 10. TAD clique analysis shows changes in clique size and frequency, association with B-type subcompartments. Chrom3D genome modeling, integrating with nuclear lamina interactions (LADs, Lamin B1 ChIP-seq), major differences between 10A vs. T1 and C1 cells. Improvement of many tools. Code. Data: GSE246689 - processed gene expression; GSE246599 - BED files of Lamin B1 ChIP-seq; GSE247171 - reanalysis of public datasets; GSE246947 - domains, subcompartments, TADs, cliques, .hic and .mcool files.
    Paper Rossini, Roberto, Mohammadsaleh Oshaghi, Maxim Nekrasov, Aurélie Bellanger, Yasmin Dijkwel, Mohamed Abdelhalim, Philippe Collas, and Jonas Paulsen. “Loss of Multi-Level 3D Genome Organization during Breast Cancer Progression,” bioRxiv. Preprint. 2023 Nov 27 https://doi.org/10.1101/2023.11.26.568711
  • Hi-C profiling of 12 breast tissue samples - 2 normal, 5 ER+ tumors and 5 ER+ tamoxifen-treated tumors. Compartments are largely preserved, TADs and loops are heterogeneous. Very few common pathways associated with differential TADs/loops. Functional enrichment, survival analysis of genes lost and gained between different conditions. CA2 gene within the bicarbonate transport metabolism pathway as the driver of tamoxifen resistance, its inhibition (brinzolamide) impedes tumor growth and reverses chromatin looping. HiC-Pro for processing, dcHiC for A/B compartment analysis, TopDom for TAD calling, Group-, Individual-sample Specific TADs (GISTA) algorithm for TAD comparison (conserved, moderately- and significantly variable TADs), FitHiC2 for loops, HiNT-CNV for CNV detection. Python/R scripts on GitHub.
    Paper Lavanya Choppavarapu, Kun Fang, Tianxiang Liu, and Victor X. Jin. “Hi-C Profiling in Tissues Reveals 3D Chromatin-Regulated Breast Tumor Heterogeneity and Tumor-Specific Looping-Mediated Biological Pathways.” bioRxiv, January 1, 2024, 2024.03.13.584872. https://doi.org/10.1101/2024.03.13.584872.
  • Nucleosome reorganization in breast cancer vs. normal tissues (MNase-seq, MNase-H3-seq), along with cfDNA from blood. Four patients. Data processing with cfDNAtools, NucTools. Nucleosomes gained in BRCA are strongly enriched (20X) in CpG islands, in promoters of DNA-binding proteins, cancer pathways. Average distance between nucleosomes (Nucleosome repeat length NRL) decreases (5-10bp). These effects are associated with differential DNA methylation and binding of linker histone variants H1.4 and H1X.
    Paper Jacob, Divya R., Wilfried M. Guiblet, Hulkar Mamayusupova, Mariya Shtumpf, Luminita Ruje, Isabella Ciuta, Svetlana Gretton, et al. “Nucleosome Reorganisation in Breast Cancer Tissues.” Preprint. Genomics, April 18, 2023. https://doi.org/10.1101/2023.04.17.537031.
  • Comparative characterization of 3D genomics in TNBC. Cell lines (HMEC as normal and 5 BRCA subtypes, by the order of aggressiveness: T47D, ZR7530, HCC1954, HCC70, BT549). TNBC shows most dramatic changes, partially conserved across TNBC cell lines and TNBC tissues. TADs (CaTCH), loops (HiCCUPS), compartment (PC1) analyses. Local interactions are lost, "normal" TAD interactions weakened but TNBC TADs strengthened; those changes are associated with CTCF loss/gain. 3D changes are associated with gene expression changes. Hi-C (replicates), ChIP-seq (CTCF, H3K27ac), RNA-seq, and ATAC-seq data are at GSE167154.
    Paper Kim, Taemook, Sungwook Han, Yujin Chun, Hyeokjun Yang, Hyesung Min, Sook Young Jeon, Jang-il Kim, Hyeong-Gon Moon, and Daeyoup Lee. “Comparative Characterization of 3D Chromatin Organization in Triple-Negative Breast Cancers.” Experimental & Molecular Medicine, May 5, 2022. https://doi.org/10.1038/s12276-022-00768-2.
  • 3D spheroids (organoids) of three breast normal (MCF10A) and cancer cells (MCF7 and MCF7TR tamoxifen-resistant). Hi-C, RNA-seq, validation using 3D-qPCR, 3D-FISH. Normalization using HiCcompare's idea, TADs using TopDom, TAD comparison using eight types of changes, significant interactions using HiSIF. P1D1 loop definition as loops contacting promoter and distal regions of the same gene, comparison of strength change using Valid Pairs Per Million (VPPM), defining differentially expressed looping genes (DELGs). Hi-C (replicates) and RNA-seq (triplicates) at GSE165572.
    Paper Li, Jingwei, Kun Fang, Lavanya Choppavarapu, Ke Yang, Yini Yang, Junbai Wang, Ruifeng Cao, Ismail Jatoi, and Victor X. Jin. “Hi-C Profiling of Cancer Spheroids Identifies 3D-Growth-Specific Chromatin Interactions in Breast Cancer Endocrine Resistance.” Clinical Epigenetics 13, no. 1 (December 2021): 175. https://doi.org/10.1186/s13148-021-01167-6.
  • BRCA gene targets regulated by SNPs - Capture-C of chromatin interactions centered on causal variants and promoters of causal genes (Variant- and Promoter Capture Hi-C) in six human mammary epithelial (B80T5, MCF10A) and breast cancer (MCF7, T47D, MDAMB231, Hs578T) cell lines. HindIII fragments, CHiCAGO and Peaky for significant interaction calling. PCA on interactions separates cell types, significant interactions enriched in epigenomic elements. 651 target genes at 139 independent breast cancer risk signals. Table 1 - top priority target genes. HiCUP-processed capture Hi-C data (hg19), code, Supplementary tables, Tables S11 - 651 target genes.
    Paper Beesley, Jonathan, Haran Sivakumaran, Mahdi Moradi Marjaneh, Luize G. Lima, Kristine M. Hillman, Susanne Kaufmann, Natasha Tuano, et al. “Chromatin Interactome Mapping at 139 Independent Breast Cancer Risk Signals.” Genome Biology 21, no. 1 (December 2020) https://doi.org/10.1186/s13059-019-1877-y
  • Hi-C and RNA-seq in two ERα+ parental and Tamoxifen-resistant (TR) MCF7 and T47D cells, before and after treatment with Sapitinib (AZD8931), a dual TKI of EGFR/HER2. Eight types of TAD changes (TopDom), significant loops using Homer, promoter-distal looping genes (P1D1, P1D2). Many TR-specific TADs and loops are reversible upon Sapitinib treatment. ERα-bound promoter-enhancer looping genes enclosed within altered domains are enriched with genes with functions and pathways associated with cancer aggressiveness, glycolysis and metabolism, and focal adhesion. Comparing cells and spheroids - the latter recapitulate most changes and are a better preclinical model. hg19, 40kb. Replicated Hi-C and triplicated RNA-seq of MCF7/T47D parental/TamR at GSE144380 and GSE128676.
    Paper Yang, Yini, Lavanya Choppavarapu, Kun Fang, Alireza S. Naeini, Bakhtiyor Nosirov, Jingwei Li, Ke Yang, et al. “The 3D Genomic Landscape of Differential Response to EGFR/HER2 Inhibition in Endocrine-Resistant Breast Cancer Cells.” Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1863, no. 11 (November 2020): 194631. https://doi.org/10.1016/j.bbagrm.2020.194631.
  • 3D (tethered chromatin conformation, TCC) timecourse of estradiol (E2) stimulation in ER+ BRCA and endocrine resistance. Hormone-starved MCF7 (T0), E2-treated for 1h (T1), replicates combined. Approximately similar number of compartments (2050). Dynamic A/B compartments (HiCLib) are associated with active open chromatin. Dynamic changes are characterized by decreased CTCF binding. Associated genes enriched with cancer invasion, aggressiveness, metabolism. Three additional timepoints, 4h, 16h, 24h (T4, T16, T24). 24 patterns of changes, categorized into six (similar to TADcompare, highly common HCC, early/late transit ETC/LTC, lowly/moderately/highly dynamic LDC/MDC/HDC). Epigenetic states from histone ChIP-seq ChromHMM. Public RNA-seq data. Tamoxifen-resistant MCF7-TamR and T47D-TamR cell lines, tamoxifen-resistant altered compartments (TRACs), six types classified into shrunk, expanded, and flipped compartments. HOMER-identified loops. Differential genes associated with ribosome, tight junction, endocytosis, lysosome, cell cycle, WNT signaling pathway, insulin signaling pathway, focal adhesion, and MAPK signaling pathways. Molecular mechanistic model in Discussion. Supplementary data with hg19 coordinates of compartments, genes, loops. GSE108787 - MCF7 and TamR TCC, ChIP-seq and RNA-seq timecourse data (plus public RNA-seq); GSE119890 - T47D and TamR TCC timecourse data.
    Paper Zhou, Yufan, Diana L. Gerrard, Junbai Wang, Tian Li, Yini Yang, Andrew J. Fritz, Mahitha Rajendran, et al. “Temporal Dynamic Reorganization of 3D Chromatin Architecture in Hormone-Induced Breast Cancer and Endocrine Resistance.” Nature Communications 10, no. 1 (December 2019): 1522. https://doi.org/10.1038/s41467-019-09320-9.
  • Capture Hi-C (CHi-C) to annotate 63 breast cancer risk loci. 110 target genes at 33 loci, supported by other evidence (eQTLs, disease-specific survival). Two ER+ breast cancer cell lines (T-47D, ZR-75-1), two ER− breast cancer cell lines (BT-20, MDA-MB-231), one “normal” breast epithelial cell line (Bre80-Q-TERT (Bre80)) and a non-breast lymphoblastoid cell line (GM06990). Approx 40% of interaction peaks are present in multiple cell lines. More interactions within TADs. WashU session with all CHi-C interaction peaks. Table 2 Risk loci which formed interaction peaks directly (N = 33) or via an adjacent risk locus (N = 3) with 110 target genes (locus, SNP, gene targets, nearest gene). Table 3 Nine CHi-C putative target genes that were statistically significant eQTLs (FDR adjusted P < 0.1) (locus, SNP, gene, p-values in all, ER+/- cancers). Table 4 Six CHi-C putative target genes for which there was orthogonal support for at least two additional data sources. PRJEB23968 - FASTQ files.
    Supplementary material https://www.nature.com/articles/s41467-018-03411-9#Sec23
      • Supplementary Data 1: Captured genomic regions (Locus, SNP, hg19 coordinates, size, reference)
      • Supplementary Data 2: Numbers of statistically significant interaction peaks in six cell lines at 51 informative loci and 12 uninformative loci
      • Supplementary Data 3: Coordinates of interacting pairs detected in at least two cell lines (bedpe, -log10 FDR of interaction significance, cell line, number of cells)
      • Supplementary Data 4: Risk loci which formed interaction peaks with target genes in T-47D (T), ZR-75-1 (Z), Bre80 (Br), BT-20 (BT), MDA-MB-231 (M) and GM06990 (G) cell lines (cytoband, SNP, gene targets)
      • Supplementary Data 5: Distances between published risk SNPs and putative CHi-C target genes (kb) at 36 informative risk loci (cytoband, SNP, hg19 coordinates, gene targets)
      • Supplementary Data 6: eQTL analysis of 69 protein coding target genes at 26 risk loci in TCGA breast cancer data
      • Supplementary Data 7: Disease-specific survival analysis of 97 target genes in Metabric data
    Paper Baxter, Joseph S., Olivia C. Leavy, Nicola H. Dryden, Sarah Maguire, Nichola Johnson, Vita Fedele, Nikiana Simigdala, et al. “Capture Hi-C Identifies Putative Target Genes at 33 Breast Cancer Risk Loci.” Nature Communications 9, no. 1 (December 2018): 1028. https://doi.org/10.1038/s41467-018-03411-9
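
Interaction lists such as the bedpe files with -log10 FDR values mentioned above are straightforward to filter in Python. The sketch below relies on the standard first six BEDPE columns (chrom1, start1, end1, chrom2, start2, end2); placing the -log10 FDR in the seventh column, the cutoff value, and the example records are assumptions for illustration, not the layout of any specific supplementary file.

```python
def parse_bedpe_line(line):
    """Parse one tab-separated BEDPE record with -log10(FDR) in column 7 (assumed)."""
    f = line.rstrip("\n").split("\t")
    record = {
        "chrom1": f[0], "start1": int(f[1]), "end1": int(f[2]),
        "chrom2": f[3], "start2": int(f[4]), "end2": int(f[5]),
        "neg_log10_fdr": float(f[6]),
    }
    # Convert back to a plain FDR so thresholds are easier to read
    record["fdr"] = 10 ** (-record["neg_log10_fdr"])
    return record

def significant(records, fdr_cutoff=0.1):
    """Keep interactions whose FDR is below the (hypothetical) cutoff."""
    return [r for r in records if r["fdr"] < fdr_cutoff]

# Invented example records
lines = [
    "chr1\t100000\t105000\tchr1\t250000\t255000\t2.0",
    "chr2\t300000\t305000\tchr2\t900000\t905000\t0.5",
]
records = [parse_bedpe_line(x) for x in lines]
hits = significant(records)
print(len(hits), hits[0]["chrom1"], hits[0]["fdr"])
```

Real files often carry a header row or extra annotation columns, so column positions should be checked against the file before filtering.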

Tissue-specific

ENCODE

Search query for any type of Hi-C data, e.g., human brain Hi-C

Brain

Cell lines

  • Haarhuis, Judith H.I., Robin H. van der Weide, Vincent A. Blomen, J. Omar Yáñez-Cuna, Mario Amendola, Marjon S. van Ruiten, Peter H.L. Krijger, et al. “The Cohesin Release Factor WAPL Restricts Chromatin Loop Extension.” Cell, (May 2017) - WAPL, cohesin's antagonist, DNA release factor, restricts loop length and prevents looping between incorrectly oriented CTCF sites. Together with SCC2/SCC4 complex, WAPL promotes correct assembly of chromosomal structures. WAPL WT and KO Hi-C, RNA-seq, ChIP-seq for CTCF and SMC1. Also, SCC4 KO and combined SCC4-WAPL KO Hi-C. Potential role of WAPL in mitosis chromosome condensation. Tools: HiC-Pro processing, HiCCUPS, HiCseq, DI, SomaticSniper for variant calling. Data (Hi-C in custom paired BED format): GEO GSE95015

  • Grubert, Fabian, Judith B. Zaugg, Maya Kasowski, Oana Ursu, Damek V. Spacek, Alicia R. Martin, Peyton Greenside, et al. “Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions.” Cell, (August 2015) - seven Hi-C replicates on GM12878 cell line, GEO GSE62742

  • Naumova, Natalia, Maxim Imakaev, Geoffrey Fudenberg, Ye Zhan, Bryan R. Lajoie, Leonid A. Mirny, and Job Dekker. “Organization of the Mitotic Chromosome.” Science (New York, N.Y.), (November 22, 2013) - E-MTAB-1948 - 5C and Hi-C chromosome conformation capture study on metaphase chromosomes from human HeLa, HFF1 and K562 cell lines across the cell cycle. Two biological and two technical replicates. ArrayExpress E-MTAB-1948

  • Jessica Zuin et al., “Cohesin and CTCF Differentially Affect Chromatin Architecture and Gene Expression in Human Cells,” Proceedings of the National Academy of Sciences of the United States of America, (January 21, 2014) - CTCF and cohesin (RAD21 protein) are enriched in TAD boundaries. Depletion experiments. Different effect on inter- and intradomain interactions. Loss of cohesin leads to loss of local interactions, but TADs remained. Loss of CTCF leads to both loss of local and increase in inter-domain interactions. Different gene expression changes. TAD structures remain largely intact. Data: Hi-C, RNA-seq, RAD21 ChIP-seq for control and depleted RAD21 and CTCF in HEK293 cells. Two replicates in each condition. GEO GSE44267

Non-human data

  • The effect of somatic chromosome pairing on 3D genome organization. Drosophila, in-situ Hi-C data, HiC-Pro, Juicer processing. Investigation of the effect of pairing on gene loops mediated by RNAPII and Polycomb-mediated loops. Maintenance of A/B compartments is independent from looping. Anti-pairing CAP-H2-condensin II complex interacts with the zinc-finger protein Z4, under hyperosmotic cellular stress. Informative introduction about the 3D Drosophila genome. ChIP-seq, Hi-C, and ATAC-seq data (dm6 assembly) on GEO GSE213553. Other data in the "Data availability" section.
    Paper Puerto, Marta, Mamta Shukla, Paula Bujosa, Juan Perez-Roldan, Srividya Tamirisa, Carme Solé, Eulàlia de Nadal, Francesc Posas, Fernando Azorin, and M. Jordan Rowley. “Somatic Chromosome Pairing Has a Determinant Impact on 3D Chromatin Organization.” Preprint. Genomics, March 30, 2023. https://doi.org/10.1101/2023.03.29.534693.
  • Evolutionary 3D genomics, principles of chromosome folding in mammals: eutherians (aardvark, elephant, mouse, human), marsupials (wallaby, Tasmanian devil), a monotreme (platypus), plus chicken for comparison. Reshuffling can influence high-order chromatin organization. Eutherian genome organization is associated with higher number of short loops (Hi-C), high CTCF density (ChIP-seq), chromosomal territories. Vice versa for marsupials, including chromosomes in the Rabl configuration. A/B compartments, TADs have similar properties. Analysis of syntenic region rearrangements, reconstructing evolutionary history. Juicer, TADbit, FAN-C. Newly generated data for African Elephant, Aardvark, Tasmanian Devil, Tammar Wallaby (Hi-C, CTCF, H3K4me3 ChIP-seq, RNA-seq) at GSE206075.
    Paper Álvarez-González, Lucía, Cristina Arias-Sardá, Laia Montes-Espuña, Laia Marín-Gual, Covadonga Vara, Nicholas C. Lister, Yasmina Cuartero, et al. “Principles of 3D Chromosome Folding and Evolutionary Genome Reshuffling in Mammals.” Cell Reports 41, no. 12 (December 2022): 111839. https://doi.org/10.1016/j.celrep.2022.111839.
  • Erythrocytes 3D genome organization in ten species at the last nucleated stages of maturation (newly generated mouse erythroblasts data and previously generated public blood Hi-C data from other organisms). Lack loops and TADs, strong second diagonal pattern. Raw data at SRA.
    Paper Ryzhkova, Anastasia, Alena Taskina, Anna Khabarova, Veniamin Fishman, and Nariman Battulin. “Erythrocytes 3D Genome Organization in Vertebrates.” Scientific Reports 11, no. 1 (December 2021): 4414. https://doi.org/10.1038/s41598-021-83903-9.
  • Investigation of the mechanisms of TAD boundaries in Drosophila. Notch gene locus having two TADs, the role of genetic sequences bound by architectural proteins (APs, CP190, BEAF-32, M1BP, SuHw, CTCF). Deletion (CRISPR-Cas9) of domains leads to fusion of TADs, loss of APs, disruption of transcription. In-nucleus Hi-C (4-cutter MboI) in embryonic cell line S2R+ in triplicates GSE136137. References to many Drosophila public datasets in Methods section.
    Paper Arzate-Mejía, Rodrigo G., Angel Josué Cerecedo-Castillo, Georgina Guerrero, Mayra Furlan-Magaril, and Félix Recillas-Targa. “In Situ Dissection of Domain Boundaries Affect Genome Topology and Gene Transcription in Drosophila.” Nature Communications 11, no. 1 (December 2020): 894. https://doi.org/10.1038/s41467-020-14651-z.
  • RNA-seq, ATAC-seq, ChIP-seq, whole genome methylation (30X), Hi-C in 11 adult and two embryonic tissues in zebrafish. Comparison with human and mouse regulatory elements. Enrichment of evolutionary breakpoints at TAD boundaries, H3K4me3 and CTCF signal. De novo chr4 assembly (sex chromosome). scATAC-seq on zebrafish brain - 25 cell types. GEO GSE134055, Tweet

  • tagHi-C protocol for low-input tagmentation-based Hi-C. Applied to mouse hematopoiesis 10 major blood cell types. Changes in compartments and the Rabl configuration defining chromatin condensation. Gene-body-associating domains are a general property of highly-expressed genes. Spatial chromatin loops link GWAS SNPs to candidate blood-phenotype genes. HiC-Pro to Juicer. GEO GSE142216 - RNA-seq, replicates, GEO GSE152918 - tagHi-C data, replicates, combined .hic files

  • Single-nucleus Hi-C data (scHi-C) of 88 Drosophila BG3 cells. 2-5M paired-end reads per cell, 10kb resolution. ORBITA pipeline to eliminate the effect of Phi29 DNA polymerase template switching. Chromatin compartments approx. 1Mb in size, non-hierarchical conserved TADs can be detected. Lots of biology, integration with other omics data. Raw and processed data in .cool format at GEO GSE131811

  • 3D chromatin organization during spermatogenesis, mouse. Meiotic chromosomes in prophase have weak compartmentalization, TADs, loops. Enrichment in near inter-chromosomal interactions (close to diagonal). The X chromosome lacks domain organization during meiotic sex-chromosome inactivation. Concept and formula for evaluation of genomic compartment strength (Methods). GEO - Hi-C of meiotic pachytene spermatocytes (PS; 2 biological replicates). Other public Hi-C, RNA-seq, ChIP-seq data.

  • 3D genome rearrangement is uncoupled from gene expression changes. Introduction, references for and against 3D genomics-gene expression links. Drosophila, a "balancer" line with highly rearranged chromosomes. Negligible association can be detected, but changes in genome topology are not predictive of changes in gene expression, loss of long-range interactions has little impact. Processed data, GitHub. Raw data: Whole genome, Hi-C, Capture-C, RNA-seq

    Paper

    Ghavi-Helm, Yad, Aleksander Jankowski, Sascha Meiers, Rebecca R. Viales, Jan O. Korbel, and Eileen E. M. Furlong. “Highly Rearranged Chromosomes Reveal Uncoupling between Genome Topology and Gene Expression.” Nature Genetics, July 15, 2019.

  • Global organization of the B cell genome throughout differentiation by the transcription factor Pax5. Mouse splenic CD4+ cells, B cells at various differentiation stages, granulocytes. diffHiC, TADbit, directionality index. Hi-C and RNA-seq data on GEO GSE99163.
    Paper Johanson, Timothy M. “Transcription-Factor-Mediated Supervision of Global Genome Architecture Maintains B Cell Identity.” Nature Immunology 19 (2018): 14. https://doi.org/10.1038/s41590-018-0234-8
  • TADs in Drosophila, Hi-C and RNA-seq in four cell lines of various origin. dCTCF, SMC3, and Su(Hw) are weakly enriched at TAD boundaries. Transcription and active chromatin (H3K27ac, H3K4me1, H3K4me3, H3K36me3, H4K16ac) are associated with TAD boundaries. Also, BEAF-32 and CP190. Hierarchical TADs. Housekeeping genes tend to be near TAD boundaries and in inter-TAD regions. TAD boundary prediction using regression, modeling to associate TADs with bands, investigation of the hierarchy. Heavy use of the Armatus TAD caller. RNA-seq and replicate Hi-C data, high correlation, merged into 20kb resolution.  GEO GSE69013

  • Hi-C of polytene chromosomes in Drosophila. Polytene bands colocalize with TADs. TADs are conserved between polytene and diploid cells. Loops are transient. Two states of folding: Fully extended and up to 10-fold compacted fibers constitute euchromatin. Up to 30-fold compacted fibers represent heterochromatin of the nuclear periphery. Many experimental observations, validations. GEO - Tethered and in-solution Hi-C, triplicates, polytene, diploid.

    Paper Eagen, Kyle P., Tom A. Hartl, and Roger D. Kornberg. “Stable Chromosome Condensation Revealed by Chromosome Conformation Capture.” Cell 163, no. 4 (November 2015): 934–46. https://doi.org/10.1016/j.cell.2015.10.026.

Differential Hi-C

  • Liquid-liquid phase separation (LLPS) in haematological cancers is associated with intrinsically disordered regions (IDRs) of NUP98-HOXA TF chimera and induces CTCF-independent chromatin loops enriched in proto-oncogenes. Many biochemical assays, imaging, mass-spec, ChIP-seq, RNA-seq. All data at GEO GSE144643. In situ Hi-C (HEK293FT kidney cells, IDR wild type and mutated, biological and technical replicates) at GEO GSE143465.
    Paper Ahn, Jeong Hyun, Eric S. Davis, Timothy A. Daugird, Shuai Zhao, Ivana Yoseli Quiroga, Hidetaka Uryu, Jie Li, et al. “Phase Separation Drives Aberrant Chromatin Looping and Cancer Development.” Nature, June 23, 2021. https://doi.org/10.1038/s41586-021-03662-5.
  • WIZ (widely interspaced zinc finger-containing protein) - new loop-organizing protein, colocalizes with CTCF and cohesin across the genome. Loss of WIZ increases cohesin occupancy and DNA loops. WIZ maintains proper gene expression and stem cell identity. Arima, Juicer. GEO GSE137285 - RNA-seq, ChIP-seq, Hi-C replicates in WT and WIZdel mouse ESCs.

  • 3D chromatin reorganization during different types of cellular senescence, replicative (RS) and oncogene-induced (OIS over time course). Senescence-associated heterochromatin foci (SAHFs), formed with the help of DNMT1 via regulation of HMGA2 expression. WI38 primary fibroblasts. OIS - gain in long-range contacts. diffHiC analysis, differential regions enriched in H3K9me3. TADkit for 3D modeling, visualization. Data (Hi-C replicates, different conditions, timecourse, H3K4me3/H3K9me3/H3K27ac ChIP-seq, RNA-seq) GEO GSE130306

  • X chromosome sex differences in Drosophila. Male X chromosome has two-fold upregulation of gene expression, more mid/long-range interactions, weaker boundaries marked by BEAF-32, CP190, Chromator, and CLAMP, a dosage compensation complex cofactor. Less negative slope in distance-dependent decay of interactions, less clustered top scoring interactions (more randomness), more open structure overall. Local score differentiator (LSD-score) to call differential TAD boundaries in CNV-independent manner - more non-matching boundaries than autosomes, ~20% appearing and ~35% disappearing boundaries. Enrichment in epigenomic marks identified stronger boundary association with MSL (male-specific lethal complex) and CLAMP binding. Many other experimental observations. hiclib, hicpipe processing. R implementation of LSD differential TAD analysis, Hi-C data in bedGraph format GEO GSE94115, Tweet

  • Hi-C TAD comparison between normal prostate cells (RWPE1) and two prostate cancer cells (C42B, 22Rv1). TADs (TopDom-called) become smaller in cancer, switch epigenetic states. FOXA1 promoter has more loop anchors in cancer. Androgen receptor (AR) locus has chromatin structure changed around it (Figure 6). Loop investigation called with Fit-HiC, motifs (NOMe-seq) enriched in loop-associated enhancers different between normal and cancer. HiTC visualization. Figure 1a, Supplementary Figure 3, 5 - examples/coordinates of TAD boundary/length changes.

  • Data For RWPE1, C42B, 22Rv1 cell lines: GEO GSE118629. In situ Hi-C, 4-cutter MboI,  replicated, text-based sparse matrices at 10kb and 40kb resolution, raw and ICE-normalized, hg19. H3K9me3, H3K27me3, H3K36me3, RNA-seq.

  • Supplementary data: Data 2 - TAD coordinates and annotations; Data 3 - differentially expressed genes in smaller TADs; Data 4 - gene expression changes in TADs switching epigenomic state; Data 5 - enhancer-promoter loops; Data 6 - coordinates of nucleosome-depleted regions; Data 7 - all differentially expressed genes; Data 8 - target genes of FOXA1-bound enhancers; Data 9 - overexpressed genes with more enhancer-promoter loops

  • DNA methylation linked with 3D genomics. Methylation directs PRC-dependent 3D organization of mouse ESCs. Hypomethylation in mouse ESCs driven to naive pluripotency in two inhibitors (2i) is accompanied by redistribution of polycomb H3K27me3 mark and decompaction of chromatin. Focus on HoxC, HoxD loci. Hi-C data processed with distiller and other cool-related tools. RNA-seq, H3K27me3 ChIP-seq of mouse ESCs grown in serum and 2i conditions. Hi-C data in replicates GEO GSE124342

  • RNA transcription inhibition minimally affects TADs, weakens TAD boundaries. K562, RNAse inhibition before/after crosslinking (bXL/aXL), actinomycin D (complete transcriptional arrest) treatment. Processing using cword, 40kb resolution. Data with replicates of each condition, GEO GSE114337

  • Comparison of the 3D structure of human and chimpanzee induced pluripotent stem cells. Lower-order pairwise interactions are relatively conserved, but higher-order structures, such as TADs, differ. HiCUP and HOMER for Hi-C data processing to 10kb resolution. Cyclic loess normalization, limma for significant interaction definition, Arrowhead on combined replicates to detect TADs. Association of differential chromatin interactions with gene expression. PyGenomeTracks for visualization. Workflowr code, Processed Hi-C data (4 human and 4 chimp iPSCs) GEO GSE122520

  • In situ Hi-C libraries in biological replicates (n=2) for several hematopoietic cell types (200 million reads per replicate) with a focus on the B cell lineage in mice. The authors investigate the supervisory role of the transcription factor Pax5 in organizing the 3D genome architecture throughout B cell differentiation. The raw data are available via GEO GSE99151

  • DNA loop changes during macrophage development (THP-1 monocyte to macrophage development under 72h PMA treatment). In situ Hi-C (pbn reads, 10kb resolution), RNA-seq, ATAC-seq, CTCF and H3K27ac ChIP-seq. Formation of multi-hubs at key macrophage genes. Differential (dynamic, DESeq2-detected) loops are enriched for AP-1, more enriched in H3K27ac, in contrast to static loops. Association between local H3K27ac and transcription level with distal DNA elements with elevated H3K27ac. Very few genes and lower H3K27ac signal in lost loops, more genes and H3K27ac signal in gained loops. Fold changes in H3K27ac signal positively correlate with DNA looping. Macrophage development-specific gene ontology enrichments. Network analysis for multi-loop multi-enhancer activation hubs identification. GEO GSE96800 ChIP-seq, ATAC-seq, RNA-seq, Two Hi-C samples, THP-1 PMA-treated and untreated, SRA PRJNA385337.

    • Supplemental material:
      • Table S1. DNA Loops in Untreated THP-1 Cells, 16067. Text, hg19 genomic coordinates, columns: anchor1_chrom anchor1_start anchor1_end anchor2_chrom anchor2_start anchor2_end sample -log10(P) anchor1_strand anchor2_strand
      • Table S2. DNA Loops in PMA-Treated THP-1 Cells, 16335.
      • Table S3. Differential Loops
    • Phanstiel, Douglas H., Kevin Van Bortle, Damek Spacek, Gaelen T. Hess, Muhammad Saad Shamim, Ido Machol, Michael I. Love, Erez Lieberman Aiden, Michael C. Bassik, and Michael P. Snyder. “Static and Dynamic DNA Loops Form AP-1-Bound Activation Hubs during Macrophage Development.” Molecular Cell, (September 2017)
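
The column layout listed for Table S1 above can be read with a few lines of standard-library Python. This is a minimal sketch, not the authors' code: it assumes a tab-delimited file with a header row carrying the listed column names, and the helper names (`read_loops`, `loop_span`) and the toy rows are made up for illustration.

```python
import csv
import io

# Column names as listed for Table S1.
COLUMNS = [
    "anchor1_chrom", "anchor1_start", "anchor1_end",
    "anchor2_chrom", "anchor2_start", "anchor2_end",
    "sample", "-log10(P)", "anchor1_strand", "anchor2_strand",
]


def read_loops(handle, delimiter="\t"):
    """Parse loop records; assumes a header row with the column names above."""
    loops = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        loops.append({
            "chrom1": row["anchor1_chrom"],
            "start1": int(row["anchor1_start"]),
            "end1": int(row["anchor1_end"]),
            "chrom2": row["anchor2_chrom"],
            "start2": int(row["anchor2_start"]),
            "end2": int(row["anchor2_end"]),
            "sample": row["sample"],
            "neg_log10_p": float(row["-log10(P)"]),
        })
    return loops


def loop_span(loop):
    """Distance in bp between anchor midpoints; None for inter-chromosomal pairs."""
    if loop["chrom1"] != loop["chrom2"]:
        return None
    mid1 = (loop["start1"] + loop["end1"]) // 2
    mid2 = (loop["start2"] + loop["end2"]) // 2
    return abs(mid2 - mid1)


# Made-up rows for illustration only (not values from the paper).
toy_rows = [
    COLUMNS,
    ["chr1", "1000000", "1010000", "chr1", "1500000", "1510000",
     "untreated", "12.5", "+", "-"],
    ["chr2", "200000", "210000", "chr2", "260000", "270000",
     "untreated", "8.0", "+", "-"],
]
toy_text = "\n".join("\t".join(r) for r in toy_rows) + "\n"

loops = read_loops(io.StringIO(toy_text))
spans = [loop_span(loop) for loop in loops]
print(spans)
```

For the real supplementary file, replace the `io.StringIO` object with an open file handle and adjust `delimiter` if the table turns out to be space-separated.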

Timecourse Hi-C

  • 3D genomics of human embryogenesis. Human and mouse sperm differ: human sperm lack TADs and A/B compartments, which are established later in embryogenesis and require zygotic genome activation and CTCF. Six stages of spatiotemporal Hi-C during human embryogenesis including sperm, 2-cell, 8-cell, morula, blastocysts, and six-week-old embryos. GitHub. Data: CRA000852, CRA000108, CRA000852.
    Paper Chen, Xuepeng, Yuwen Ke, Keliang Wu, Han Zhao, Yaoyu Sun, Lei Gao, Zhenbo Liu, et al. “Key Role for CTCF in Establishing Chromatin Structure in Human Embryos.” Nature, December 4, 2019. https://doi.org/10.1038/s41586-019-1812-0.
  • Vara, Covadonga, Andreu Paytuví-Gallart, Yasmina Cuartero, François Le Dily, Francisca Garcia, Judit Salvà-Castro, Laura Gómez-H, et al. “Three-Dimensional Genomic Structure and Cohesin Occupancy Correlate with Transcriptional Activity during Spermatogenesis.” Cell Reports, (July 2019) - 3D structure changes during spermatogenesis in mouse. Hi-C, RNA-seq, CTCF/REC8/RAD21L ChIP-seq. Description of biology of each stage (Fibroblasts, spermatogonia, leptonema/zygonema, pachynema/diplonema, round spermatids, sperm), and A/B compartment and TAD analysis (TADbit, insulation score), data normalized with ICE. Integration with differential expression. Changes in distribution of CTCF and cohesins (REC8 and RAD21L). Key tools: BBDuk (BBMap), TADbit, HiCExplorer, HiCRep, DeepTools. Data (no replicates) GEO GSE132054

  • Paulsen, Jonas, Tharvesh M. Liyakat Ali, Maxim Nekrasov, Erwan Delbarre, Marie-Odile Baudement, Sebastian Kurscheid, David Tremethick, and Philippe Collas. “Long-Range Interactions between Topologically Associating Domains Shape the Four-Dimensional Genome during Differentiation.” Nature Genetics, April 22, 2019 - Long-range TAD-TAD interactions form cliques (>3 TAD interacting) that are enriched in B compartments and LADs, downregulated gene expression. Graph representation of TAD interactions. Quantifying statistical significance of between-TAD interactions. TAD boundaries are conserved. TAD cliques are dynamic. Permutation test preserving distances. Armatus for TAD detection. hiclib for data processing, Juicebox for visualization. Data: Time course differentiation of human adipose stem cells (day 0, 1, and 3). Hi-C (two replicates), Lamin B1 ChIP-seq, H3K9me3. GEO GSE109924. Also used mouse ES differentiation (Bonev 2017), mouse B cell reprogramming (Stadhouders 2018), scHi-C (Nagano 2017)

  • Du, Zhenhai, Hui Zheng, Bo Huang, Rui Ma, Jingyi Wu, Xianglin Zhang, Jing He, et al. “Allelic Reprogramming of 3D Chromatin Architecture during Early Mammalian Development.” Nature, (12 2017) - Developmental time course Hi-C. Data in preimplantation embryos at the following stages: gametes (sperm and MII oocyte), pronuclear stage 5 (PN5) zygotes, early 2-cell, late 2-cell, 8-cell, inner cell masses (ICM), and mouse embryonic stem cells (mES). Low-input Hi-C technology (sisHi-C). TADs are initially absent, then gradually appeared. HiCPro mapping, Pearson correlation on low-resolution matrices, allele resolving. Data:  GEO GSE82185

  • Hug, Clemens B., Alexis G. Grimaldi, Kai Kruse, and Juan M. Vaquerizas. “Chromatin Architecture Emerges during Zygotic Genome Activation Independent of Transcription.” Cell, (06 2017) - TADs appearing during zygotic genome activation, independent of transcription. TAD boundaries are enriched in housekeeping genes, colocalize in 3D. Drosophila. Insulation score for boundary detection. Overlap analysis of TAD boundaries. Processed Hi-C matrices at 5kb resolution (replicates merged, .cool format) and TAD boundaries at nuclear cycle 12, 13, 14, and 3-4 hours post fertilization

  • Ke, Yuwen, Yanan Xu, Xuepeng Chen, Songjie Feng, Zhenbo Liu, Yaoyu Sun, Xuelong Yao, et al. “3D Chromatin Structures of Mature Gametes and Structural Reprogramming during Mammalian Embryogenesis.” Cell, (July 13, 2017) - 3D timecourse changes during mouse gametes (sperm and MII oocyte) and early embryos development, from zygotic (no TADs, many long-range interactions) to 2-, 4-, 8-cell, blastocyst and E7.5 mature embryos (TADs established after several rounds of DNA replication). A/B compartments associated with un/methylated CpGs, respectively. PC1, directionality index, insulation score to define compartments and TADs, these metrics increase in magnitude/strength during maturation. Enrichment in CTCF, SMC1, H3K4me3, H3K27ac, H3K9ac, H3K4me1, depletion in H3K9me3, H3K36me3, H3K27me3. The compartment strength is weaker in maternal vs. paternal genomes. Covariance for each gene vs. boundary score across the timecourse. Relative TAD intensity changes. Hi-C and RNA-seq data at different stages, some replicates

Capture Hi-C

  • Loop Catalog - HiChIP loop calls for 1319 samples across 133 studies and 44 high-resolution Hi-C loop calls. Uniform processing with HiC-Pro, loops called with the PeakInferHiChIP.sh utility function of FitHiChIP. Some samples have associated ChIP-seq data. Motif enrichment analysis, region interaction network analysis. Integrated GWAS variants from CAUSALdb (SNP-to-gene linking, SGL). WashU-like visualization.
    Paper Reyna, Joaquin, Kyra Fetter, Romeo Ignacio, Cemil Can Ali Marandi, Nikhil Rao, Zichen Jiang, Daniela Salgado Figueroa, Sourya Bhattacharyya, and Ferhat Ay. “Loop Catalog: A Comprehensive HiChIP Database of Human and Mouse Samples.” bioRxiv, January 1, 2024, 2024.04.26.591349. https://doi.org/10.1101/2024.04.26.591349.
  • SIPs, super-interactive promoters in five hematopoietic cell types (erythrocyte, macrophage/monocyte, megakaryocyte, naive CD4 T-cells, neutrophils). Reanalysis of promoter-capture Hi-C data from the Javierre et al., “Lineage-Specific Genome Architecture Links Enhancers and Non-Coding Disease Variants to Target Gene Promoters.” study. CHiCAGO pipeline. Promoter-interacting regions (PIRs) interacting with SIPs are more enriched in cell type-specific ATAC-seq peaks, GWAS variants for relevant cell types. SIP-associated genes are more highly expressed in relevant cells. Some SIPs are shared across cell lines. Super-SIPs.

  • Genome-wide maps linking disease variants to genes. Activity-By-Contact (ABC) Model. 72 diseases and complex traits (non-specific, no psychiatric), linking 5046 fine-mapped GWAS signals to 2249 genes. 577 genes influence multiple phenotypes. Nearly half of enhancers regulate multiple genes. Table S7 - Summary of diseases and traits. Table S9 - ABC-Max predictions for 72 diseases and complex traits.

  • Promoter-enhancer contacts occur in cohesin-dependent and cohesin-independent manner. Promoter Capture Hi-C on degradation of cohesin (SCC1 subunit) and CTCF (both targeted by auxin-inducible degron and mEGFP reporter) in G1-synchronized HeLa cells. The majority of promoter contacts are lost (associated with transcriptional changes, SLAM-seq) but some are retained and gained. Cohesin-independent promoter contacts interact with active enhancers. Cohesin-dependent interactions are typically longer and associated with CTCF, while cohesin-independent interactions are shorter and associated with active promoters and enhancers. HiCUP, CHiCAGO, Chicdiff. Processed data, replicates of promoter-capture Hi-C data GEO GSE145735, replicates of SLAM-seq data GEO GSE145734

  • Promoter-enhancer predictions in 131 cell types and tissues using the Activity-By-Contact (ABC) Model, based on chromatin state (ATAC-seq) and 3D folding (consensus Hi-C). ABC model assumes an element’s quantitative effect on a gene should depend on its strength as an enhancer (Activity) weighted by how often it comes into 3D contact with the promoter of the gene (Contact), and that the relative contribution of an element on a gene’s expression (as assayed by the proportional decrease in expression following CRISPR-inhibition) should depend on that element’s effect divided by the total effect of all elements. Outperforms distance-based methods, 3D-based only, machine learning approaches. Enhancer-promoter predictions for GM12878, K562, liver, LNCAP, mESCs, NCCIT cells, more at Engreitz Lab page. GitHub repository broadinstitute/ABC-Enhancer-Gene-Prediction.

  • Promoter-enhancer interactions. Promoter-capture Hi-C, 27 human cell lines. Well-formatted data and hg19 genomic coordinates Supplementary material and http://www.3div.kr/capture_hic

  • Promoter capture Hi-C in 17 blood cell types. Chromatin interactions are cell type-specific. >50% interactions are one-to-one. Enriched in H3K27ac and H3K4me1 (active enhancers). GWAS loci enriched in PIRs. Table S3 lists prioritized genes/SNPs, for autoimmune diseases. HiCUP, CHiCAGO processing. More than 2,500 potential disease-associated genes are linked to GWAS SNPs. Raw and processed data.
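
The Activity-By-Contact (ABC) model described in this section reduces to a normalized product: an element's Activity times its Contact with the promoter, divided by the sum of that product over all candidate elements for the gene. A minimal sketch under that reading follows. The function name `abc_scores` and the toy numbers are hypothetical; this is not the code from the broadinstitute/ABC-Enhancer-Gene-Prediction repository.

```python
def abc_scores(activity, contact):
    """Return one ABC score per element for a single gene.

    score_e = (A_e * C_e) / sum_i(A_i * C_i), where A is enhancer activity
    and C is the contact frequency with the gene's promoter.
    """
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have the same length")
    products = [a * c for a, c in zip(activity, contact)]
    total = sum(products)
    if total == 0:
        # No element has any activity-weighted contact with this promoter.
        return [0.0] * len(products)
    return [p / total for p in products]


# Toy values, made up for illustration: three candidate elements near one gene.
activity = [10.0, 5.0, 1.0]   # e.g. an accessibility/H3K27ac-derived signal
contact = [0.02, 0.10, 0.50]  # e.g. normalized Hi-C contact with the promoter

scores = abc_scores(activity, contact)
for idx, score in enumerate(scores):
    print(f"element {idx}: ABC score = {score:.3f}")
```

Because the scores are fractions of the total, they sum to 1 per gene, so a strong but rarely contacting element can score below a weaker element that contacts the promoter often.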

Single-cell Hi-C

See Notes on single-cell Hi-C technologies, tools, and data repository

Micro-C

See the Micro-C section in the HiC_tools repository

GAM

Genome Architecture Mapping data

Imaging

  • MERFISH - Super-resolution imaging technology, reconstruction of 3D structure in single cells at 30kb resolution, 1.2Mb region of Chr21 in IMR90 cells. Distance maps obtained by microscopy show small distance for loci within, and larger between, TADs. TAD-like structures exist in single cells. 2.5Mb region of Chr21 in HCT116 cells, cohesin depletion does not abolish TADs, only alters their preferential positioning. Multi-point (triplet) interactions are prevalent. TAD boundaries are highly heterogeneous in single cells. Diffraction-limited and STORM (stochastic optical reconstruction microscopy) imaging. GitHub

  • Single-cell level massively multiplexed FISH (MERFISH, sequential genome imaging) to measure 3D genome structure in context of gene expression and nuclear structures. Approx. 650 loci, 50kb resolution, on chr21 10.4-46.7Mb from the hg38 genome assembly, IMR90 cells, population average from approx. 12K chr21 copies, multiple rounds of hybridization. Investigation of TADs, A/B compartments, 87% agreement with bulk Hi-C. Association with cell type markers, transcription. Genome-scale imaging using barcodes, 1041 30kb loci covering autosomes and chrX of IMR90, over 5K cells, 5 replicates. Processed multiplexed FISH data and more, TXT format, GitHub

  • Parser of multiplexed single-cell imaging data from Bintu et al. 2018 and Su et al. 2020 - Take 3D coordinates of the regions as input and write the distance and contact matrices for these datasets.
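
Turning per-cell 3D coordinates into distance and contact matrices, as the parser above does, can be sketched with the standard library alone. This is an illustrative sketch, not the parser's actual code: the function names, the contact threshold, and the toy coordinates are assumptions, and undetected loci (missing values in real imaging data) are not handled.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between loci of one chromosome copy."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]


def contact_frequency(cells, threshold):
    """Fraction of cells in which each pair of loci lies within `threshold`."""
    n = len(cells[0])
    counts = [[0] * n for _ in range(n)]
    for coords in cells:
        dist = distance_matrix(coords)
        for i in range(n):
            for j in range(n):
                if dist[i][j] <= threshold:
                    counts[i][j] += 1
    return [[c / len(cells) for c in row] for row in counts]


# Toy data, made up: two chromosome copies, three loci each, (x, y, z)
# in arbitrary length units.
cells = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)],
]

dist_first = distance_matrix(cells[0])
freq = contact_frequency(cells, threshold=6.0)  # hypothetical cutoff
for row in freq:
    print(row)
```

The threshold decides what counts as a contact, so the resulting matrix is only comparable to Hi-C after choosing a cutoff suited to the imaging resolution.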

CTCF

Notes on CTCF motifs and data

Integrative Hi-C

  • 3D structure mediates the effect of genetic variants on gene expression. 317 lymphoblastoid (LCL) and 78 fibroblast (FIB) cell lines, Hi-C data from Rao et al. 2014 paper. Regulatory elements identified from H3K4me1, H3K4me3, H3K27ac ChIP-seq. The regulatory activity is structured in 12,583 well-delimited cis-regulatory domains (CRDs) that respect the local chromatin organization into topologically associating domains (TADs) but constitute finer organization. 30 trans-regulatory hubs (TRHs) formed by CRDs on distinct chromosomes, associated with A/B compartments and allelic regulation. Processed data - cQTLs - variants associated with chromatin peak activity; (cis/trans) eQTLs - variants associated with gene expression; aCRD-QTLs - variants associated with CRD activity; sCRD-QTLs - variants associated with CRD structure; chromatin peaks, and CRDs. For LCL and FIB cell lines, coordinates in hg19.
    Paper Delaneau, O., M. Zazhytska, C. Borel, G. Giannuzzi, G. Rey, C. Howald, S. Kumar, et al. “Chromatin Three-Dimensional Interactions Mediate Genetic Effects on Gene Expression.” Science (New York, N.Y.) 364, no. 6439 (03 2019). https://doi.org/10.1126/science.aat8266.

Misc

  • RNA-Chrom - database of RNA-chromatin interactions. Human & mouse. Manually curated. Data from "all-to-all" methods (MARGI, GRID-seq, ChAR-seq, iMARGI, RADICL-seq, Red-C) and "one-to-all" methods (RAP, CHART-seq, CgURO-seq, dChIRP-seq, ChOP-seq, CHIRT-seq), databases. Uniform processing. RNA- and DNA-centric searches. Video tutorial 1, tutorial 2. Download.
    Paper Ryabykh, G. K., S. V. Kuznetsov, Y. D Korostelev, A. I. Sigorskikh, A. A. Zharikova, and A. A. Mironov. “RNA-Chrom: A Manually-Curated Analytical Database of RNA–Chromatin Interactome.” Preprint. Bioinformatics, December 12, 2022. https://doi.org/10.1101/2022.12.10.519346.
  • Prioritization of COVID-19 candidate genes using 3D chromosomal topology. Applying COGS (Capture Hi-C Omnibus Gene Score), a statistical pipeline for linking GWAS variants with their target genes based on 3D chromatin interaction data. COVID-19 GWAS data. Promoter-capture Hi-C data from Javierre et al., “Lineage-Specific Genome Architecture Links Enhancers and Non-Coding Disease Variants to Target Gene Promoters” and Ho et al. "TOP1 inhibition therapy protects against SARS-CoV-2-induced lethal inflammation" studies (17 human primary cell types data and SARS-CoV-2-infected lung carcinoma cells data). Four prioritization approaches, summary in Supplementary Table S4. Biological analysis.
    Paper Thiecke, Michiel J., Emma J. Yang, Oliver S. Burren, Helen Ray-Jones, and Mikhail Spivakov. “Prioritisation of Candidate Genes Underpinning COVID-19 Host Genetic Traits Based on High-Resolution 3D Chromosomal Topology.” Frontiers in Genetics 12 (October 25, 2021). https://doi.org/10.3389/fgene.2021.745672.