JAX BXD Germ Cells ChIP-seq (Aug20) edgeR

Summary

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6404261/

The epigenetic landscape varies greatly among cell types. Although a variety of writers, readers, and erasers of epigenetic features are known, we have little information about the underlying regulatory systems controlling the establishment and maintenance of these features. Here, we have explored how natural genetic variation affects the epigenome in mice. Studying levels of H3K4me3, a histone modification at sites such as promoters, enhancers, and recombination hotspots, we found tissue-specific trans-regulation of H3K4me3 levels in four highly diverse cell types: male germ cells, embryonic stem cells, hepatocytes, and cardiomyocytes. To identify the genetic loci involved, we measured H3K4me3 levels in male germ cells in a mapping population of 59 BXD recombinant inbred lines. We found extensive trans-regulation of H3K4me3 peaks, including six major histone quantitative trait loci (QTL). These chromatin regulatory loci act dominantly to suppress H3K4me3, which at hotspots reduces the likelihood of subsequent DNA double-strand breaks. QTL locations do not correspond with genes encoding enzymes known to metabolize chromatin features. Instead their locations match clusters of zinc finger genes, making these possible candidates that explain the dominant suppression of H3K4me3. Collectively, these data describe an extensive set of chromatin regulatory loci that control the epigenetic landscape.

About cases

C57BL/6J (stock number 000664), DBA/2J (stock number 000671), B6D2 F1/J hybrid (stock number 100006), and all BXD RI mice were obtained from The Jackson Laboratory (Bar Harbor, ME). All animal experiments were approved by the Animal Care and Use Committee of The Jackson Laboratory (summary #04008 and #16043).

About tissue

Testicular germ cell enrichment was performed on 14-day postpartum male mice as previously reported (Baker et al. 2014). This cell preparation removes somatic Sertoli and Leydig cells and results in >90% enrichment of germ cells, of which nearly 50% are spermatogonia (Ball et al. 2016).

Individual low-passage mouse ESCs were derived using protocols outlined in Czechanski et al. (2014). Briefly, 6- to 8-week-old females are mated to stud males and checked each morning for plugs. Pregnant females are euthanized on embryonic day 3.5 and the uterine horn is flushed to remove embryos. Embryos are visualized under a dissecting microscope and blastocysts are transferred to 2i (2i:CHIR99021 and PD0325901) (Ying et al. 2008) serum-free media for outgrowth of the inner cell mass. Blastocysts are allowed to hatch and attach to a mouse embryonic fibroblast (MEF) feeder layer, and the resulting outgrowth is monitored daily and fed for 8–11 days. The emergent ESCs are disaggregated and passaged onto new MEF feeders. Cultures during this time are closely monitored for unusually rapid growth (potentially indicating karyotypic instability), signs of deterioration including vacuolated cytoplasm, detachment of cells from colonies and debris, and possible signs of contamination. Successful ESC cultures were maintained on MEF feeders in serum containing 2i media supplemented with leukemia inhibitory factor (2i/LIF) (Kiyonari et al. 2010) to maintain high levels of NANOG expression, which indicates ground-state pluripotency (Ying et al. 2008; Czechanski et al. 2014). Prior to preparing for chromatin isolation, mouse ESCs were enzymatically dissociated using trypsin and MEFs were removed by serial plating on gelatin-coated plates to which MEFs adsorb preferentially; for this, ESCs and MEFs are incubated in 2i/LIF media on fresh plates for 15 min to allow the larger MEFs to quickly attach to the plates. ESCs are aspirated and the plating procedure repeated once to further remove MEFs. ESCs were collected by centrifugation, resuspended in PBS, and cross-linked using formaldehyde.

For hepatocyte isolation and purification, livers from 8-week-old female mice were perfused using a modified EGTA–collagenase perfusion protocol (Neufeld 1997). All perfusions and hepatocyte purifications were done at the same time of the day to avoid possible circadian effects on any studied parameter. EGTA buffer was used to flush the blood out of the liver and start to digest the desmosomes connecting the liver cells. After 35 ml of the 1× EGTA solution was passed through the liver, it was replaced with 7–10 ml of 1× Leffert’s buffer to flush out the EGTA, which otherwise chelates the calcium ions necessary for collagenase activity in the next step when the liver is digested by perfusion with 25–50 ml of Liberase solution (∼4.3 Wünsch units). After perfusion, the liver was removed from the abdominal cavity and passed through Nitex 80-μm nylon mesh, using extra ice-cold Leffert’s buffer with 0.02% CaCl2 and a rubber policeman. Hepatocytes were purified from the remaining cells by two consecutive centrifugations for 4 min, 50 × g each, leaving the other, smaller cell types in suspension. After each spin, the solution was decanted as waste, and the enriched cell pellet of hepatocytes was resuspended in 30 ml ice-cold Leffert’s buffer with 0.02% CaCl2. After the second centrifugation, the cell pellet contained >98.6% hepatocytes.

For cardiomyocyte isolation, 8-week-old female mice were euthanized and the chest opened to expose the heart. The descending aorta and inferior vena cava were cut and an EDTA buffer was injected into the apex of the right ventricle to flush the heart. The ascending aorta was clamped and the heart transferred to a petri dish and fixed by perfusion of EDTA buffer containing 4% formaldehyde via the left ventricle. The formaldehyde was quenched by perfusing the heart with 125 mM glycine, and the heart was digested by perfusion with collagenase buffer. The ventricles were torn into smaller pieces and triturated to complete cellular dissociation into a single-cell suspension. Cells were then filtered through a 100-μm strainer to remove tissue fragments and centrifuged at a very low speed to obtain a highly enriched fraction of fixed cardiomyocytes.

About data processing

All sequenced B6, D2, F1, and BXD H3K4me3 ChIP libraries, as well as all control input DNA samples, were aligned utilizing bwa version 0.7.9a (Li and Durbin 2009). B6 parental samples were aligned to the Genome Reference Consortium Mouse Build 38 (mm10) and D2 parental samples were aligned to the de novo REL-1509 assembly, including all unplaced scaffolds, from the Mouse Genomes Project (Yalcin et al. 2012).

To ensure that H3K4me3 peaks were properly quantified across divergent genomes, we began by building a comprehensive “peakome” representing all potential H3K4me3 peaks found in the two parents. H3K4me3 peaks for B6 and D2 were called independently, utilizing alignment data from three replicate samples and one DNA input sample. Reads were filtered for a mapq alignment metric of 60 and an alignment sequence having no indels present across the entire length of the sequencing read, typically 100 bp. H3K4me3 peaks were called utilizing MACS version 1.4.2 (Zhang et al. 2008) and peaks having a false discovery rate (FDR) of <1% found in two out of three replicates were accepted. Final genomic intervals for each H3K4me3 peak for each strain were derived by merging the peaks from the corresponding replicate samples using bedtools (Quinlan and Hall 2010). To link syntenic regions between B6 and D2 assemblies, which each have their own coordinate system, sequences from these genomic intervals were aligned to their alternative genome using reciprocal BLAST. In some cases, a sequence interval comprising an H3K4me3 peak in one strain aligned to multiple adjacent intervals in the alternative genome. If the sequences of these peaks in the alternate strain all fell within the boundaries of the single peak, they were merged. The boundaries of these merged peaks included the incorporated sequences from both strains. Because there are also H3K4me3 peaks that are strain specific, these peaks were accepted if, and only if, the mapped interval had a unique sequence that was found in the proper syntenic order within the alternative genome lacking that H3K4me3 peak.

The final combined peakome between B6 and D2 mice was created by selecting only peaks appropriately linked across each strain, assuring that each H3K4me3 peak reciprocally aligned to only one peak in the alternative genome after merging, and that all peaks were in the same order along the chromosomes in both genomes (Supplemental Material, Table S2). Using the H3K4me3 peak locations derived from the parental strains, final read counts for B6, D2, F1 hybrids, and BXDs were obtained by counting reads within the coordinate boundaries of the peakome intervals.
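
As a rough illustration of two of the steps described above (merging replicate peak intervals and counting reads inside peakome intervals), the following Python sketch may help. It is not the pipeline used for this data set: the actual merging was done with bedtools, and the function names, the half-open coordinate convention, and the choice to assign a read by its start position are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Coordinates are treated as half-open [start, end); this convention
    is an assumption of the sketch.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def count_reads(peaks, read_starts):
    """Count reads per peak; peaks must be non-overlapping.

    A read is assigned to a peak when its start position (chrom, pos)
    lies inside the interval. Returns counts in the order of `peaks`.
    """
    index = defaultdict(list)
    for i, (chrom, start, end) in enumerate(peaks):
        index[chrom].append((start, end, i))
    starts = {}
    for chrom, items in index.items():
        items.sort()
        starts[chrom] = [s for s, _, _ in items]

    counts = [0] * len(peaks)
    for chrom, pos in read_starts:
        if chrom not in index:
            continue
        # Rightmost peak whose start is <= pos, if any.
        k = bisect_right(starts[chrom], pos) - 1
        if k >= 0 and pos < index[chrom][k][1]:
            counts[index[chrom][k][2]] += 1
    return counts


# Hypothetical peak calls from replicate samples of one strain.
replicate_peaks = [
    ("chr1", 150, 260), ("chr1", 100, 200),
    ("chr1", 400, 500), ("chr2", 10, 50),
]
peakome = merge_intervals(replicate_peaks)
# peakome == [("chr1", 100, 260), ("chr1", 400, 500), ("chr2", 10, 50)]

reads = [("chr1", 120), ("chr1", 259), ("chr1", 260),
         ("chr1", 450), ("chr2", 10), ("chr3", 5)]
peak_counts = count_reads(peakome, reads)
# peak_counts == [2, 1, 1]
```

The sketch leaves out the cross-assembly steps (reciprocal BLAST and the synteny checks), since the text does not give enough detail to reproduce them.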

To improve mapping accuracy and utilize known sequence variation between strains, all BXD and F1 hybrid samples were aligned separately to both the mm10 reference and the de novo D2 assembly. To reduce error in quantification of H3K4me3 levels due to genomic regions containing repetitive sequences, we removed reads with multiple alignments and retained reads with a mapq alignment metric of 60 that lacked small indels, which can often indicate misalignment. Subsequently, for each genomic interval in the peakome, final read counts were summed for those that mapped uniquely to one of the assemblies along with those that mapped equally well to both B6 and D2 genome assemblies.
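
One way to picture the filtering and summing described above is the Python sketch below. It is only an illustration under stated assumptions: the record layout (a dict with `mapq`, `cigar`, and `peak` per assembly), the function names, and the rule that a read passing the filter in both assemblies is counted once for its peak are hypothetical choices, not the authors' code. The sketch also treats "mapped equally well to both" as "passed the filter in both", which is a simplification.

```python
import re
from collections import Counter

# An insertion or deletion operation in a CIGAR string, e.g. "2D" or "1I".
INDEL = re.compile(r"\d+[ID]")


def passes_filter(aln):
    """Keep an alignment only if it has mapq 60 and no indels.

    `aln` is None when the read did not align to that assembly.
    With bwa, reads with several equally good placements typically
    get a low mapq, so this filter also drops multi-mapped reads.
    """
    return (
        aln is not None
        and aln["mapq"] == 60
        and not INDEL.search(aln["cigar"])
    )


def combined_counts(reads):
    """Sum reads per peakome interval across both assemblies.

    `reads` maps a read ID to {"B6": aln_or_None, "D2": aln_or_None}.
    A read is counted once if it passes the filter in one assembly or
    in both; reads whose two alignments point to different peaks are
    skipped (an assumption of this sketch).
    """
    counts = Counter()
    for alns in reads.values():
        peaks = {aln["peak"] for aln in alns.values() if passes_filter(aln)}
        if len(peaks) == 1:
            counts[peaks.pop()] += 1
    return counts


# Hypothetical alignments of four reads from one BXD sample.
example_reads = {
    "r1": {"B6": {"mapq": 60, "cigar": "100M", "peak": "p1"},
           "D2": {"mapq": 60, "cigar": "100M", "peak": "p1"}},  # both
    "r2": {"B6": {"mapq": 60, "cigar": "100M", "peak": "p1"},
           "D2": {"mapq": 0, "cigar": "100M", "peak": "p1"}},   # B6 only
    "r3": {"B6": {"mapq": 60, "cigar": "48M2D52M", "peak": "p2"},
           "D2": {"mapq": 60, "cigar": "100M", "peak": "p2"}},  # D2 only
    "r4": {"B6": {"mapq": 23, "cigar": "100M", "peak": "p2"},
           "D2": None},                                         # dropped
}
example_counts = combined_counts(example_reads)
# dict(example_counts) == {"p1": 2, "p2": 1}
```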

Acknowledgment

This manuscript is dedicated to the memory of Pavlina Petkova, a wonderful scientist, colleague, and wonderful friend. We thank members of the Baker, Paigen, Petkov, and Carter laboratories for their discussion of the data and manuscript. This work was assisted by The Jackson Laboratory scientific services, which are supported through National Institutes of Health Cancer Core grant CA34196. BXD ESC lines were kindly provided by Anne Czechanski and Laura Reinholdt, funded by the Special Mouse Strain Resource OD011102-18. Funding for the work was provided by NIGMS F32-GM101736 and The Jackson Laboratory start-up funds supporting C.L.B., and P01-GM099640 to K.P. and G.W.C.

Specifics of this data set

The values in this data set are normalized read counts; all normalization steps have already been applied. We used a method from the R package edgeR to normalize for both read depth and composition, followed by log2 transformation and then subtraction of the first principal component (PCA1).
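
The order of operations can be sketched in Python as follows. This is a hedged illustration, not the procedure that produced the data: the specific edgeR method is not named here, so the sketch substitutes plain counts per million, which corrects for read depth only and not for composition. The pseudocount of 1, the function names, and the reading of "subtraction of PCA1" as removing the rank-one reconstruction of the first principal component (samples as observations, peaks as variables) are likewise assumptions.

```python
import math
import random


def log_cpm(counts, pseudocount=1.0):
    """counts[p][s] = reads for peak p in sample s -> log2(CPM + pseudocount).

    Assumes every sample has at least one read.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[s] for row in counts) for s in range(n_samples)]
    return [
        [math.log2(row[s] / lib_sizes[s] * 1e6 + pseudocount)
         for s in range(n_samples)]
        for row in counts
    ]


def remove_first_pc(matrix, n_iter=200, seed=0):
    """Subtract the first principal component from a peaks x samples matrix.

    Each peak is centered across samples, the leading sample-space
    direction u is found by power iteration, and the rank-one term
    (C u) u^T is removed before the peak means are added back.
    """
    n_samples = len(matrix[0])
    means = [sum(row) / n_samples for row in matrix]
    centered = [[x - m for x in row] for row, m in zip(matrix, means)]

    # Random start: a constant vector would be annihilated by centering.
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n_samples)]
    for _ in range(n_iter):
        w = [sum(c * ui for c, ui in zip(row, u)) for row in centered]
        v = [sum(row[s] * wp for row, wp in zip(centered, w))
             for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0:  # no variation left to remove
            return [row[:] for row in matrix]
        u = [x / norm for x in v]

    w = [sum(c * ui for c, ui in zip(row, u)) for row in centered]
    return [
        [m + c - wp * us for c, us in zip(row, u)]
        for row, m, wp in zip(centered, means, w)
    ]


# Hypothetical raw counts: 3 peaks x 4 samples.
raw = [[120, 90, 200, 150], [30, 45, 20, 60], [500, 410, 620, 480]]
normalized = remove_first_pc(log_cpm(raw))
```

A fixed number of power-iteration steps is used for brevity; with real data a convergence check, or a library SVD routine, would be the more usual choice.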