FAIRE-Seq

FAIRE-Seq (Formaldehyde-Assisted Isolation of Regulatory Elements) is a method in molecular biology for determining the sequences of DNA regions in the genome that are associated with regulatory activity.[1] The technique was developed in the laboratory of Jason D. Lieb at the University of North Carolina, Chapel Hill. In contrast to DNase-Seq, the FAIRE-Seq protocol does not require the permeabilization of cells or the isolation of nuclei, and it can be applied to any cell type. In a study of seven diverse human cell types, DNase-seq and FAIRE-seq produced strong cross-validation, with 1-2% of the human genome identified as open chromatin in each cell type.

Workflow

The protocol is based on the fact that formaldehyde cross-linking is more efficient in nucleosome-bound DNA than in nucleosome-depleted regions of the genome. The method separates the non-cross-linked DNA, which is usually found in open chromatin, and this DNA is then sequenced. The protocol consists of cross-linking, phenol extraction, and sequencing of the DNA in the aqueous phase.

FAIRE

FAIRE uses the biochemical properties of protein-bound DNA to separate nucleosome-depleted regions of the genome. Cells are first subjected to cross-linking, which fixes the interactions between nucleosomes and DNA. After sonication, the fragmented and fixed DNA is separated using a phenol-chloroform extraction. This extraction creates two phases, an organic phase and an aqueous phase. Because of their biochemical properties, DNA fragments cross-linked to nucleosomes preferentially sit in the organic phase. Nucleosome-depleted or 'open' regions, on the other hand, are found in the aqueous phase. By specifically extracting the aqueous phase, only nucleosome-depleted regions are purified and enriched.[1]

Sequencing

FAIRE-extracted DNA fragments can be analyzed in a high-throughput way using next-generation sequencing techniques. In general, libraries are prepared by ligating specific adapters to the DNA fragments. The adapters allow the fragments to cluster on the sequencing platform and be amplified, after which the sequences of millions of DNA fragments are read in parallel.

Depending on the size of the genome on which FAIRE-seq is performed, a minimum number of reads is required to achieve adequate coverage, so that a reliable signal can be determined.[2][3] In addition, a reference or input sample that has not been cross-linked is often sequenced alongside to estimate the level of background noise.
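As a rough illustration of how the number of reads relates to genome size, mean coverage can be estimated as the number of reads multiplied by the read length and divided by the genome size. The Python sketch below applies that relation. The function names and the example numbers are illustrative assumptions, not depth recommendations taken from the cited guidelines.

```python
import math


def expected_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean per-base coverage, estimated as reads * read length / genome size."""
    if genome_size <= 0 or read_length <= 0:
        raise ValueError("genome_size and read_length must be positive")
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches at least the target mean coverage."""
    if genome_size <= 0 or read_length <= 0:
        raise ValueError("genome_size and read_length must be positive")
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical experiment: one million 50 bp reads on a 100 Mb genome.
    print(expected_coverage(1_000_000, 50, 100_000_000))
    print(reads_needed(1.0, 50, 100_000_000))
```

Under these assumptions, one million 50 bp reads on a 100 Mb genome give a mean coverage of 0.5. Mean coverage is only a coarse guide, because FAIRE reads are expected to concentrate in open chromatin instead of being spread evenly over the genome.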

Note that the extracted FAIRE fragments can alternatively be quantified using quantitative PCR. However, this method does not allow genome-wide, high-throughput quantification of the extracted fragments.

Sensitivity

There are several aspects of FAIRE-seq that require attention when analysing and interpreting the data. For one, FAIRE-seq has been reported to have higher coverage at enhancer regions than at promoter regions.[4] This is in contrast to the alternative method DNase-seq, which is known to show higher sensitivity towards promoter regions. In addition, FAIRE-seq has been reported to show a preference for internal introns and exons.[5] In general, FAIRE-seq data are also believed to display a higher background level, making it a less sensitive method.[6]

Computational analysis

As a first step, FAIRE-seq reads are mapped to the reference genome of the organism studied.

Next, genomic regions with open chromatin are identified using a peak-calling algorithm. Several tools are available for this (e.g. ChIPOTle,[7] ZINBA[8] and MACS2[9]). ChIPOTle uses a sliding window of 300 bp to identify statistically significant signals. In contrast, MACS2 identifies enriched signal through its callpeak command, combined with options such as --broad, --broad-cutoff, --nomodel or --shift. ZINBA is a generic algorithm for the detection of enrichment in short-read datasets.[10] It thus helps in the accurate detection of signal in complex datasets that have a low signal-to-noise ratio.
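The sliding-window idea can be illustrated with a minimal Python sketch that counts FAIRE and input reads in overlapping 300 bp windows and reports windows exceeding a fold-enrichment threshold. This is not the statistical model of ChIPOTle, MACS2 or ZINBA. The function names, the step size, the pseudocount and the fold threshold are assumptions chosen for illustration only.

```python
from bisect import bisect_left


def count_in_window(sorted_positions, start, end):
    """Number of read positions p with start <= p < end."""
    return bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)


def enriched_windows(faire, control, chrom_length,
                     window=300, step=100, min_fold=2.0, pseudocount=1.0):
    """Return (start, end, fold) for windows enriched in FAIRE over control.

    faire and control are lists of read start positions on one chromosome.
    """
    faire = sorted(faire)
    control = sorted(control)
    # Scale control counts to the FAIRE library size so depths are comparable.
    scale = len(faire) / len(control) if control else 1.0
    hits = []
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        end = start + window
        f = count_in_window(faire, start, end)
        c = count_in_window(control, start, end) * scale
        # The pseudocount avoids division by zero in windows without control reads.
        fold = (f + pseudocount) / (c + pseudocount)
        if fold >= min_fold:
            hits.append((start, end, fold))
    return hits


if __name__ == "__main__":
    # Hypothetical toy data: a cluster of FAIRE reads near position 1000
    # and evenly spread control reads on a 2 kb region.
    faire_reads = list(range(1000, 1020))
    control_reads = list(range(0, 2000, 100))
    for hit in enriched_windows(faire_reads, control_reads, 2000):
        print(hit)
```

Overlapping windows that pass the threshold describe the same underlying region, so in practice they would be merged before further analysis.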

BEDTools[11] is used to merge enriched regions that lie close to each other into COREs (clusters of open regulatory elements). This helps to identify chromatin-accessible regions and gene regulation patterns that would otherwise remain undetected, given the lower resolution that FAIRE-seq often provides.
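The merging step can be sketched in Python as follows. The function behaves similarly to the merge utility of BEDTools with its -d (maximum distance) option, but it is an independent illustration and not BEDTools code. The function name and the 100 bp gap used in the example are assumptions, since no merging distance is specified here.

```python
def merge_nearby(intervals, max_gap=0):
    """Merge (chrom, start, end) intervals that overlap or lie within max_gap bp.

    Coordinates are treated as zero-based and half-open, as in BED files.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2] + max_gap:
            # Close enough to the previous region: extend it.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]


if __name__ == "__main__":
    # Hypothetical enriched regions from a peak caller.
    peaks = [
        ("chr1", 100, 200),
        ("chr1", 250, 300),
        ("chr1", 5000, 5100),
        ("chr2", 120, 180),
    ]
    for region in merge_nearby(peaks, max_gap=100):
        print(region)
```

In this toy example the first two regions on chr1 are 50 bp apart and are joined into one cluster, whereas the distant region on chr1 and the region on chr2 stay separate.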

Data are typically visualized as tracks (e.g. bigWig) and can be uploaded to the UCSC Genome Browser.[12]
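bigWig is a binary format that requires dedicated tools to write. As a simpler illustration, the Python sketch below formats binned signal as bedGraph, a plain-text track format that the UCSC Genome Browser also accepts. bedGraph is used here as a stand-in for bigWig, and the function name, track name and signal values are hypothetical.

```python
def to_bedgraph(bins, chrom, name="FAIRE_signal"):
    """Format non-overlapping (start, end, value) bins as bedGraph text.

    Coordinates are zero-based and half-open; bins must not overlap.
    """
    ordered = sorted(bins)
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("bedGraph intervals must not overlap")
    lines = [f'track type=bedGraph name="{name}"']
    for start, end, value in ordered:
        lines.append(f"{chrom}\t{start}\t{end}\t{value:g}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical signal in two adjacent 100 bp bins on chr1.
    print(to_bedgraph([(0, 100, 1.5), (100, 200, 3)], "chr1"), end="")
```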

The major limitation of this method, i.e. the low signal-to-noise ratio compared to other chromatin accessibility assays, makes the computational interpretation of these data very difficult.[13]

Alternative methods

Several methods can be used as alternatives to FAIRE-seq. DNase-seq uses the ability of the DNase I enzyme to cleave free, accessible DNA in order to identify and sequence open chromatin.[14][15] The subsequently developed ATAC-seq employs the Tn5 transposase, which inserts specified fragments or transposons into accessible regions of the genome, to identify and sequence open chromatin.[16]

References

  1. Giresi, PG; Kim, J; McDaniell, RM; Iyer, VR; Lieb, JD (Jun 2007). "FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin". Genome Research 17 (6): 877–85. doi:10.1101/gr.5533506. PMID 17179217. 
  2. Landt, Stephen G.; Marinov, Georgi K.; Kundaje, Anshul; Kheradpour, Pouya; Pauli, Florencia; Batzoglou, Serafim; Bernstein, Bradley E.; Bickel, Peter et al. (2012-09-01). "ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia". Genome Research 22 (9): 1813–1831. doi:10.1101/gr.136184.111. ISSN 1549-5469. PMID 22955991. 
  3. Sims, David; Sudbery, Ian; Ilott, Nicholas E.; Heger, Andreas; Ponting, Chris P. (2014). "Sequencing depth and coverage: key considerations in genomic analyses". Nature Reviews Genetics 15 (2): 121–132. doi:10.1038/nrg3642. PMID 24434847. 
  4. Kumar, Vibhor; Muratani, Masafumi; Rayan, Nirmala Arul; Kraus, Petra; Lufkin, Thomas; Ng, Huck Hui; Prabhakar, Shyam (2013-07-01). "Uniform, optimal signal processing of mapped deep-sequencing data". Nature Biotechnology 31 (7): 615–622. doi:10.1038/nbt.2596. ISSN 1546-1696. PMID 23770639. 
  5. Song, Lingyun; Zhang, Zhancheng; Grasfeder, Linda L.; Boyle, Alan P.; Giresi, Paul G.; Lee, Bum-Kyu; Sheffield, Nathan C.; Gräf, Stefan et al. (2011-10-01). "Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity" (in en). Genome Research 21 (10): 1757–1767. doi:10.1101/gr.121541.111. ISSN 1088-9051. PMID 21750106. PMC 3202292. http://genome.cshlp.org/content/21/10/1757. 
  6. Tsompana, Maria; Buck, Michael J (2014-11-20). "Chromatin accessibility: a window into the genome" (in en). Epigenetics & Chromatin 7 (1): 33. doi:10.1186/1756-8935-7-33. PMID 25473421. 
  7. Buck, Michael J; Nobel, Andrew B; Lieb, Jason D (2005-01-01). "ChIPOTle: a user-friendly tool for the analysis of ChIP-chip data". Genome Biology 6 (11): R97. doi:10.1186/gb-2005-6-11-r97. ISSN 1465-6906. PMID 16277752. 
  8. Rashid, Naim U.; Giresi, Paul G.; Ibrahim, Joseph G.; Sun, Wei; Lieb, Jason D. (2011-01-01). "ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions". Genome Biology 12 (7): R67. doi:10.1186/gb-2011-12-7-r67. ISSN 1474-760X. PMID 21787385. 
  9. Zhang, Yong; Liu, Tao; Meyer, Clifford A.; Eeckhoute, Jérôme; Johnson, David S.; Bernstein, Bradley E.; Nusbaum, Chad; Myers, Richard M. et al. (2008-01-01). "Model-based analysis of ChIP-Seq (MACS)". Genome Biology 9 (9): R137. doi:10.1186/gb-2008-9-9-r137. ISSN 1474-760X. PMID 18798982. 
  10. Koohy, Hashem; Down, Thomas A.; Spivakov, Mikhail; Hubbard, Tim (2014). "A Comparison of Peak Callers Used for DNase-Seq Data". PLOS ONE 9 (5): e96303. doi:10.1371/journal.pone.0096303. PMID 24810143. Bibcode: 2014PLoSO...996303K. 
  11. Quinlan, Aaron R.; Hall, Ira M. (2010-03-15). "BEDTools: a flexible suite of utilities for comparing genomic features" (in en). Bioinformatics 26 (6): 841–842. doi:10.1093/bioinformatics/btq033. ISSN 1367-4803. PMID 20110278. 
  12. Hinrichs, A. S.; Karolchik, D.; Baertsch, R.; Barber, G. P.; Bejerano, G.; Clawson, H.; Diekhans, M.; Furey, T. S. et al. (2006-01-01). "The UCSC Genome Browser Database: update 2006" (in en). Nucleic Acids Research 34 (suppl 1): D590–D598. doi:10.1093/nar/gkj144. ISSN 0305-1048. PMID 16381938. 
  13. Tsompana, M; Buck, MJ (2014-11-20). "Chromatin accessibility: a window into the genome" (in en). Epigenetics & Chromatin 7 (1): 33. doi:10.1186/1756-8935-7-33. PMID 25473421. 
  14. Boyle, Alan P.; Davis, Sean; Shulha, Hennady P.; Meltzer, Paul; Margulies, Elliott H.; Weng, Zhiping; Furey, Terrence S.; Crawford, Gregory E. (2008-01-25). "High-resolution mapping and characterization of open chromatin across the genome". Cell 132 (2): 311–322. doi:10.1016/j.cell.2007.12.014. ISSN 1097-4172. PMID 18243105. 
  15. Crawford, Gregory E.; Holt, Ingeborg E.; Whittle, James; Webb, Bryn D.; Tai, Denise; Davis, Sean; Margulies, Elliott H.; Chen, YiDong et al. (2006-01-01). "Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS)". Genome Research 16 (1): 123–131. doi:10.1101/gr.4074106. ISSN 1088-9051. PMID 16344561. 
  16. Buenrostro, Jason D.; Giresi, Paul G.; Zaba, Lisa C.; Chang, Howard Y.; Greenleaf, William J. (2013-12-01). "Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position". Nature Methods 10 (12): 1213–1218. doi:10.1038/nmeth.2688. ISSN 1548-7105. PMID 24097267.