October 1, 2011 (Vol. 31, No. 17)
Juan Sandoval, Ph.D.
Manuel Esteller, M.D., Ph.D.
Pioneer Evolutionary Scientist’s Ideas Revisited in Light of Novel Research Findings
In 1809 Lamarck published Philosophie zoologique describing an intriguing theory by which the environment would exert a driving force on animals whose body plans, physiology, and behavior would have to adapt.1 This hypothesis, refuted by modern geneticists, was consigned to oblivion in favor of Darwin’s theory of evolution.
We now believe, however, that both theories can be reconciled to establish the principles for modern epigenetics: inheritance can play an important role in defining the phenotype of organisms by leaving the DNA sequence unaltered while modifying the expression of the genes.
Epigenetics may be defined in terms of the mechanisms that initiate and maintain heritable patterns of gene function and regulation without affecting the sequence of the genome. This explains how two identical genotypes can give rise to different phenotypes in response to environmental stimuli.
The mechanisms underlying epigenetics are post-translational modification of histone proteins, DNA methylation, chromatin remodeling, histone variants, and noncoding RNAs. During the last two decades the best-studied epigenetic processes have been histone modification and DNA methylation, although the others should yield promising results in the near future.
DNA methylation was the first factor to be defined as an epigenetic marker based on the X-chromosome inactivation taking place in somatic cells.2,3 A major step forward in the field was the introduction of methylation-sensitive restriction enzymes as a way to detect the methylation state of CpG sites in higher eukaryotic genomes. This clarified the genomic distribution of methylated CpGs and their role in controlling the activity of the genes.4
Then in 1983, the first DNA methyltransferase, DNMT1, was purified by Bestor and Ingram.5 In the early 90s, knock-out models led to the discovery of methyl binding domain (MBD) proteins. The first gene to be ablated was DNA methyltransferase 1 (DNMT1), which resulted in a profound deregulation of epigenetically silenced loci in the mouse genome.6
However, the transmission of heritable changes that impact gene activity in organisms with extremely low levels of DNA methylation, such as Drosophila melanogaster, suggested the existence of additional epigenetic mechanisms yet to be discovered.
Before the 70s, the role of histones in gene regulation had been misunderstood: they were considered proteins that only helped in coiling DNA to optimize its packaging.7 However, in 1964, Allfrey proposed an active role for histone acetylation in gene expression.8 By using biochemical techniques with radioactive precursors, new histone modifications were discovered during the 1960s, although their functional significance remained unclear. Since then, research on these new players has experienced an awakening.
A major advance came with the determination of the crystal structure of the nucleosome at different resolutions, which provided evidence that histone tails could be modified.9,10 Later on, a critical connection between histone acetylation and gene activation was reported with the isolation of the Tetrahymena histone acetyltransferase gcn5 through an “in-gel” assay approach.11
At the same time, the first histone deacetylase was purified.12 Afterward, many other epigenetic-modifying complexes were discovered via biochemical studies using purified factors and DNA templates (e.g., the ATP-dependent nucleosome remodeling complex SWI/SNF).13
Different investigations finally demonstrated the connection between all epigenetic players that had been reported so far, by describing physical interactions among MBDs, histone deacetylases (HDACs), and chromatin remodelers.14,15 All this work gave rise to an exciting research field that has yielded important contributions to our understanding of human development and disease.
During the late 1980s and into the 1990s, research on DNA methylation intensified. At that time, there were two different strategies to assess the DNA-methylation patterns in organisms: a candidate-gene approach and restriction-enzyme-based methods. The advent of bisulphite conversion was a crucial step in epigenetic research. Through this chemical reaction, unmethylated cytosine residues are transformed into uracil while leaving 5-methylcytosine unaffected.
Combining this technique with genomic sequencing or PCR amplification (methylation-specific PCR, MSP) allowed sensitive and fast interrogation of DNA methylation at any target sequence and proved suitable for identifying DNA-methylation alterations in selected candidate genes.
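The logic of bisulphite conversion can be illustrated in silico. The following is only a minimal sketch, assuming a single DNA strand, complete conversion, and a read that aligns to the reference without gaps; the function names `bisulfite_convert` and `call_methylation` are hypothetical and do not correspond to any published tool.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulphite treatment followed by PCR on one DNA strand.

    Unmethylated cytosine becomes uracil, which is read as T after
    amplification; 5-methylcytosine is protected and stays C.
    `methylated_positions` holds 0-based indices of methylated cytosines.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Infer the methylation state of each reference cytosine.

    Returns {position: True/False}; True means the cytosine survived
    conversion and is therefore called methylated.
    """
    if len(reference) != len(converted_read):
        raise ValueError("read and reference must have the same length")
    calls = {}
    for i, (ref_base, read_base) in enumerate(
        zip(reference.upper(), converted_read.upper())
    ):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
        # Any other base is treated as uninformative and skipped.
    return calls


if __name__ == "__main__":
    reference = "ACGTTCGACCGA"        # toy sequence, invented for illustration
    read = bisulfite_convert(reference, methylated_positions={1, 9})
    print(read)
    print(call_methylation(reference, read))
```

Because every unmethylated cytosine ends up as T, a C that persists in the read marks a methylated position, which is what gives the approach its single-base resolution.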
Restriction-enzyme-based methods, conversely, used different techniques to analyze DNA methylation in a genome-wide manner: restriction landmark genomic scanning,16 differential methylation hybridization,17 and amplification of intermethylated sites.18 All of these tools took advantage of methylation-sensitive restriction enzymes to analyze a limited number of genomic sites.
However, these techniques come with drawbacks, such as the limited number of sequences that can be interrogated and the variable sensitivity of the restriction enzymes depending on CG density.
The advent of the chromatin immunoprecipitation (ChIP) technique was a fundamental contribution to the study of other epigenetic factors, mainly histone modifications. ChIP is a powerful technique for analyzing proteins that bind to particular DNA sequences. From that moment on, ChIP-grade antibodies recognizing most histone modifications and chromatin-modifying players were produced, greatly expanding knowledge of the relationship between epigenetic players and the control of gene expression.
In 2000, Strahl and Allis compiled this information about the interplay of different epigenetic marks and players and formulated the histone code hypothesis.19
Microarray Platforms and OMICS
During the last decade, the availability of microarray platforms and the upcoming wave of omics have revolutionized the concept of analyzing gene regulation (including that involved in epigenomics) in an unbiased manner. Genetic unmasking opened up a new line of investigation for DNA methylation or histone modifications. Different strategies were designed to fully exploit the possibilities of the new large-scale platforms.
Genetic disruption of the major DNMTs (DNMT1 and DNMT3) or cells carrying truncating HDAC mutations were used to identify differentially methylated or acetylated genes on a global genomic scale.20,21 Similarly, other strategies have taken advantage of the plasticity of epigenetic mechanisms by treating cells with pharmacological inhibitors of DNA methylation, such as the deoxycytidine analogue 5-aza-2′-deoxycytidine (5-aza), and histone deacetylase inhibitors, like valproic acid or trichostatin A. As a result, changes in gene expression due to aberrant methylation or histone deacetylation could be observed.
Limitations of these strategies were the pleiotropic effects produced by the disruption of DNMTs and by the epigenetic drugs, and their lack of specificity.
To avoid these problems, new strategies based on methyl-cytosine enrichment were designed to study epigenetic patterns at a whole-genome scale. The first method is the methylated-CpG island recovery assay (MIRA), which exploits the high binding affinity of a complex of GST-tagged MBD proteins for methylated DNA.22
Following a similar approach, MeDIP-on-chip (methylated DNA immunoprecipitation plus microarray) is based on the direct immunoprecipitation of methylated DNA using an antibody that recognizes 5-methylcytosine.23
The remaining epigenetic players were analyzed by ChIP-on-chip using antibodies that recognize histone modifications, chromatin-modifying complexes, chromatin remodelers, or histone variants. Both MIRA and ChIP-on-chip need an amplification step for the methyl-enriched DNA and a subsequent hybridization on an array platform.
Nevertheless, the drawbacks of these methodologies lie in the differential binding efficiency depending on the CG content, which can bias results toward CpG-rich sequences, and in the lack of single-base resolution. In parallel, an approach coupling the methylation-sensitive restriction enzyme HpaII with ligation-mediated PCR (LM-PCR) has been used for global methylation analysis.
It is limited, however, by the need for a particular restriction recognition sequence and by the fact that LM-PCR can produce nonspecific amplification.
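The recognition-sequence limitation can be made concrete with a short sketch. HpaII recognizes CCGG and is blocked when the internal cytosine is methylated, so only CpGs that happen to sit inside a CCGG site are informative. The code below is an illustration under that assumption; the function names and the toy sequence are hypothetical.

```python
def cpg_positions(seq):
    """Return 0-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def hpaii_cpg_positions(seq):
    """Return positions of CpGs that lie inside an HpaII site (CCGG).

    The CpG is the internal CG of the site, one base after the site start.
    """
    seq = seq.upper()
    return [i + 1 for i in range(len(seq) - 3) if seq[i:i + 4] == "CCGG"]


def hpaii_coverage(seq):
    """Fraction of all CpGs in `seq` that an HpaII digest can interrogate."""
    all_cpgs = cpg_positions(seq)
    if not all_cpgs:
        return 0.0
    return len(hpaii_cpg_positions(seq)) / len(all_cpgs)


if __name__ == "__main__":
    toy = "ACGTCCGGATCGCCGGTTACGA"    # invented sequence, not real genomic data
    print(cpg_positions(toy))
    print(hpaii_cpg_positions(toy))
    print(hpaii_coverage(toy))
```

In this toy sequence only two of the five CpGs fall within a CCGG site, so the remaining three would be invisible to an HpaII-based assay regardless of their methylation state.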
Finally, bisulphite treatment of DNA combined with hybridization on an array platform has permitted the interrogation of methylated DNA on a large scale and at single-base resolution.
Different arrays are commercially available for methylation profiling of bisulphite-converted samples, such as the GoldenGate assay, which interrogates the methylation state of up to 1,536 targeted CpG sites from 371 genes, and the Infinium 27K, which analyzes 27,578 CpG sites from 14,475 genes.
The latest platform to come to market is the Infinium 450K, which covers 485,764 sites, including CNG sites, from 21,233 genes and noncoding RNAs.24 Technological improvement has so far been able to circumvent previous limitations. Researchers very likely will soon be able to cover the methylation status of the 28 million CpG dinucleotides of the human methylome.
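To put these array sizes in perspective, the arithmetic below compares the site counts quoted above with the 28 million CpG dinucleotides of the human methylome. It is a rough sketch: the 450K figure includes some CNG sites, so the percentage it yields is an upper bound on CpG coverage, and the helper name `coverage_percent` is hypothetical.

```python
HUMAN_CPGS = 28_000_000  # approximate CpG count quoted in the text

# Site counts as quoted in the text for each array platform.
PLATFORM_SITES = {
    "Infinium 27K": 27_578,
    "Infinium 450K": 485_764,  # includes some non-CpG (CNG) sites
}


def coverage_percent(sites, total=HUMAN_CPGS):
    """Percentage of the methylome represented by `sites` array probes."""
    return 100.0 * sites / total


if __name__ == "__main__":
    for name, sites in PLATFORM_SITES.items():
        print(f"{name}: {coverage_percent(sites):.2f}% of CpGs")
```

Even the largest array therefore samples well under 2% of the methylome, which is why full coverage depends on sequencing-based approaches.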
Genomic assays have experienced an enormous transformation due to the rapid technological development of next-generation sequencing. The possibility of sequencing millions of short DNA fragments in a single run is of crucial importance for the accomplishment of epigenomic studies.
In 2007, the laboratories of Wold and Myers contributed to the progression of global genomic-scale analysis by combining chromatin immunoprecipitation and massively parallel sequencing (ChIP-seq) to identify mammalian DNA sequences bound by transcription factors in vivo.25 Soon after, different laboratories used ChIP-seq for large-scale profiling of histone modifications, chromatin-modifying complexes, and chromatin remodelers.26,27
As expected, whole-genome sequencing has been used for analyzing DNA methylation using different approaches: bisulphite conversion (MethylC-seq), MeDIP-seq, MBD-seq, and methylation-sensitive restriction enzyme sequencing (MRE-seq).28-32 ChIP-seq has become the method of choice for epigenomic analysis because of its higher resolution (single nucleotide), low incidence of artifacts (since it avoids the noise generated by hybridization steps), larger dynamic range that circumvents saturation of intensity signals, and unlimited genomic coverage.
Minor technical limitations, such as sequencing errors toward the end of each read, bias toward GC-rich content in fragment selection, and loss of sensitivity or specificity in detecting enriched regions with a low number of reads, have been reduced as technology has advanced. Nevertheless, the major drawbacks of ChIP-seq are its high cost and limited availability.
In store for the near future is an unprecedented number of datasets (epigenomic, transcriptomic, proteomic, and genomic) generated on diverse technology platforms. Multiple challenges await researchers integrating and processing these data.
Regarding data analysis, it is necessary to standardize the storage, transfer, and compilation of such a huge quantity of data. In addition, user-friendly peak-finders and alignment algorithms need to be developed to minimize variability in the identification of enriched regions and between different sequencing platforms.
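Why peak-finders introduce variability can be seen in even the simplest possible implementation. The sketch below bins aligned read start positions into fixed windows and reports windows whose read count reaches a threshold, merging neighbors. It is a toy illustration only: the function `call_peaks`, its `window` and `min_reads` parameters, and the read positions are all hypothetical, and real tools model background signal statistically rather than using a fixed cutoff.

```python
from collections import Counter


def call_peaks(read_starts, window=100, min_reads=5):
    """Report enriched regions as (start, end) half-open intervals.

    Reads are counted per fixed-size window; windows with at least
    `min_reads` reads are kept, and adjacent kept windows are merged.
    """
    counts = Counter(pos // window for pos in read_starts)
    enriched_bins = sorted(b for b, n in counts.items() if n >= min_reads)

    peaks = []
    for b in enriched_bins:
        start, end = b * window, (b + 1) * window
        if peaks and peaks[-1][1] == start:
            # This window touches the previous peak: extend it.
            peaks[-1] = (peaks[-1][0], end)
        else:
            peaks.append((start, end))
    return peaks


if __name__ == "__main__":
    # Invented read start coordinates on a single chromosome.
    reads = [105, 110, 120, 130, 140,
             210, 220, 230, 240, 250, 260,
             900, 1500, 1510]
    print(call_peaks(reads, window=100, min_reads=5))
    print(call_peaks(reads, window=100, min_reads=2))
```

Changing only the threshold changes the set of reported regions, which is the kind of parameter-dependent variability that standardized algorithms are meant to reduce.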
However, a major challenge is to perform global integrative analysis of different omics datasets to understand the mechanisms of gene regulation and the biology of complex systems. Many approaches can be taken, for example correlating epigenetic marks with annotated functional features of genes (definition, structure, and ontology).
Other approaches could compare epigenomics with transcriptomics, noncoding RNAs (ncRNAs), or genomics to identify imprinted genes or to study RNA splicing and processing.33-35
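One elementary form of such a comparison is to correlate a methylation measurement with an expression measurement across genes. The sketch below computes a Pearson correlation with the standard library only. The values are invented purely for illustration and are not experimental data; the function name `pearson` is a hypothetical helper, and a real integrative analysis would involve normalization, many more genes, and correction for multiple testing.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up values for five hypothetical genes.
    promoter_methylation = [0.9, 0.8, 0.4, 0.2, 0.1]   # fraction methylated
    expression_level = [1.2, 2.0, 5.5, 7.1, 8.0]       # arbitrary units
    r = pearson(promoter_methylation, expression_level)
    print(f"r = {r:.3f}")
```

A strongly negative coefficient in such a comparison would be consistent with methylation-associated silencing, although correlation alone does not establish the regulatory mechanism.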
In conclusion, the ultimate goal of epigenomics is to map every epigenetic variant and, through the integrated analysis of complementary omics data, to connect it to a specific phenotype, for example by providing prognostic and predictive markers for cancer patients. One powerful proof-of-principle of how epigenomics can help in translational medicine is the demonstration that DNA methylomes identify the primary tumor type of metastases of unknown origin.36
Further basic research will impact the clinic and will allow us to design tailored therapies with higher effectiveness and minimal side effects: one patient, one treatment, at the precise time.
1 Lamarck JBPA (1809). Philosophie zoologique: ou Exposition des considérations relatives à l’histoire naturelle des animaux.
2 Riggs AD (1975). Cytogenet. Cell Genet. 14:9–25.
3 Holliday R, Pugh JE (1975). Science 187:226–232.
4 Bird AP, Southern EM (1978). J. Mol. Biol. 118:27–47.
5 Bestor TH, Ingram VM (1983). Proc Natl Acad Sci U S A. 80(18):5559-63.
6 Li E, Bestor TH, Jaenisch R (1992). Cell 69:915–926.
7 Stedman E (1950). Nature 166(4227):780-1.
8 Allfrey V, Faulkner RM, Mirsky AE (1964). Proc. Natl. Acad. Sci. U.S.A. 51:786–794.
9 Kornberg RD, Thomas JO (1974). Science 184(139):865-8.
10 Richmond TJ, Finch JT, Rushton B, Rhodes D, Klug A (1984). Nature 311(5986):532-7.
11 Brownell JE, Zhou J, Ranalli T, Kobayashi R, Edmondson DG, Roth SY, Allis CD (1996). Cell 84(6):843-51.
12 Taunton J, Hassig CA, Schreiber SL (1996). Science 272(5260):408-11.
13 Peterson CL, Herskowitz I (1992). Cell 68(3):573-83.
14 Jones PL, Veenstra GJ, Wade PA, Vermaak D, Kass SU, Landsberger N, Strouboulis J, Wolffe AP (1998). Nat Genet. 19(2):187-91.
15 Wade PA, Gegonne A, Jones PL, Ballestar E, Aubry F, Wolffe AP (1999). Nat Genet. 23(1):62-6.
16 Hatada I, Hayashizaki Y, Hirotsune S, Komatsubara H, Mukai T (1991). Proc Natl Acad Sci U S A 88(21):9523-7.
17 DeRisi J, Penland L, Brown PO, Bittner ML, Meltzer PS, Ray M, Chen Y, Su YA, Trent JM (1996). Nat. Genet. 14:457–460.
18 Frigola J, Ribas M, Risques RA, Peinado MA (2002). Nucleic Acids Res. 30(7):e28.
19 Strahl BD, Allis CD (2000). Nature 403(6765):41-5.
20 Rhee I, Bachman KE, Park BH, Jair KW, Yen RW, Schuebel KE, Cui H, Feinberg AP, Lengauer C, Kinzler KW, Baylin SB, Vogelstein B (2002). Nature 416(6880):552-6.
21 Ropero S, Fraga MF, Ballestar E, Hamelin R, Yamamoto H, Boix-Chornet M, Caballero R, Alaminos M, Setien F, Paz MF, Herranz M, Palacios J, Arango D, Orntoft TF, Aaltonen LA, Schwartz S Jr, Esteller M (2006). Nat Genet. 38(5):566-9.
22 Jiang CL, Jin SG, Pfeifer GP (2004). J Biol Chem. 279(50):52456-64.
23 Jacinto FV, Ballestar E, Ropero S, Esteller M (2007). Cancer Res. 67(24):11481-6.
24 Sandoval J, Heyn HA, Moran S, Serra-Musach J, Pujana MA, Bibikova M, Esteller M (2011). Epigenetics 6(6).
25 Johnson DS, Mortazavi A, Myers RM, Wold B (2007). Science. 316(5830):1497-502.
26 Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K (2007). Cell 129(4):823-37.
27 Robertson AG, Bilenky M, Tam A, Zhao Y, Zeng T, Thiessen N, Cezard T, Fejes AP, Wederell ED, Cullum R, Euskirchen G, Krzywinski M, Birol I, Snyder M, Hoodless PA, Hirst M, Marra MA, Jones SJ (2008). Genome Res. 18(12):1906-17.
28 Lister R, O’Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, Ecker JR (2008). Cell 133(3):523-36.
29 Down TA, Rakyan VK, Turner DJ, Flicek P, Li H, Kulesha E, Gräf S, Johnson N, Herrero J, Tomazou EM, Thorne NP, Bäckdahl L, Herberth M, Howe KL, Jackson DK, Miretti MM, Marioni JC, Birney E, Hubbard TJ, Durbin R, Tavaré S, Beck S (2008). Nat Biotechnol. 26(7):779-85.
30 Serre D, Lee BH, Ting AH (2010). Nucleic Acids Res. 38(2):391-9.
31 Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, Pradhan S, Nelson SF, Pellegrini M, Jacobsen SE (2008). Nature 452(7184):215-9.
32 Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR (2009). Nature 462(7271):315-22.
33 Edwards CA, Ferguson-Smith AC (2007). Curr Opin Cell Biol. 19(3):281-9.
34 Marson A, Levine SS, Cole MF, Frampton GM, Brambrink T, Johnstone S, Guenther MG, Johnston WK, Wernig M, Newman J, Calabrese JM, Dennis LM, Volkert TL, Gupta S, Love J, Hannett N, Sharp PA, Bartel DP, Jaenisch R, Young RA (2008). Cell. 134(3):521-33.
35 Khalil AM, Guttman M, Huarte M, Garber M, Raj A, Rivea Morales D, Thomas K, Presser A, Bernstein BE, van Oudenaarden A, Regev A, Lander ES, Rinn JL (2009). Proc Natl Acad Sci U S A. 106(28):11667-72.
36 Fernandez AF, Assenov Y, Martin-Subero JI, Balint B, Siebert R, Taniguchi H, Yamamoto H, Hidalgo M, Tan AC, Galm O, Ferrer I, Sanchez-Cespedes M, Villanueva A, Carmona FJ, Sanchez-Mut JV, Berdasco M, Moreno V, Capella G, Monk D, Ballestar E, Ropero S, Martinez R, Sanchez-Carbayo M, Prosper F, Agirre X, Fraga MF, Graña O, Perez-Jurado L, Mora J, Puig S, Prat J, Badimon L, Puca AA, Meltzer SJ, Lengauer T, Bridgewater J, Bock C, Esteller M (2011). Genome Res. PMID:21613409.
Juan Sandoval, Ph.D., is a postdoctoral researcher in the Cancer Epigenetics and Biology Program (PEBC), Bellvitge Biomedical Research Institute, L’Hospitalet. Manuel Esteller, M.D., Ph.D., is director of the PEBC and research professor at the Institució Catalana de Recerca i Estudis Avançats.