Cytosine’s Chemical Biology Gets “Curiouser and Curiouser!”
After following the White Rabbit down a large rabbit-hole, Lewis Carroll’s Alice found that curious happenings in Wonderland became “curiouser and curiouser!” Reading about cytosine’s curious chemical biology in epigenetics gave me the same impression, and left me a little dizzy wondering about what will be discovered next, and what it all means.
Before plunging into the cytosine-hole, so to speak, a bit of introductory information is offered here for readers unfamiliar with epigenetics, while those who know about epigenetics can skip to the next section.
Epigenetics is the study of changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence, some of which have been shown to be heritable; hence the prefix epi- (Greek: επί- over, above, outer) used with the root-word genetics (the branch of biology that deals with heredity and genetic variations).
Epigenetics refers to functionally relevant modifications to the genome that do not involve a change in the nucleotide sequence. As depicted below, such modifications originally included DNA methylation and histone modification, both of which serve to regulate gene expression without altering the underlying DNA sequence.
More specifically, DNA methylation refers to 5-methylcytosine (5mC), which is the initial focus of this blog post. In contrast, there are numerous types of histone modifications, as detailed elsewhere, and these will not be discussed further. However, keep in mind that there are linkage patterns and paradigms between DNA methylation and histone modification, as reviewed in Nature.
The existence of 5mC as a minor base in mammalian DNA was first reported in 1948 in a publication in J. Biol. Chem. by Hotchkiss at The Rockefeller Institute for Medical Research in New York, who separated the nucleic acids of calf thymus DNA using paper chromatography. Ironically, Hotchkiss called this minor base “epicytosine” because of its similarity to cytosine rather than any association with epigenetics; however, assigning the structure as 5mC did not occur until 1951 in work published in J. Amer. Chem. Soc. by Cohn at Oak Ridge National Laboratory. In a review entitled “Epigenetics: A Historical Overview,” Holliday points out that mechanistic models for DNA methylation-based epigenetics were first proposed in 1975 in independent publications by Riggs and by Holliday & Pugh.
Creating and “Erasing” 5mC
In mammals, 5mC occurs within CpG dinucleotides (mostly in CpG-rich promoter regions of genes) and is required for allele-specific expression of imprinted genes, transcriptional repression of retrotransposons, and X chromosome inactivation in females. DNA methylation patterns are established early in the zygote by the de novo DNA methyltransferases 3A and 3B (DNMT3A/3B), and they are conserved during cell division by the maintenance DNA methyltransferase DNMT1. These enzymes transfer the methyl group from S-adenosylmethionine to the carbon-5 position of cytosine. In their recent review, Delatte & Fuks state that “[a] longstanding mystery in the epigenetic field surrounds the mechanisms allowing transitions from the methylated to the unmethylated state.” DNA demethylation can occur both passively and actively. “Passive” DNA demethylation refers to progressive dilution of 5mC over successive rounds of DNA replication when DNMT1 is excluded from the replication fork. “Active” DNA demethylation requires rapid, replication-independent enzymatic removal of a methyl group or, far more likely, an intact 5mC-containing moiety given the chemical stability of the C-C bond between carbon-5 and the methyl group. Delatte & Fuks note that “[s]uch 5mC ‘erasers’ have been intensely sought but have long remained elusive.”
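To put a number on the “passive” route, here is a minimal sketch in Python. It assumes an idealized case that is my simplification rather than anything taken from the reviews cited above: maintenance methylation is switched off completely, there is no de novo methylation, and every newly synthesized strand is therefore unmethylated. Under those assumptions the fraction of strands still carrying 5mC at a given CpG simply halves with each round of replication. The function name is hypothetical.

```python
def methylated_strand_fraction(divisions, start=1.0):
    """Fraction of DNA strands still carrying 5mC at one CpG site after
    `divisions` rounds of replication, assuming no maintenance or de novo
    methylation (an idealized assumption)."""
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    # Each replication pairs every old strand with a new, unmethylated one,
    # so the methylated share of all strands is cut in half.
    return start * 0.5 ** divisions


for n in range(5):
    print(n, methylated_strand_fraction(n))
```

Real cells are unlikely to be this tidy, but the sketch shows why passive dilution needs several cell divisions to erase a mark, whereas “active” demethylation does not depend on replication at all.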
That changed when Rao and coworkers suggested in Science in 2009 that TET (“ten eleven translocation”) enzymes and 5-hydroxymethylcytosine (5hmC) might be involved in a pathway leading to unmodified cytosine. They showed that TET1, a fusion partner of the MLL gene in acute myeloid leukemia, is a 2-oxoglutarate (2OG)- and Fe(II)-dependent enzyme that catalyzes conversion of 5mC to 5hmC in cultured cells and in vitro. In addition, 5hmC was shown to be present in the genome of mouse embryonic stem cells, and 5hmC levels decreased upon RNA interference-mediated depletion of TET1. Analogous activity was attributed to TET2 and TET3 in a publication in Nature in 2010 by Zhang and coworkers, who also demonstrated that TET1 has an important role in mouse embryonic stem (ES) cell maintenance by maintaining the expression of Nanog in ES cells.
Cytosine’s chemical biology became, as Alice exclaimed, curiouser and curiouser when TET enzymes were reported in Science in 2011 by Zhang and coworkers to catalyze serial oxidation of 5mC beyond 5hmC to 5-formylcytosine (5fC) and 5-carboxycytosine (5caC) in mouse ES cells and mouse organs. These investigators concluded that this finding raised the possibility that DNA demethylation may occur through TET-catalyzed oxidation followed by either enzymatic decarboxylation (akin to what is known for thymine) or the base-excision DNA repair (BER) pathway. Evidence for the latter possibility was provided in the same issue of Science by He et al., who reported that 5mC and 5hmC were oxidized to 5caC by TET in vitro and in cultured cells. In addition, 5caC was specifically recognized and excised by thymine-DNA glycosylase (TDG). Depletion of TDG in mouse embryonic stem cells led to accumulation of 5caC to a readily detectable level. It was concluded that oxidation of 5mC by TET followed by TDG-mediated base-excision of 5caC constitutes a pathway for active DNA demethylation.
I think you’ll agree that these cytosine-related biochemical transformations, depicted below along with other newly proposed conversions, represent a wondrous process for “erasing!”
As in Alice’s Wonderland, cytosine’s wondrously curious chemical biology gets even curiouser and curiouser by intersecting, so to speak, with uracil (U). Apart from the direct removal of 5fC and 5caC by TDG, it was independently proposed in 2011 by Guo et al. and Cortellino et al. that 5hmC in DNA can be deaminated, by AID (activation-induced deaminase)/APOBEC (apolipoprotein B mRNA-editing enzyme complex) families of cytidine deaminases, to yield 5-hydroxymethyluracil (5hmU). While 5hmC in DNA is a poor substrate for TDG, 5hmU, when paired with a guanine, can be readily excised by DNA glycosylases such as TDG. Thus, oxidation of 5mC to 5hmC by TET, deamination of the latter nucleobase by AID/APOBECs and TDG-induced BER of the resulting 5hmU may also give rise to active cytosine demethylation in mammals.
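For readers who find a map easier to follow than prose, the conversions described so far can be written down as a small directed graph and searched for every route that leads from 5mC back to unmodified C. This is only a bookkeeping sketch in Python: the edges are the ones named in the text above, the labels “disputed” and “hypothetical” are my shorthand for steps that are proposed or contested, and the variable and function names are made up for illustration.

```python
# substrate -> list of (product, enzyme or process), as described in the text
PATHWAY = {
    "C": [("5mC", "DNMT")],
    "5mC": [("5hmC", "TET")],
    "5hmC": [("5fC", "TET"), ("5hmU", "AID/APOBEC (disputed)")],
    "5fC": [("5caC", "TET"), ("abasic site", "TDG")],
    "5caC": [("abasic site", "TDG"), ("C", "decarboxylase (hypothetical)")],
    "5hmU": [("abasic site", "TDG")],
    "abasic site": [("C", "BER")],
}


def demethylation_routes(start="5mC", goal="C"):
    """Return every route from start to goal that never revisits a base.
    Each route is a list of (enzyme, product) steps."""
    found = []

    def walk(node, visited, steps):
        if node == goal and steps:
            found.append(steps)
            return
        for product, enzyme in PATHWAY.get(node, []):
            if product not in visited:
                walk(product, visited | {product}, steps + [(enzyme, product)])

    walk(start, {start}, [])
    return found


for route in demethylation_routes():
    print("5mC " + " ".join(f"-[{enzyme}]-> {product}" for enzyme, product in route))
```

With these edges the search turns up four routes: TDG excision of 5fC, TDG excision of 5caC, the proposed decarboxylation of 5caC, and the deamination route through 5hmU. Whether each route actually operates in cells is, of course, the open question.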
The involvement of this sequential oxidation-deamination mechanism in active cytosine demethylation was challenged in 2012 by Nabel et al., who reported an apparent lack of significant biochemical activity of recombinant AID or APOBEC toward 5hmC deamination in vitro or in cultured cells, as judged by the failure to detect 5hmU. Nevertheless, in 2013 Liu et al. cautioned that such deamination may still occur in specific cellular context(s), and that more sensitive detection methods could prove useful. To this end, they developed a reversed-phase HPLC method coupled with tandem mass spectrometry (LC-MS/MS/MS), along with stable isotope-labeled standards, for much more sensitive and accurate measurements of deoxy 5hmC, 5fC, 5caC and 5hmU. They found that overexpression of the catalytic domain of human TET1 led to marked increases in the levels of deoxy 5hmC, 5fC and 5caC, but only a modest increase in 5hmU in genomic DNA of cultured human cells and multiple mammalian tissues.
At the risk of confusing you, it’s worth pointing out that 5hmU in DNA is also the precursor of “Base J” (β-D-glucosyl-hydroxymethyluracil), which is 5hmU carrying a glucose on its hydroxymethyl group. Interestingly, J is present in all kinetoplastid flagellates studied, including Trypanosoma and Leishmania, but absent from other eukaryotes, prokaryotes and viruses. J replaces ~0.5% of T in the nuclear DNA of kinetoplastida and is mainly present in the telomeric repeat sequence (GGGTTA)n. Synthesis of J-base containing DNA oligos by chemical methods allowed the identification of a 93 kDa J-binding protein 1 (JBP1) in extracts of T. brucei, Leishmania species and Crithidia fasciculata. It is hypothesized that JBP1 catalyzes the first and rate-limiting step in J biosynthesis, the hydroxylation of T in DNA. For references and very interesting molecular-level details of conformational dynamics of binding of JBP1 to DNA with J (glucosylated 5hmU) or 5hmC, see recent work by Heidebrecht et al.
What’s next for C—the “wild card” base in DNA?
In a review of this rapidly evolving and curious molecular biology of cytosine, Nabel et al. offer the view that, “[t]aken together, this rich medley of alterations renders cytosine a genomic ‘wild card’, whose dependent functions make the base far more than a static letter in the code of life.” They also offer the following opinions on future directions.
First, there are pressing questions that need to be explored related to whether cytosine is endowed with a unique set of chemical properties that lead to its remarkable methylation, oxidation, and deamination biochemistry. Might there be other epigenetic DNA base modifications and derived biological functions not yet discovered? Given the advances in metabolomics, isotopic labeling, and sensitive instrumentation, perhaps new DNA modifications will be detected and tracked.
Second, several precedents suggest reevaluation of the scope of reactions catalyzed by known DNA cytosine-modifying enzymes. TET enzymes may catalyze other oxidations, and TDG might excise other modified nucleotides. DNMT enzymes are now known to catalyze the addition of aldehyde moieties, not just a methyl group. Might they do more?
Third, there should be more bioinformatics-guided searching for novel enzymes that modify DNA, such as a putative decarboxylase for 5caC, and also more traditional biochemical approaches using DNA containing modified nucleobases.
Finally, and perhaps most importantly, there is a need for novel chemical biology tools to detect site-specific modifications, akin to what has been done already for DNA methylation patterns using, for example, bisulfite sequencing. This critical need has already been addressed by Korlach and coworkers in the case of 5hmC by strand-specific, base-resolution detection of 5hmC in genomic DNA with single-molecule sensitivity, combining a bioorthogonal, selective chemical labeling method of 5hmC with single-molecule, real-time (SMRT) DNA sequencing.
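For anyone unfamiliar with how bisulfite sequencing reads out methylation, the logic is simple enough to sketch in a few lines of Python. Bisulfite treatment deaminates unmethylated C to U, which is read as T after amplification, whereas 5mC is protected and still reads as C. The toy sequence, the 0-based positions and the function names below are all made up for illustration, and the sketch ignores incomplete conversion, sequencing errors and the opposite strand.

```python
def bisulfite_read(reference, methylated):
    """Simulate the read obtained after bisulfite treatment and PCR.
    Unmethylated C reads as T; positions in `methylated` (0-based) stay C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(reference)
    )


def call_methylation(reference, read):
    """Positions where a reference C survived conversion, i.e. was protected."""
    return [
        i for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "C"
    ]


reference = "ACGTCGACCG"  # hypothetical toy sequence
read = bisulfite_read(reference, methylated={1, 8})
print(read)
print(call_methylation(reference, read))
```

One known catch is that 5hmC also resists bisulfite conversion, so a conventional bisulfite read cannot tell 5mC from 5hmC. That is part of why dedicated tools for the newer modifications are needed.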
Oh, let’s not forget about RNA!
Chuan He’s article in Nature Chem. Biol. in 2010 is entitled “Grand Challenge Commentary: RNA epigenetics?” Therein he discusses examples of RNA modification and demodification that may impact biological regulation. These include RNA base methylation and dioxygenases that use iron, α-ketoglutarate and dioxygen to perform oxidation of modified RNA bases for demethylation or hypermodification. Professor He posits that post-transcriptional RNA modifications can be dynamic and might have functions beyond fine-tuning the structure and function of RNA. Furthermore, understanding these RNA modification pathways and their functions may allow researchers to identify new layers of gene regulation at the RNA level.
I certainly agree. Do you? As always, your comments are welcomed.