In Search of RNA Epigenetics: A Grand Challenge

  • Methylated riboA and riboC are the most commonly detected nucleobases in epigenetics research
  • Powerful new analytical methods are key tools for progress
  • Promising PacBio sequencing and novel “Pan Probes” reported   

In a Grand Challenge Commentary published in Nature Chemical Biology in 2010, Prof. Chuan He at the University of Chicago opined that “[p]ost-transcriptional RNA modifications can be dynamic and might have functions beyond fine-tuning the structure and function of RNA. Understanding these RNA modification pathways and their functions may allow researchers to identify new layers of gene regulation at the RNA level.”

Like other scientists who get hooked by certain Grand Challenges, I became fascinated by this possibility of yet “new layers” of genetic regulation involving RNA, either as conventional messenger RNA (mRNA) or more recently recognized long noncoding RNA (lncRNA). Part of my intellectual stimulation was related to the fact that some of my past postings have dealt with both lncRNA and recent advances in DNA epigenetics, so the notion of RNA epigenetics seemed to tie these together.

After doing my homework on recent publications related to possible RNA epigenetics, it became apparent that this posting could be logically divided into commentary on the following three major questions: what are prevalent epigenetic RNA modifications, what might these do, and where is the field going? Future directions were addressed by interviews with two leading investigators: Prof. Chuan He, who is mentioned above, and Prof. Tao Pan, who has been involved in cutting-edge methods development.

RNA Epigenetic Modifications

More than 100 types of RNA modifications are found throughout virtually all forms of life. These are most prevalent in ribosomal RNA (rRNA) and transfer RNA (tRNA), and are associated with fine tuning the structure and function of rRNA and tRNA. Comments here will instead focus on mRNA and lncRNA in mammals, wherein the most abundant—and far less understood—modifications are N6-methyladenosine (m6A) and 5-methylcytidine (m5C).

structures

Three Approaches to Sequencing m6A-Modified RNA

Discovered in cancer cells in the 1970s, m6A is the most abundant modification in eukaryotic mRNA and lncRNA. It is found at 3-5 sites on average in mammalian mRNA, and up to 15 sites in some viral RNA. In addition to this relatively low density, specific loci in a given mRNA are a mixture of unmodified and methylated A residues, which has made it very difficult to detect, locate, and quantify m6A patterns. Fortunately, that has changed dramatically with the advent of various high-throughput “deep sequencing” technologies, as well as other advances.

(1.) Antibody-based m6A-seq 

An impressive breakthrough publication in Nature in 2012 by a group of investigators in Israel reported novel methodology called m6A-seq for determining the positions of m6A at a transcriptome-wide level. This approach, which is a variant of methylated DNA immunoprecipitation (MeDIP or mDIP), combines the high specificity of an anti-m6A antibody with Illumina’s massively parallel sequencing of randomly fragmented transcripts following immunoprecipitation. These researchers summarize their salient findings as follows.

“We identify over 12,000 m6A sites characterized by a typical consensus in the transcripts of more than 7,000 human genes. Sites preferentially appear in two distinct landmarks—around stop codons and within long internal exons—and are highly conserved between human and mouse. Although most sites are well preserved across normal and cancerous tissues and in response to various stimuli, a subset of stimulus-dependent, dynamically modulated sites is identified. Silencing the m6A methyltransferase significantly affects gene expression and alternative splicing patterns, resulting in modulation of the p53 (also known as TP53) signaling pathway and apoptosis. Our findings therefore suggest that RNA decoration by m6A has a fundamental role in regulation of gene expression.”

Moreover, their concluding sentence refers back to He’s aforementioned Grand Challenge Commentary about RNA epigenetics in 2010, just two years earlier.

“The m6A methylome opens new avenues for correlating the methylation layer with other processing levels. In many ways, this approach is a forerunner, providing a reference and paving the way for the uncovering of other RNA modifications, which together constitute a new realm of biological regulation, recently termed RNA epigenetics.”

(2.) Promising PacBio Single-Molecule Real-Time (SMRT) Sequencing of m6A

In a previous post, I praised PacBio (Pacific Biosciences) for persevering in development of its SMRT sequencing technology that uniquely enables, among other things, direct sequencing of various types of modified DNA bases via differentiating the kinetics of incorporating labeled nucleotides. Attempts to extend the SMRT approach to sequencing m6A were recently reported by PacBio in collaboration with Prof. Pan (see below) and others in J. Nanobiotechnology in April 2013. Using model synthetic RNA templates and HIV reverse transcriptase (HIV-RT), they demonstrated adequate discrimination of m6A from A; however, “real” RNA samples having complex ensembles of tertiary structures proved to be problematic. Alternative engineered RTs that are more processive and accommodative of labeled nucleotides were said to be under investigation in order to provide longer read lengths and appropriate incorporation kinetics.

The authors are optimistic about solving these technical problems, and concluded their report by stating:

  “[w]e anticipate that the application of our method may enable the identification of the location of many modified bases in mRNA and provide detailed information about the nature and the dynamic RNA refolding in retroviral/retro-transposon reverse transcription and in 3’-5’ exosome degradation of mRNA.”

Let’s hope that this is achieved soon!

(3.) Nanopore Sequencing of m6A?

It’s too early to be sure, but continued incremental advances in possible approaches to nanopore sequencing suggest applicability to m6A. As pictured below, Bayley and coworkers describe a method that uses ionic current measurement to resolve ribonucleoside monophosphates or diphosphates (rNDPs) in α-hemolysin protein nanopores containing amino-cyclodextrin adapters.

Taken from Bayley and coworkers in Nano Lett. (2013)

The accuracy of base identification is further investigated through the use of a guanidino-modified adapter. On the basis of these findings, an exosequencing approach for single-stranded RNA (ssRNA) is envisioned in which a processive exoribonuclease (polynucleotide phosphorylase, PNPase) presents sequentially cleaved rNDPs to a nanopore. Although extension of this concept to include m6A has yet to be demonstrated, earlier feasibility studies by Ayub & Bayley have shown discrimination of m6A (and other modified bases) from unmodified ribobases.

Two Probe-Based Methods for Detecting Specific m6A Sites

(1.) “Pan Probes”

As the saying goes, “what goes around comes around”, and in this instance it’s the repurposing of 2’-O-methyl (2’OMe) modified RNA/DNA/RNA oligos. This general class of chemically synthesized chimeric “gapmers” was originally used for RNase H-mediated cleavage of mRNA in antisense studies. Very recently, however, Pan and coworkers have cleverly adapted these probes (which I like to alliteratively refer to as “Pan Probes”) to m6A detection in mRNA and lncRNA.

For details see the SCARLET workflow; taken from Pan and coworkers in RNA (2013)

Pan Probes are “7-4-7 gapmers” consisting of seven 2’OMe RNA nucleotides on either side of four DNA nucleotides, the latter of which straddle known (or suspected) m6A sites, as depicted in the cartoon shown. The indicated series of steps, which involve site-specific cleavage and radioactive-labeling followed by ligation-assisted extraction and thin-layer chromatography, is thankfully called SCARLET by these investigators.

Pan and coworkers used SCARLET to determine the m6A status at several sites in two human lncRNAs and three human mRNAs, and found that the m6A fraction varied between 6% and 80% among these sites. However, they also found that many m6A candidate sites in these RNAs were not modified. Obviously, much more work needs to be done to collect data for deciphering dynamic patterns and implications of m6A RNA epigenetic modifications. Meanwhile, these investigators note that SCARLET is, in principle, applicable to m5C, pseudouridine, and other types of epigenetic RNA modifications.

Readers interested in designing and investigating their own Pan Probes can obtain these 7-4-7 gapmers by using TriLink’s OligoBuilder® and simply selecting “PO 2’OMe RNA” from the Primary Backbone dropdown menu, typing the first 7 bases in the Sequence box, selecting the 4 DNA bases from the Chimeric Bases menu and then typing the remaining 7 2’OMe RNA bases.
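To make the 7-4-7 layout concrete, here is a minimal Python sketch that lays out a gapmer complementary to a window of target RNA. The function name, the `site_offset` parameter (where the candidate A is assumed to sit within the four target nucleotides paired with the DNA gap) and the example sequence are all illustrative assumptions, not details taken from Pan and coworkers or from OligoBuilder; the published protocol should be consulted for the exact placement of the gap relative to the candidate site.

```python
# Hypothetical helper: lay out a 7-4-7 chimeric probe complementary to a target RNA.
RNA_COMP = {"A": "U", "U": "A", "G": "C", "C": "G"}  # used for the 2'OMe RNA flanks
DNA_COMP = {"A": "T", "U": "A", "G": "C", "C": "G"}  # used for the central DNA gap


def design_gapmer(target_rna, site_index, site_offset=1, flank=7, gap=4):
    """Return (5' flank, DNA gap, 3' flank) of a probe, each written 5'->3'.

    site_index  -- 0-based position of the candidate A in target_rna
    site_offset -- ASSUMED 0-based position of that A within the `gap`
                   target nucleotides that pair with the DNA segment
    """
    target_rna = target_rna.upper()
    if target_rna[site_index] != "A":
        raise ValueError("site_index does not point at an adenosine")
    gap_start = site_index - site_offset
    start = gap_start - flank
    end = gap_start + gap + flank
    if start < 0 or end > len(target_rna):
        raise ValueError("target is too short around the candidate site")
    # The probe is the reverse complement of the target window.
    reversed_window = target_rna[start:end][::-1]
    five_flank = "".join(RNA_COMP[b] for b in reversed_window[:flank])
    dna_gap = "".join(DNA_COMP[b] for b in reversed_window[flank:flank + gap])
    three_flank = "".join(RNA_COMP[b] for b in reversed_window[flank + gap:])
    return five_flank, dna_gap, three_flank


# Made-up 18-nt target with a candidate A at index 8.
print(design_gapmer("CCCCCCCGACUGGGGGGG", 8))
```

The three returned segments map directly onto the ordering steps described above: the first seven bases typed as 2’OMe RNA, the four chimeric DNA bases, and the final seven 2’OMe RNA bases.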

(2.) Probes for High-Resolution Melting

In a new approach very recently reported by Golovina et al. at Lomonosov Moscow State University, the presence of m6A at a specific position of an mRNA or lncRNA molecule is detected using a variant of high-resolution melting (HRM) analysis, a technique also applicable to, for example, single-nucleotide genotyping. The authors suggest that this method lends itself to screening many samples in a high-throughput assay following initial identification of loci by sequencing (see above). The method uses two labeled probes, one with 5’-FAM and another with 3’-BHQ1 (both available from TriLink’s OligoBuilder®), that hybridize to a particular query position in a total RNA sample, as shown below for a 23S rRNA model system. The presence of m6A lowers the melting temperature (Tm), relative to A, with a magnitude that is sequence-context dependent.

Taken from Golovina et al. Nucleic Acids Res. (2013).

The authors studied various probe-target constructs, and recommend 12–13-nt-long probes containing a quencher and >20-nt-long probes containing a fluorophore. They also advise that the quencher-containing oligonucleotide hybridize to the RNA such that m6A is directly opposite the 3′-terminal nucleotide carrying the quencher. The authors point out that relatively low-abundance, non-ribosomal targets need partial enrichment by, for example, simple molecular weight-based purification or commercially available kits. In this regard, they estimate that, if a particular type of mRNA were present at 10,000 copies per mammalian cell, 10^7 cells would be required to analyze m6A by this HRM method.
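To put that estimate in molar terms, the short Python calculation below multiplies copies per cell by the number of cells and converts the result to picomoles using Avogadro's constant. The inputs are the figures quoted from Golovina et al., taking the cell number as ten million (10^7); the function name is only illustrative, and the conversion is my own back-of-the-envelope arithmetic rather than a number stated by the authors.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def target_amount_pmol(copies_per_cell, n_cells):
    """Total amount of a transcript, in picomoles, across n_cells cells."""
    molecules = copies_per_cell * n_cells
    return molecules / AVOGADRO * 1e12


# 10,000 copies per cell in 10^7 cells gives 10^11 target molecules.
amount = target_amount_pmol(10_000, 1e7)
print(f"{amount:.3f} pmol")  # prints "0.166 pmol"
```

In other words, the quoted cell number corresponds to a sub-picomole quantity of target mRNA, which helps explain why partial enrichment is recommended for low-abundance transcripts.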

m5C Analysis by Sequencing of Bisulfite-Converted RNA

Selective reaction of bisulfite with C but not m5C in RNA, analogous to that long used for DNA, provides the basis for determining C-methylation status by sequencing. As detailed by Squires et al. in Nucleic Acids Res. in 2013, bisulfite-converted RNA can be sequenced by either of two methods: conversion to cDNA, cloning, and conventional sequencing, or conversion to a next-generation sequencing library. These authors described their salient findings as follows.

“We confirmed 21 of the 28 previously known m5C sites in human tRNAs and identified 234 novel tRNA candidate sites, mostly in anticipated structural positions. Surprisingly, we discovered 10,275 sites in mRNAs and other non-coding RNAs. We observed that distribution of modified cytosines between RNA types was not random; within mRNAs they were enriched in the untranslated regions and near Argonaute binding regions… Our data demonstrates the widespread presence of modified cytosines throughout coding and non-coding sequences in a transcriptome, suggesting a broader role of this modification in the post-transcriptional control of cellular RNA function.”
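The logic behind bisulfite-based m5C calling can be illustrated with a toy Python example. Bisulfite converts unmethylated C to U, which reads as T after conversion to cDNA, whereas m5C is protected and still reads as C. The functions and the example sequence below are hypothetical, and they ignore real-world complications such as incomplete conversion, read alignment, and sequencing error, so this is a sketch of the principle rather than the pipeline used by Squires et al.

```python
def bisulfite_read(rna, methylated_positions):
    """Simulate the cDNA read of a bisulfite-treated RNA molecule.

    Unmethylated C is converted and read as T; C at a position listed in
    methylated_positions (0-based) is protected and still read as C.
    """
    read = []
    for i, base in enumerate(rna.upper()):
        if base == "C" and i not in methylated_positions:
            read.append("T")  # converted cytosine
        elif base == "U":
            read.append("T")  # U is read as T in cDNA
        else:
            read.append(base)
    return "".join(read)


def m5c_fractions(reference_rna, reads):
    """For each C in the reference, the fraction of reads that retain C."""
    fractions = {}
    for i, base in enumerate(reference_rna.upper()):
        if base == "C":
            retained = sum(1 for read in reads if read[i] == "C")
            fractions[i] = retained / len(reads)
    return fractions


# Made-up reference with C at positions 2, 4 and 5.
reference = "GACUCCGA"
# Three molecules carry m5C at position 4; one molecule is unmodified.
reads = [bisulfite_read(reference, {4})] * 3 + [bisulfite_read(reference, set())]
print(m5c_fractions(reference, reads))  # {2: 0.0, 4: 0.75, 5: 0.0}
```

Positions where most reads still show C are candidate m5C sites, while positions read as T in every molecule were unmethylated.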

“Writing, Reading, and Erasing” RNA Epigenetic Modifications

Enzyme-mediated post-transcriptional RNA methylation (aka “writing”) and demethylation (aka “erasing”) are critical processes to identify and fully characterize in order to elucidate RNA epigenetics, and are formally analogous to those operative for DNA epigenetics.

Studies of RNA epigenetic “writing” mechanisms have focused on N6-adenosine-methyltransferase 70 kDa subunit, an enzyme that in humans is encoded by the METTL3 gene and is involved in the posttranscriptional methylation of internal adenosine residues in eukaryotic mRNAs to form m6A. According to Squires et al., two m5C methyltransferases in humans, NSUN2 and TRDMT1, are known to modify specific tRNAs and have roles in the control of cell growth and differentiation.

As for “erasing”, in 2011, He’s lab discovered the first RNA demethylase, abbreviated FTO, for fat mass and obesity-associated protein, which has efficient oxidative demethylation activity targeting m6A in RNA in vitro. They also showed for the first time that this erasure of m6A could significantly affect gene expression regulation. In 2013, He’s lab discovered the second mammalian demethylase for m6A, ALKBH5, which affects mRNA export and RNA metabolism, as well as the assembly of mRNA processing factors, suggesting that reversible m6A modification has fundamental and broad functions in mammalian cells.

So, if Mother Nature evolved these mechanisms for writing and erasing RNA epigenetic modifications, what about the equally important, in-between process of “reading” them? He and Pan and collaborators have very recently reported insights into such reading. They showed that m6A is selectively recognized by the human YTH domain family 2 (YTHDF2) “reader” protein to regulate mRNA degradation. They identified over 3,000 cellular RNA targets of YTHDF2, most of which are mRNAs, but also include non-coding RNAs, with a conserved core motif of G(m6A)C. They further established the role of YTHDF2 in RNA metabolism, showing that binding of YTHDF2 results in the localization of bound mRNA from the translatable pool to mRNA decay sites. The carboxy-terminal domain of YTHDF2 selectively binds to m6A-containing mRNA, whereas the amino-terminal domain is responsible for the localization of the YTHDF2–mRNA complex to cellular RNA decay sites. These findings, they say, indicate that the dynamic m6A modification is recognized by selectively binding proteins to affect the translation status and lifetime of mRNA.
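Because the reported core motif is G(m6A)C, a first-pass list of candidate adenosines in a transcript can be generated simply by scanning for GAC. The Python sketch below does only that. The function name and example sequence are made up for illustration, and a motif match is at best a candidate: as the SCARLET results discussed earlier show, many candidate sites turn out not to be modified.

```python
import re


def candidate_m6a_sites(sequence):
    """Return 0-based positions of every A that sits in a GAC core motif."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input as well
    # m.start() is the position of the G, so the A is one base further on.
    return [m.start() + 1 for m in re.finditer("GAC", rna)]


# Made-up transcript fragment containing two GAC motifs.
print(candidate_m6a_sites("AUGGACUUGACA"))  # [4, 9]
```

Positions found this way would still need experimental confirmation, for example by one of the sequencing or probe-based methods described above.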

Expert Opinions of the Future for RNA Epigenetics

As I’ve said here before, there is no crystal ball for accurately predicting the future in science, although scientists do enjoy imagining that there is. Opinions of two “hands on” experts in the emerging field of RNA epigenetics are certainly of interest in this regard. Below are some comments from the aforementioned Prof. Tao Pan and Prof. Chuan He, provided via an email interview in which I posed the question, ‘What do you see as the most important developments for RNA epigenetics?’ These experts have thrown down the gauntlet, so to speak, by asserting RNA epigenetics as a Grand Challenge.

Prof. Tao Pan

“In my opinion, the biggest current challenge for the field is to develop methods that can perturb m6A modification at specific sites to assess m6A function directly in specific genes. RNA interference or overexpression of an mRNA may simply decrease or increase modified and unmodified RNA alike. In a few cases, mutation of a known m6A site in an mRNA resulted in additional modification at a nearby consensus site, so that one cannot simply assume that mutation of a known site would not lead to cryptic sites nearby that may perform the same function. Further, functional understanding of a specific site should also take into account that all currently known m6A sites in mRNA and viral RNA are incompletely modified, so that one may need to explain why cells simultaneously maintain two RNA species that differ only at the site of m6A modification.”   

Prof. Chuan He

“The m6A modification is much more abundant than other RNA modifications in mammalian and plant nuclear RNA and is currently the only known reversible RNA modification. The m6A maps of various organisms/cell types need to be obtained. High-resolution methods to obtain transcriptome-wide, base-resolution maps are important. A future focus should be to connect the reversible m6A methylation with functions, in particular, the studies of the reader proteins that specifically recognize m6A and exert biological regulation. The first example of the YTHDF2 work just published in Nature (above) is a good example. We believe many other reader proteins exist and impact almost all aspects of mRNA metabolisms or functions of lncRNA.

Besides m6A, there are m5C, pseudoU, 2′-OMe, and potentially other modifications in mRNA and various non-coding RNAs (such as the recently discovered hm6A and f6A). The methods to map these modifications (except m5C) need to be developed and their biological functions need to be elucidated. 

Lastly, potential reversal of rRNA and tRNA modifications needs to be studied. As I stated in the Commentary in 2010, dynamic RNA modifications could impact gene expression regulation resembling well-known dynamic DNA and histone modifications. I think now we have enough convincing data to indicate this is indeed the case. The future is bright.”

Very bright, indeed! Your comments about this posting are welcomed.