Curiously Circular RNA (circRNA) Gets Curiouser

  • circRNA Molecules Have, Oddly, No Beginning or End
  • circRNA Are Now Recognized as Regulators of Gene Expression 
  • A Flurry of New Findings Indicate circRNA Are Also Templates for Synthesis of Proteins Having As Yet Unknown Functions

Electron micrograph of ~3,000-nt circRNA. Taken from Matsumoto et al. PNAS (1990).

About a year ago, my blog titled Curiously Circular RNA pointed out that circular RNA (circRNA) in animals are odd molecules in that, unlike the vast majority of other RNA in animals, circRNA have no structural beginning (5’) or end (3’). This very curious feature has, not surprisingly, stimulated considerable scientific interest in knowing more about these molecules, which were serendipitously discovered some 30 years ago.

Application of next-generation sequencing has revealed that circRNA are actually relatively abundant and evolutionarily conserved, which implies biological importance rather than inconsequential mistakes during RNA splicing. Some circRNA have been shown to have function: circRNA can hybridize to complementary microRNA (miRNA), and thus serve as a kind of ‘sponge’ that influences miRNA-based gene expression. Evidence for circRNA involvement in gene expression continues to grow, as there are now >700 items on “circRNA [and] sponges” in Google Scholar.

Very recently published lines of research (that I’ll outline in what follows) implicate circRNA as coding templates for proteins, a role that heretofore has been exclusively associated with messenger RNA (mRNA). Current dogma holds that translation of mRNA into protein requires recognition of the 7-methylguanylated (m7G) 5’-cap structure to start ribosome binding, while the 3’-poly(A) tail protects the mRNA molecule from enzymatic degradation and aids in stopping translation, as depicted below.

Taken from Shoemaker & Green Nature Structural & Molecular Biology (2012).

Start and stop structural elements characteristic of mRNA are obviously not present in circRNA, which are literally just circles of RNA. Consequently, finding proteins encoded by circRNA has stirred up controversy about whether such proteins are a new and fundamentally important aspect of genetics or just inconsequential biochemical mistakes.

Translation of circRNA in Fly Head Neurons

Fruit fly. Taken from turbosquid.com

Researchers at The Hebrew University of Jerusalem in Israel, in collaboration with a team at Max-Delbruck-Center for Molecular Medicine in Berlin, Germany, recently reported in Molecular Cell the first compelling evidence that a subset of circRNA is translated in vivo. The study by Kadener & coworkers was carried out using the common fruit fly (Drosophila melanogaster), which is known to have a number of features that lend themselves to investigations of circRNA: (1) >2,500 fruit fly circular RNAs have been rigorously annotated, (2) these mostly derive from back-splicing (pictured below) of protein-coding genes, (3) hundreds of them are conserved across multiple Drosophila species, and (4) they exhibit commonalities with mammalian circRNA.

Direct back-splicing: a branch point in the 5’ intron attacks the splice donor of the 3’ intron. The 3’ splice donor then completes the back-splice by attacking the 5’ splice acceptor forming a circRNA. Taken from Jeck & Sharpless Nature Biotechnol (2014).

This study by Kadener & coworkers involves a plethora of technically complex experimental procedures and associated jargon, from which I’ve extracted what I believe to be some key points to share. After annotating the Drosophila circRNA open reading frames (cORFs), which, by definition, have the potential for translation, they searched for evidence of their translation utilizing previously published ribosome footprinting (RFP). This led to identification of 37 circRNAs with at least one specific RFP read, referred to as ribo-circRNAs.
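To make the idea of a circRNA open reading frame more concrete, here is a minimal Python sketch of how one might scan a circular sequence for ORFs that can read through the back-splice junction. It is only an illustration and not the annotation pipeline used by Kadener & coworkers: the function name `circular_orfs`, the `min_codons` parameter, the choice of AUG as the only start codon, and the two-lap reading cap are all assumptions made for this example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"


def circular_orfs(seq, min_codons=2):
    """Find AUG-initiated ORFs on a circular RNA (hypothetical helper).

    The sequence is treated as a circle: index 0 follows the last index,
    so the back-splice junction sits between the last and first nucleotide.
    Returns a list of (start_index, orf_with_stop_codon, crosses_junction).
    Reading is capped at two laps, so a frame with no in-frame stop codon
    (possible on a circle) is skipped rather than reported.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input too
    n = len(seq)
    if n < 3:
        return []

    def codon_at(i):
        # Read three nucleotides starting at i, wrapping around the circle.
        return "".join(seq[(i + k) % n] for k in range(3))

    results = []
    max_codons = (2 * n) // 3  # assumption: stop after two full laps
    for start in range(n):
        if codon_at(start) != START_CODON:
            continue
        codons = []
        pos = start
        for _ in range(max_codons):
            codon = codon_at(pos)
            codons.append(codon)
            if codon in STOP_CODONS:
                break
            pos += 3
        else:
            continue  # no in-frame stop codon found within the cap
        coding_codons = len(codons) - 1  # exclude the stop codon
        if coding_codons < min_codons:
            continue
        end = start + 3 * len(codons)  # unwrapped index one past the stop
        results.append((start, "".join(codons), end > n))
    return results


# Toy 12-nt circle: the AUG near the end reads across the junction.
print(circular_orfs("CCUAAGGGAUGG"))
```

In this toy circle the only AUG sits near the end of the written sequence, so the ORF can be completed only by reading across the junction, which is the kind of circle-specific product that a linear transcript of the same exons could not encode.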

Taken from Jeck & Sharpless Nature Biotechnology (2014)

Several representative ribo-circRNAs were then constructed to each have (pictured below) a metallothionein (MT) promoter and V5 tag to facilitate translation and anti-V5 antibody-based detection of the expected protein after transfection into cells.

To determine whether circRNAs are translated in a more relevant tissue, they set up the RFP methodology in fly heads. A genetic locus named mbl that is known to produce a circRNA (circMbl3) at high abundance was selected for targeted mass spectrometry from a fly head immunoprecipitated MBL. They utilized synthetic peptides to determine characteristic spectra for which to search in the fly head immunoprecipitate and found a consistent and very high confidence hit for a peptide that can only be produced by circMbl3.

Kadener & coworkers extended these fly head findings to mammalian mouse and rat systems, but the most interesting part of this study—in my opinion—dealt with what signals ribosome binding and translation in the absence of the 5’ cap structure present in mRNA. They demonstrated circRNA translation under conditions intended to block normal 5’ cap-dependent translation of mRNA, and concluded that “[untranslated regions] of ribo-circRNAs (cUTRs) allow cap-independent translation [and that] further research is necessary to uncover how these sequences promote translation.”

Remarkably, as you’ll now read, another group of investigators has apparently found how such promotion of circRNA translation can occur.

Translation of circRNA is Driven by N6-Methyladenosine (m6A)

The most abundant modification of RNA in eukaryotes is m6A, which has been recently shown by Li et al. to recruit binding proteins that collectively facilitate the translation of specifically targeted mRNAs—i.e. those “marked” with m6A—through interactions with 40S and 60S ribosome subunit “machinery” that actually carry out translation. Contemporaneously, Yang et al. found that m6A likewise promotes efficient initiation of protein translation from circRNAs in human cells. They discovered that consensus m6A motifs are enriched in circRNAs, and a single m6A site is sufficient to drive translation initiation.
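As a rough illustration of what searching for consensus m6A motifs in a circRNA could look like, the following Python sketch scans a sequence for the commonly cited RRACH consensus (R = A or G; H = A, C or U). Please note the assumptions: RRACH is taken from the general m6A literature rather than from the text above, and the function name `find_rrach_sites` and its `circular` flag are hypothetical. This is not the analysis performed by Yang et al.

```python
import re

# Assumption: RRACH consensus (R = A/G, H = A/C/U); the methylated
# adenosine is the third base of each 5-nt match.
# The lookahead lets overlapping motifs be reported as well.
RRACH = re.compile(r"(?=([AG][AG]AC[ACU]))")


def find_rrach_sites(seq, circular=True):
    """Return (start_index, motif) for every RRACH match (hypothetical helper).

    With circular=True, the first four nucleotides are appended to the end
    so that motifs spanning the back-splice junction are also found.
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    scan = seq + seq[:4] if circular and n >= 5 else seq
    return [(m.start(), m.group(1)) for m in RRACH.finditer(scan) if m.start() < n]


print(find_rrach_sites("CCGGACUCC"))                   # motif inside the sequence
print(find_rrach_sites("ACUCCCCGG"))                   # motif spanning the junction
print(find_rrach_sites("ACUCCCCGG", circular=False))   # missed when read linearly
```

The second call shows why circularity matters here: the motif exists only once the two ends are joined, so a purely linear scan would miss it.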

As depicted below, this m6A-driven translation requires initiation factor eIF4G2 and m6A “reader” YTHDF3. Experiments showed that this translation is enhanced by methyltransferase METTL3/14 and inhibited by demethylase FTO, which enzymatically “add” and “subtract” methyl (Me) groups on specific adenosines (A) in circRNAs, respectively. It has also been shown to be upregulated upon heat shock, which is a commonly employed method to induce “stress” in cells.

Taken from Yang et al.

Further analyses through polysome profiling, computational prediction and mass spectrometry revealed that m6A-driven translation of circRNAs is widespread, with hundreds of endogenous circRNAs having translation potential. Yang et al. concluded by stating that their “study expands the coding landscape of [the] human transcriptome, and suggests a role of circRNA-derived proteins in cellular responses to environmental stress.”

Zinc Finger Protein in Muscle Cell Development

Finally, and essentially contemporaneously with the two above-mentioned publications, a third independent investigation reported by Legnini et al. demonstrated selective circRNA downregulation using short-interfering RNAs (siRNAs). These reagents for RNA interference (RNAi) were used in an image-based functional genetic screen of 25 circRNA species, conserved between mouse and human, that are differentially expressed during myogenesis (i.e. formation of muscular tissue) in Duchenne muscular dystrophy myoblasts.

This siRNA/RNAi-based functional analysis provided one interesting case related to zinc finger protein 609 (circ-ZNF609)—a reported miRNA sponge—the phenotype of which could be specifically attributed to the circular form and not to the linear mRNA counterpart. Consistent with the circ-ZNF609 sequence having an ORF, they found that a fraction of circ-ZNF609 RNA is loaded onto polysomes and that, upon puromycin treatment, it shifted to lighter fractions, similar to mRNAs. The coding ability of this circRNA was proved through use of artificial constructs expressing circular tagged transcripts, and by CRISPR/Cas9—the trendy gene editing method about which I’ve already commented multiple times.

Despite all this evidence, Legnini et al. stated that they “have no hints on the molecular activity of the proteins derived from circ-ZNF609 and as to whether they contribute to modulate or control the activity of the counterpart deriving from the linear mRNA.”

In thinking about closing comments about this update in circRNA, I decided to emphasize that investigations in the field of RNA continue to reveal complexities that will require many more years of global attention to unravel and understand. In just the past decade or so we’ve learned about gene regulation by miRNA/siRNA, reclassification of “junk DNA” as encoding a myriad of long noncoding RNA (lncRNA), mRNA regulation by base-modifications, and curious circRNAs that are more than sponges, and likely encode hundreds (if not thousands) of proteins whose functions have yet to be elucidated. Amazing!

What are your thoughts about all of this?

Your comments are welcomed.

Postscript

After writing this blog, Panda et al. at the National Institute on Aging-Intramural Research Program, National Institutes of Health published a paper titled High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs. Here’s a snippet of the abstract which adds to the increasingly curious occurrence of circRNAs that begs, if you will, further research aimed at discovering functions of circRNA-derived proteins.

“Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions.”

DNA Day 2017

  • There are Now Millions of DNA-Related Publications
  • Some of the Top 5 Cited Papers on DNA Will Surprise You
  • You Probably Won’t Guess Top 5 Most Frequently Cited

Deciding what to post here in recognition of DNA Day 2017 was just as challenging as it has been in past years, primarily because there are so many different perspectives from which to choose. After much mulling, and several abandoned approaches, I settled on featuring DNA publications that have received the most citations, as an objective metric, not just my subjective opinions about topics I think are significant or otherwise interesting.

Before getting to the numbers of DNA-related papers and some of the most cited papers, here’s a quick recap of what was posted here in the past, starting with the inaugural blog four years ago:

2013—60th Anniversary of the Discovery of DNA’s Double Helix Structure

2014—My Top 3 “Likes” for DNA Day

2015—Celebrating Click Chemistry in Honor of DNA Day

2016—DNA Dreams Do Come True!

Explosive Growth of DNA Publications

Regular readers of my blogs will know that I frequently use the NIH PubMed database of scientific articles to find publications by searching keywords, phrases, or authors. A convenient feature of these searches is providing “results per year” that can be exported into Excel for various purposes. Some preliminary searches indicated that DNA-related articles can be indexed by either DNA or PCR, or cloning, or other terms among which sequencing was notable. The majority, however, were indexed as either DNA or PCR, which together gave nearly 1.7 million items—an astounding number. This number is even much greater since PubMed excludes some important chemistry journals, as well as patents.

Diving deeper into these numbers, I thought it helpful to look at the publication volumes and rates for DNA, sequencing DNA, and PCR through 2015 starting from 1953, 1977, and 1986, respectively. These respective dates correspond to seminal publications by Watson & Crick, Maxam & Gilbert, and Mullis & coworkers. The results shown in the following graph attest to my often stated “power of PCR” as the premier method in nucleic acid research, which we’ll see again below in another numerical context.

Top 5 Cited Papers

During my perusal of the above literature in PubMed generally related to DNA, I thought it would be interesting to find, and share here, which specific papers have the distinction of being most frequently cited. Citations are not available in PubMed, but are compiled in Google Scholar, which led me to these Top 5 that are listed from first to fifth.

Frederick Sanger (1918-2013) Taken from newscientist.com

  1. DNA sequencing with chain-terminating inhibitors

Frederick Sanger, the eponymous father of the “Sanger sequencing” method published in 1977, received the 1980 Nobel Prize in chemistry for this contribution. He also received the 1958 Nobel Prize in chemistry for sequencing insulin, and is the only person to win two Nobel Prizes in chemistry. Uber-famous DNA expert Craig Venter is quoted as saying that ‘Fred Sanger was one of the most important scientists of the 20th century,’ [who] ‘twice changed the direction of the scientific world.’

  2. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCT) method

Kenneth J. Livak, PhD
Taken from archive.sciencewatch.com

The most commonly used method to analyze data from real-time, quantitative PCR (RT-qPCR) experiments is relative quantification, which relates the PCR signal of the transcript of interest to that of a control sample such as an untreated control. The derivation, assumptions, and applications of this method were published in 2001 by Livak & Schmittgen. I overlapped with Ken Livak at Applied Biosystems, which pioneered commercialization of RT-qPCR reagents and instrumentation at the time. He is currently Senior Scientific Fellow at Fluidigm Corp.
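For readers who like to see the arithmetic, here is a minimal Python sketch of the relative quantification calculation usually written as fold change = 2^(−ΔΔCT). The function and argument names are hypothetical, the CT values in the example are invented purely for illustration, and the sketch assumes roughly 100% amplification efficiency for both the target and the reference gene (i.e. product doubles every cycle).

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) calculation (hypothetical helper).

    Assumes both amplicons double every cycle (about 100% efficiency).
    """
    # Normalize the target to the reference gene within each sample.
    dct_treated = ct_target_treated - ct_ref_treated
    dct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the untreated control.
    ddct = dct_treated - dct_control
    return 2.0 ** (-ddct)


# Invented example: the target crosses threshold 2 cycles earlier
# (relative to the reference gene) in the treated sample.
print(fold_change_ddct(22.0, 18.0, 24.0, 18.0))  # 4.0
```

A lower CT means more starting template, which is why a ΔΔCT of −2 corresponds to a 4-fold increase rather than a decrease.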

Sir Edwin M. Southern Taken from ogt.co.uk

  3. Detection of specific sequences among DNA fragments separated by gel electrophoresis

Sir Edwin Mellor Southern, FRS, the eponymous father of “Southern blotting” DNA fragments from agarose gels to cellulose nitrate filters published in 1975, is a Lasker Award-winning molecular biologist, Emeritus Professor of Biochemistry at the University of Oxford and a fellow of Trinity College. He is also Founder and Chief Scientific Advisor of Oxford Gene Technology.

Prof. Bert Vogelstein, MD
Taken from hhmi.org

  4. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity

This paper by Feinberg & Vogelstein, published in 1983, describes how to conveniently radiolabel DNA restriction endonuclease fragments to high specific activity. DNA fragments are purified from agarose gels directly by ethanol precipitation and are then denatured and labeled with the large fragment of DNA polymerase I, using random oligonucleotides as primers. Over 70% of the precursor triphosphate is routinely incorporated into complementary DNA, and specific activities of over 10⁹ dpm/μg of DNA can be obtained using relatively small amounts of precursor. These “oligolabeled” DNA fragments serve as efficient probes in filter hybridization experiments. Vogelstein’s group pioneered the idea that somatic mutations represent uniquely specific biomarkers for cancer patients, leading to the first FDA-approved DNA mutation-based screening tests, and now “liquid biopsies” that evaluate blood samples to obtain information about underlying tumors and their responses to therapy (an area that I’ve touted in previous blogs).

Kary B. Mullis, PhD
Taken from TED.com

  5. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase

In 1988, Kary B. Mullis and coworkers (then at Cetus Corp.) published in venerable Science a method using oligonucleotide primers and thermostable DNA polymerase from Thermus aquaticus to amplify genomic DNA segments up to 2000 base pairs to detect a target DNA molecule present only once in a sample of 10⁵ cells. Since that time, polymerase chain reaction (PCR)-related technology has evolved to now routinely enable a variety of single-cell analyses of DNA or RNA. Dr. Mullis received the 1993 Nobel Prize in chemistry for his 1983 invention of PCR, which his website says ‘is hailed as one of the monumental scientific techniques of the twentieth century.’

Top 5 Papers by Citation Frequency

While writing the above section, it occurred to me that ranking these five publications by total number of citations-to-date in Google Scholar doesn’t account for differences in the number of years between the year of publication and now. I did the math to calculate the average citation frequency per year, and here’s the totally surprising—to me—result: relative gene expression methodology published by Livak & Schmittgen is by far the most frequently cited of the Top 5, according to this way of ranking:

  1. 2001, relative gene expression: cited by 69,560 (4,637 avg. citations per year)
  2. 1977, Sanger sequencing: cited by 32,662 (1,701)
  3. 1975, Southern blotting: cited by 21,201 (796)
  4. 1988, PCR: cited by 18,785 (671)
  5. 1983, oligolabeled DNA: cited by 21,200 (642)
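The per-year averaging behind this ranking can be sketched in a few lines of Python. This is only my reconstruction of the arithmetic: the function name is hypothetical, and the reference year of 2016 is an assumption, chosen because it reproduces the listed averages for the 2001, 1988 and 1983 papers, which are the only entries used in the example.

```python
REFERENCE_YEAR = 2016  # assumption: inferred from the listed averages, not stated in the text


def avg_citations_per_year(pub_year, citations, reference_year=REFERENCE_YEAR):
    """Average citations per year since publication (hypothetical helper)."""
    years = reference_year - pub_year
    if years <= 0:
        raise ValueError("reference_year must be later than pub_year")
    return citations / years


# (publication year, Google Scholar citation count) as listed above.
papers = {
    "relative gene expression": (2001, 69560),
    "PCR": (1988, 18785),
    "oligolabeled DNA": (1983, 21200),
}

# Sort by average citations per year, highest first.
ranked = sorted(papers, key=lambda name: avg_citations_per_year(*papers[name]), reverse=True)
for name in ranked:
    year, cites = papers[name]
    print(f"{year}, {name}: {round(avg_citations_per_year(year, cites)):,} per year")
```

Dividing by years elapsed rewards recent papers, which is exactly why a 2001 method can outrank methods from the 1970s and 1980s on this measure.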

I should point out that, as transformative methods such as these gradually become widely recognized as “standard procedures,” researchers tend to feel it unnecessary to include a reference to the original publication. Consequently, citation frequency decreases with time even though cumulative usage increases. In other words, 25 years from now average citations per year for relative gene expression will have likely decreased, and be surpassed by a new “method of the decade,” so to speak.

Prediction for the Future

This line of reasoning leads me to close with some speculation about what DNA-related technique might emerge as the next “method of the decade” that tops the above ranking by citation frequency.

My guess is that it will be Multiplex genome engineering using CRISPR/Cas systems by Zhang & coworkers that has been cited by 4145 at the time I’m writing this piece, only four years from its publication in venerable Science in 2013. Some of my blogs have already commented on various aspects of CRISPR/Cas9, which is among genome editing tools offered by TriLink.

As usual, your comments are welcomed.

Autism Awareness Month – April 2017

  • Sequencing for Diagnosis of Autism Holds Promise
  • Several Genetic-Risk Testing Procedures are Available
  • More Than 40 Autism Publications Using TriLink Products

The first National Autism Awareness Month was declared by the Autism Society in April 1970 with the aim of educating the public about autism. Autism is a complex mental condition and developmental disability, characterized by difficulties in the way a person communicates and interacts with other people. Autism can be present from birth or form during early childhood, typically within the first three years. Autism is a lifelong developmental disability with no single known cause.

The puzzle pattern of this ribbon reflects the complexity of autism, while the colors and shapes represent the diversity of people and families living with this spectrum of disorders. Taken from drdiane.com

People with autism are classed as having Autism Spectrum Disorder (ASD) and the terms autism and ASD are often used interchangeably. The term “spectrum” refers to the wide range of symptoms, skills, and levels of disability in functioning that can occur in people with ASD, which includes Asperger syndrome. Some children and adults with ASD are fully able to perform all activities of daily living while others require substantial support to perform basic activities. ASD occurs in every racial and ethnic group, and across all socioeconomic levels. However, boys are significantly more likely to develop ASD than girls. The latest analysis from the U. S. Centers for Disease Control and Prevention (CDC) estimates that 1 in 68 children has ASD.

Taken from myaspergerschild.com

According to the CDC, “diagnosing ASD can be difficult, since there is no medical test, like a blood test, to diagnose the disorders. Doctors look at the child’s behavior and development to make a diagnosis.” More details from the CDC are provided at this link.

Notwithstanding this current difficulty for diagnosis of ASD, research has led to continuing progress toward possible blood tests for ASD, which is the focus of this blog and supplements an earlier posting here on treating autism with a broccoli nutraceutical.

ASD and Exome Sequencing

My Google Scholar search for “autism and sequencing” led to a mindboggling list of more than 47,000 items! When ordered by relevance rather than date of publication, two publications were each cited ~1,000-times following “back-to-back” appearance in venerable Nature magazine in 2012. This computes to a combined average of ~400 citations per year, or a fraction more than one citation per day on average, which to me signals significant attention by the ASD research community and thus worth commenting on herein.

Sanders et al., in the first of these two widely cited studies, carried out exome sequencing in 238 families wherein each pair of parents was unaffected by ASD but had a child who was affected (aka proband), and in 200 of these families there was an unaffected sibling. This study design feature is important in view of the widely held idea that complex personality traits are derived by a combination of “nature and nurture,” i.e. genetics inherited from parents and that which is learned or otherwise acquired by familial and all external events.

Before synopsizing what was found, I should note that germline single-base mutations spontaneously arise during mitosis in every generation, and are termed de novo single nucleotide variants (SNVs). Identifying SNVs remained refractory to analysis at the whole genome or exon level until the advent of next-generation sequencing (NGS) technologies.

Sanders et al. found that the total number of non-synonymous (i.e. changes in the amino acid sequence of proteins) de novo SNVs, particularly highly disruptive nonsense and splice-site de novo mutations, is associated with ASD. They concluded that their results “substantially clarify the genomic architecture of ASD, demonstrate significant association of three genes—SCN2A, KATNAL2 and CHD8—and predict that approximately 25–50 additional ASD-risk genes will be identified as sequencing [more] families is completed.”

Neale et al., in the second widely cited study, likewise conducted exome sequencing but on only 175 ASD probands and their parents. Nevertheless, they found that the proteins encoded by genes that harbored de novo non-synonymous or nonsense mutations showed a higher degree of connectivity among themselves and with previous ASD genes as indexed by protein-protein interaction screens. They concluded that their results “support polygenic models in which spontaneous coding mutations in any of a large number of genes increases risk by 5- to 20-fold,” but did acknowledge the strong evidence reported by Sanders et al. for individual genes as risk factors.

ASD Genetic-Risk Testing

The American Academy of Pediatrics (AAP) in 2013 issued a statement on ethical and policy issues for genetic screening of children for ASD that was prompted in part by then-recent progress by IntegraGen, a small French genomics company, on development of a gene test that uses a cheek swab to screen infants and toddlers for 65 genetic markers associated with autism. Highlights of the AAP’s statement include:

  • Genetic screening can be particularly useful for diagnosing older babies and children with developmental disorders such as autism.
  • Genetic screening should be made available for all newborns. However, parents should have the right to refuse screening after being informed of the benefits and risks.
  • The decision to offer testing or screening should be based primarily on the best interest of the child.

Taken from autismspeaks.org

By way of an update, I’m pleased to add that in a 2015 press release by IntegraGen it was announced that its ARISk® Test became the first test marketed in the U. S. to assess the risk of autism spectrum disorder in children. Among the following IntegraGen statements about the ARISk® Test, I think it’s most important to note the caveats I’ve bolded for emphasis:

  • The test does not confirm or rule out a diagnosis of ASD for the child tested.
  • The test is intended to be used together with a clinical evaluation and other developmental screening tools.
  • Intended for children with early signs of developmental delay or ASD and in children who have older siblings previously diagnosed with an autism spectrum disorder.
  • A genetic score, based on the total number of genetic markers associated with autism identified, is used to estimate the child’s risk of developing ASD.
  • Intended for use for children 48 months and younger. The ARISk® Test is not available for prenatal testing.

Taken from integragen.com

More recently, Courtagen—cofounded by my former Life Technologies colleague Kevin McKernan (coinventor of SOLiD® NGS)—has commercialized its sequencing analyses for ASD and other neurodevelopmental conditions. According to a Courtagen posting, “[i]n the absence of a known single-gene disorder, ASD likely involves a complex combination of both genetic and environmental factors that influence early brain development. Multi-gene panels, such as Courtagen’s devSEEK® panels, provide clinicians with information on a number of genes commonly associated with ASD and autistic features. Clinicians can then use information from multi-gene panels to tailor treatments that meet the patient’s unique genotype and symptoms.”

Some interesting—to me—logistical and operational information about devSEEK® (237 genes) is as follows:

  • Turn-around time for results is 4-6 weeks.
  • DNA for sequencing is extracted from a single saliva sample. No blood draw or muscle biopsy required; however, blood and muscle tissue are accepted.
  • Courtagen works with patients, physicians, and insurance carriers to pre-approve each test. Courtagen will bill the insurance company and is willing to handle an appeal process as needed.
  • A secure physician online portal is available for ordering genetic tests and accessing patient reports when completed. Genetic counselors are available to address questions regarding Courtagen test results.

ASD Research and TriLink

While mulling over how to conclude this Autism Awareness Month blog featuring genetic testing for ASD, I wondered about TriLink’s role in advancing autism research by virtue of its various nucleic acid-related products being used for autism investigations. I was pleased and proud to find more than 40 items by searching Google Scholar for articles with the words “autism and TriLink.”

Perusal of these items revealed that the most cited (450-times) report was a 2012 publication in highly regarded Cell titled MeCP2 Binds to 5hmC Enriched within Active Genes and Accessible Chromatin in the Nervous System, which used TriLink 5-methyl-2′-deoxycytidine-5′-triphosphate (5m-dCTP). Given the apparent significance of this publication, I won’t try to give a short, simplified synopsis but rather quote the following part of the authors’ summary:

“We report that 5hmC [5-hydroxymethylcytosine] is enriched in active genes and that, surprisingly, strong depletion of 5mC [5-methylcytosine] is observed over these regions. The contribution of these epigenetic marks to gene expression depends critically on cell type. We identify methyl-CpG-binding protein 2 (MeCP2) as the major 5hmC-binding protein in the brain and demonstrate that MeCP2 binds 5hmC- and 5mC-containing DNA with similar high affinities. The Rett-syndrome-causing mutation R133C preferentially inhibits 5hmC binding. These findings support a model in which 5hmC and MeCP2 constitute a cell-specific epigenetic mechanism for regulation of chromatin structure and gene expression.”

I also noted a 2016 Cutting-Edge Review in Arteriosclerosis, Thrombosis, and Vascular Biology titled A CRISPR Path to Engineering New Genetic Mouse Models. These investigators utilized TriLink Cas9 mRNA for gene editing analogous to that reported by others for CRISPR/Cas9-mediated knockout of the autism gene CHD8 (see above). This led to transcriptomic profiling showing that CHD8 regulates multiple genes implicated in ASD pathogenesis and genes associated with brain volume.

In conclusion, I must say that I learned much new information about autism while researching this blog, which I hope you found informative as well as interesting. If so, I have achieved my goal of either increasing or reaffirming your awareness of autism, and the availability of genetic risk-assessment tests.

As usual, your comments here are welcomed.

Postscript

Recently, a team of academic researchers in Arizona made headlines with their publication in Microbiome reporting ties between autism symptoms and the composition and diversity of a person’s gut microbes, aka “gut microbiome,” about which I’ve commented in several previous blogs.

The participants, who were 18 children with ASD (ages 7–16 years), underwent a 10-week treatment program involving antibiotics, a bowel cleanse, and daily fecal microbial transplants over 8 weeks. Remarkably, the new therapy seemed to provide some long-term benefits, including an 80% improvement of gastrointestinal symptoms associated with ASDs and roughly a 20% – 25% improvement in autism behaviors, including improved social skills and better sleep habits.

Click here for a simplified, educational video on this work by the principal investigator, Prof. James B. Adams at Arizona State University.

I should emphasize that this is a very small study, and much more research will be needed to verify and firmly establish possible benefits and risks. Interested readers should contact Prof. Adams regarding any questions they might have.

Evolving Polymerases to Do the Impossible

  • Polymerases Aren’t What They Used to Be! 
  • Scripps Team Evolves Polymerases That Read and Write With 2’-O-Methyl Ribonucleotides
  • Key Reagents for Romesberg’s “Molecular Moonshots” Are Supplied by TriLink BioTechnologies

Long-time devotees of these posts will likely remember a blog several years ago about Prof. Floyd Romesberg at the Department of Chemistry, The Scripps Research Institute who achieved a seemingly impossible feat: designing a new pair of complementary bases such that DNA replicating in E. coli would be composed of six bases, thereby creating a six-base genetic code that is expanded from Nature’s four-base code.

Floyd E. Romesberg. Taken from utsandiego.com

More recently, Romesberg has cleverly outfoxed Nature once again, this time by evolving nucleic acid polymerases into mutant polymerases that can do what heretofore seemed impossible. He and his research team’s publication (Chen et al.) is a tour de force of experimental methodology that is not easily read, and is even harder to simply summarize in a short space like this blog. Consequently, I’ll first tell you what was accomplished, then give a short synopsis of principal new methodology, and close by commenting on the significance of this fascinating work.

Doing the Impossible

Romesberg’s lab successfully achieved what I think of as “multiple molecular moonshots,” wherein a Taq polymerase (which normally reads and writes DNA during PCR) was evolved by novel selection (SELEX) methods into mutant polymerases that are able to transcribe DNA into 2’-O-methyl (2’-OMe) RNA, and reverse transcribe 2’-OMe RNA into DNA for PCR/sequencing.

As depicted below, this was exemplified using a 60-mer DNA template and 18-mer 2’-OMe RNA primer to produce a fully-modified 48-mer 2’-OMe RNA by means of an evolved mPol and all four A, G, C and U 2’-OMe NTPs, which I’m proud to say were bought from TriLink BioTechnologies! This type of molecular evolution of a polymerase has no precedent.

DNA template   5’ ————————————- 3’

RNA primer                                 ←←← 3’ xxxxxx 5’

mPol ↓ 2’-OMe NTPs

Determining the fidelity of this seemingly impossible molecular transformation was addressed by achieving a feat of comparable impossibility! As depicted below, the aforementioned 48-mer 2’-OMe RNA product was hybridized to a DNA primer for reverse transcription into a 48-mer complementary DNA (cDNA) strand, using an evolved mPol, together with all four A, G, C and T unmodified dNTPs, which were also purchased from TriLink. This unprecedented conversion of 2’-OMe RNA into cDNA was followed by conventional PCR/sequencing, the results of which demonstrated relatively high fidelity.

2’-OMe template   5’ xxxxxxxxxxxxxxxxxxxxxxxxxxx 3’

DNA primer                                           ←←← 3’ —— 5’

mPol ↓ dNTPs

cDNA                        3’ ————————————– 5’

How They Did It

In the selection cycle shown below, (1) phage-display libraries are used to expose individual polymerases (Pol) on E. coli cells in proximity to chemically attached primer/template complexes of interest, which are mixed with natural or modified triphosphates including biotin (green; B)-labelled UTP to extend the primer. (2) Phage that display active mutant polymerases (mPols) are isolated with streptavidin (SA) beads. After washing to remove nonspecific binders, phage cleaved from the beads are used to re-infect E. coli. (3) Heat-treated lysates of E. coli that express the recovered mPols are next subjected to plate-based screening using 96-well plates coated with primer/template complex and extension buffer that contains natural or modified triphosphates and B-UTP, incorporation of which is chromogenically detected. (4) Mutants that give rise to the most activity are selected for individual gel-based analysis, from which (5) promising candidates are selected for further diversification (e.g., by gene shuffling, as depicted) and then subjected to additional rounds of evolution.

Taken from Chen et al. Nature Chemistry (2017)
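
For readers who think in code, the select, screen, and diversify cycle described above can be loosely sketched as a toy simulation. This is only an illustration under heavy assumptions: the `activity` score, the ten-letter "sequences," and the `shuffle_genes` recombination rule are all hypothetical stand-ins that I made up for the sketch, and none of it reflects the actual assays or parameters used by Chen et al.

```python
import random

def activity(variant, target):
    """Toy 'polymerase activity': fraction of positions matching a hypothetical ideal sequence."""
    return sum(a == b for a, b in zip(variant, target)) / len(target)

def shuffle_genes(parent_a, parent_b, rng):
    """Toy gene shuffling: each position is inherited from one of the two parents."""
    return "".join(rng.choice(pair) for pair in zip(parent_a, parent_b))

def evolve(library, target, rounds=5, keep=4, seed=0):
    """Repeat select -> screen -> diversify and return the best variant found."""
    rng = random.Random(seed)
    size = len(library)
    for _ in range(rounds):
        # Selection and screening (steps 1-4 in the text): rank by activity, keep the top few.
        ranked = sorted(library, key=lambda v: activity(v, target), reverse=True)
        parents = ranked[:keep]
        # Diversification (step 5): shuffle random pairs of parents; parents are carried over.
        children = [shuffle_genes(*rng.sample(parents, 2), rng) for _ in range(size - keep)]
        library = parents + children
    return max(library, key=lambda v: activity(v, target))

# Hypothetical starting library of 20 random ten-letter variants.
rng = random.Random(1)
target = "ACDEFGHIKL"
start = ["".join(rng.choice(target) for _ in range(len(target))) for _ in range(20)]
best = evolve(start, target)
print(best, activity(best, target))
```

Because the best parents are carried into every new round, the top activity in this toy model can never go down from one round to the next, which is the basic logic behind iterating the cycle.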

What Is the Significance?

In a previous blog, I’ve commented on increasing interest in the utility of aptamers, which are oligonucleotides that can specifically bind small molecules or motifs in proteins, and thus be used to build electronic sensors or studied as potential therapeutic agents rivaling antibodies. Therapeutic aptamers, like antisense oligonucleotides, require incorporation of chemical modifications to impart stability toward nucleases in blood or cellular targets.

Burmeister et al. have previously reported methods for mPol transcription of a DNA template into a fully modified, nuclease-resistant 23-mer 2’-OMe RNA aptamer—also using TriLink’s 2’-OMe NTPs! However, they encountered considerable experimental difficulties in generating this therapeutically promising 23-mer against vascular endothelial growth factor. These technical issues have now been surmounted by the mPol-evolution approaches in the present work by Romesberg’s team, which enabled improved access to longer 2’-OMe RNA aptamers with reasonable efficiency and fidelity.

Moreover, the present study is the first to evolve an mPol for reverse transcription of fully modified 2’-OMe RNA into DNA, which can then be amplified by PCR and/or sequenced, thereby opening the door for a variety of new analytical methods. Most importantly, the molecular mechanism by which these remarkable mPol activities were evolved, namely, the stabilization of an interaction between the “thumb and fingers domains,” may be general and thus useful for the optimization of other Pols. In that case, we can look forward to further advances in evolving other Pols to do the impossible, hopefully using modified nucleotide triphosphates from TriLink!

As usual, your comments are welcome.

CRISPR-Mediated Interference (CRISPRi) of Long Non-Coding RNA (lncRNA)

  • More Methodology from CRISPR Mania
  • lncRNA Function Blocked by CRISPRi
  • Mysteries of lncRNA Can Now be Deciphered by CRISPRi

This blog is about yet another example of a powerful new methodology spawned by intense scientific interest in using CRISPR-related technologies. This near mania for all things CRISPR is reflected by there being ~5,000 (!) publications already in PubMed only ~5 years after seminal papers appeared.

I chose the present blog topic because it involves use of CRISPR for genome-wide identification of functional long non-coding RNA (lncRNA) in human cells. In an earlier blog about lncRNA, which are now recognized to be regulators of gene expression encoded by what was originally defined as “junk” DNA, it was pointed out that it is inherently difficult to experimentally identify such regulation by lncRNA. Thanks to CRISPR this task is now much less daunting as you’ll learn below, following a couple of introductory sections to set the stage.

Repurposing CRISPR/Cas9 Using “Dead” Cas9

Qi et al. very cleverly (at least to me) recognized that the CRISPR/Cas9 system could be repurposed as an RNA-guided platform for sequence-specific control of gene expression by finding a catalytically inactive mutant Cas9 protein that lacked the endonuclease (i.e. cutting) activity of wild-type Cas9, and instead blocked transcription by RNA polymerase (RNAP), as depicted below. These researchers coined the overall process as “CRISPR interference” (CRISPRi) and loosely referred to such a mutant Cas9 as “dead” Cas9 (dCas9).

Taken from Qi et al.

Interested readers are encouraged to consult this publication by Qi et al. to fully appreciate the extensive amount of work that went into translating the above concept into practice, and supporting the proposed mechanism of action. In my opinion, it’s a tour de force example of applying hypothesis-driven, state-of-the-art molecular biology to devise a new method, in this case specifically blocking transcription of a DNA region using CRISPRi in conjunction with target-specific short guide RNA (sgRNA).

Adding Functionality to Down-Regulate Transcription

Just as organic chemists can design and synthesize small molecules having desired functional properties, molecular biologists can design and produce complex macromolecules having desired functional elements. The latter is nicely exemplified by Gilbert et al., who demonstrated that fusion of dCas9 to transcription factor effector domains having repressive regulatory functions enables efficient transcriptional repression in human (or yeast) cells via sgRNA that target genes of interest.

Taken from Liu et al. (2017)

As depicted below, Gilbert et al. used dCas9 fused to the Krüppel associated box (KRAB) domain, which is a transcriptional repression domain, and Green Fluorescent Protein (GFP) as a reporter gene targeted by sgRNA. They employed RNA-sequencing to quantify the transcriptome of GFP-positive HEK293 cells expressing dCas9-KRAB or a negative control construct. It was shown that CRISPRi is highly specific, as GFP was the only gene that was significantly suppressed by GFP-targeting sgRNA. Averaged data from two independent biological replicates indicated that no gene other than GFP changes by >1.5-fold.
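
The ">1.5-fold" specificity criterion lends itself to a short sketch. The expression values and gene names below are hypothetical numbers I invented purely to illustrate the logic of averaging two replicates and flagging genes that change by more than 1.5-fold in either direction; they are not data from Gilbert et al., and the code assumes all averaged values are nonzero.

```python
# Hypothetical normalized expression values (two replicates each); not real data.
control = {"GFP": [100.0, 110.0], "GAPDH": [500.0, 520.0], "ACTB": [300.0, 310.0]}
treated = {"GFP": [10.0, 12.0], "GAPDH": [490.0, 530.0], "ACTB": [320.0, 300.0]}

def mean(values):
    return sum(values) / len(values)

def changed_genes(control, treated, threshold=1.5):
    """Genes whose replicate-averaged expression changes by more than `threshold`-fold, up or down."""
    hits = []
    for gene in control:
        ratio = mean(treated[gene]) / mean(control[gene])
        fold = max(ratio, 1 / ratio)  # treat 2-fold up and 2-fold down symmetrically
        if fold > threshold:
            hits.append(gene)
    return hits

print(changed_genes(control, treated))
```

In this made-up example only the targeted reporter passes the threshold, which is the pattern that the authors reported for their real transcriptome data.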

Genome-Scale CRISPRi to Identify Human lncRNA

According to Liu et al., it has not been possible to predict which lncRNA loci are functional or what function they perform. Consequently, there is a need for large-scale, systematic approaches to interrogating the functional contribution of lncRNA loci. This sizeable team of collaborators from various institutions in the San Francisco Bay area, therefore, developed a genome-scale screening platform using CRISPRi with dCas9-KRAB and a library of sgRNA.

Taken from Liu et al. (2017)

As depicted below for the overall approach, they first designed a CRISPRi Non-Coding Library, which targets 16,401 lncRNA genes each with 10 sgRNAs per transcription start site. The required 170,262 sgRNAs were not synthesized chemically, but rather produced intracellularly by first using array-based sgDNA synthesis followed by clonal (i.e. individual sgDNA sequence) incorporation into lentiviruses, which in turn were transduced into seven types of cells for screening. More detail on such lentiviral libraries is given in a Footnote at the end of this blog.

As indicated pictorially above, they applied this pooled screening approach to identify lncRNA genes that modify robust cell growth for induced pluripotent stem cells (iPSC) and six well-known, transformed human cell lines (K562, U87, etc.). This led to identification of 499 lncRNA loci that modified cell growth upon CRISPRi targeting.

Interestingly—at least to me—372 (~75%) and 299 (~60%) of these 499 growth-modifying lncRNA loci were distal to a protein coding gene (PCG) or mapped enhancer, respectively. The diagram below, taken from a review by Vance & Ponting, depicts “distal” effects of lncRNA away from PCG between two chromosomes (chr). What “triggers” transcription of the lncRNA from chr A and how it “finds” its cognate PCG on chr B are open and indeed intriguing questions.

Taken from Vance & Ponting (2014)
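
As a quick arithmetic check on the rounded percentages quoted above, the following few lines recompute them from the counts given in the text. Only the three counts come from Liu et al. as cited here; the variable and function names are my own.

```python
hits = 499            # growth-modifying lncRNA loci
distal_to_pcg = 372   # loci distal to a protein coding gene
distal_to_enh = 299   # loci distal to a mapped enhancer

def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole

pcg_pct = percent(distal_to_pcg, hits)
enh_pct = percent(distal_to_enh, hits)
print(f"{pcg_pct:.1f}% distal to a PCG, {enh_pct:.1f}% distal to an enhancer")
```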

In addition to these high percentages of distal effects, Liu et al. found the following surprising results with regard to cell-type specificity of lncRNA function:

“Remarkably, 89% of the lncRNA gene hits modified growth in just one of the cell lines tested, and no hits were common to all seven cell lines. Although nearly all of the hit genes were expressed in the cell line in which they exhibited a growth phenotype, expression alone was insufficient to explain the cell type specificity of their function.”

“[Thus,] in contrast to recent studies that found that essential protein-coding genes typically are required across a broad range of cell types, we show that lncRNA function is highly cell type-specific, a finding that has important implications for their involvement in both normal biology and disease.”

Following are some of the major unanswered questions about lncRNA posed in a review I recommend reading for more background on lncRNA:

  • How does the manner in which lncRNAs are transcribed, processed, and regulated differ from that of other RNAs?
  • Are lncRNAs evolutionarily conserved, both in terms of their primary sequences and secondary structures?
  • Are all lncRNAs functional? Which ones have detectable biological functions in cells or in the whole organism?
  • Does the pervasive transcription that generates the lncRNA transcripts play a regulatory role distinct from the steady-state accumulation of the lncRNAs?
  • Can lncRNAs be exploited for clinical applications and therapeutics?

After reading this review, I thought to myself that there are many open questions about lncRNA but no comprehensive answers yet deciphered. When I then checked Google Scholar for items with both “deciphered” and “lncRNA” as terms, I found there were over 1,800 such items. Evidently, there are quite a few authors who, like me, view unknown functions of lncRNA as a cipher. I suspect that much of the now mysterious lncRNA function will eventually be deciphered thanks, in part, to the power of CRISPRi.

Your comments are welcomed.

Footnote

Readers interested in lentiviral sgRNA library construction and use for screening target cells can find general information at this website from which the following self-explanatory schematic provides a high-level overview of the workflow.

Taken from cellecta.com

Nanopore Sequencing by Synthesis (Seq-by-Syn)

  • Yet Another Notable Achievement Involving George Church, ‘The Most Interesting Scientist in the World’ 
  • Team of 30 Coauthors Reports Seq-by-Syn with DNA Polymerase-Nanopore Protein Construct on an Integrated Chip
  • Challenging Improvements Needed for Commercial Reality

Prof. George M. Church. Taken from evolutionnews.org

Devotees of my blog will know that I’m prone to word play such as calling myself a “huge” fan of “tiny” nanopores for DNA sequencing, about which I’ve previously opined. They will also recall that I’m an admitted scientific admirer of George Church, who I think is The Most Interesting Scientist in the World.

Having said this, it’s not surprising that I closely follow what’s trending in nanopore sequencing, and also make an attempt to read all of Church’s papers as they get published because they are almost invariably quite interesting, involve “big ideas,” and in some new way are very educational, at least for me. Following are my comments about a recently published paper on nanopore sequencing in the venerable Proceedings of the National Academy of Sciences of the United States of America (aka PNAS) wherein Church is the designated corresponding author.

Backstory

The seminal origins and early history of nanopore sequencing have been recently chronicled and criticized, then clarified, in Nature Biotech in several “To the Editor” items, which collectively provide enlightening insights into who did what when, so to speak. Those of us who are ‘Nanoporati’ (a clever term tweeted by Nick Loman) should definitely read those Nature Biotech items. For now, however, I’ll set the stage, as it were, by echoing a bit of what I’ve posted in the past for nanopores.

Nanopore sequencing of DNA, first described in patented but prophetic (i.e. no data) methods, is actually a relatively old (~20-year) idea posited by Church and other creative visionaries. On the other hand, nanopore sequencing was first reduced to practice commercially not too long ago by Oxford Nanopore Technologies (ONT). The many years of delay between concept and commercialization were due to the need for gradual evolution of lots of “nanopore-ology” and sequencing biochemistry, as well as development of highly sophisticated electronics and complex algorithms for data analysis.

Nanopore Sequencing-by-Scanning (Seq-by-Scan)

Taken from rsc.org

As depicted below, and as can be best seen in a video, ONT’s commercially available MinION Seq-by-Scan system essentially involves threading a strand of DNA through a protein-based nanopore and converting resultant ionic current fluctuations into nucleotide base sequence.

While there are issues with base-calling accuracy, the remarkably small and readily portable MinION provides fast, real-time sequencing results for a wide variety of applications. These include unique or otherwise compelling Point-of-Care analyses, such as pathogen surveillance, which has been achieved in remote geographical locations and even in outer space aboard the International Space Station, as I’ve previously posted.

Nanopore Seq-by-Syn

In contrast to DNA Seq-by-Scan using a nanopore, which is challenged by pore-based differentiation of similarly sized A, G, C, and T bases, DNA Seq-by-Syn has no such limitation as it uses the DNA as a template for base-by-base (i.e. stepwise) detection of enzymatic synthesis of complementary DNA. Various Seq-by-Syn methods and challenges have been discussed elsewhere, and currently available commercial systems include those from Illumina and PacBio. The former employs nucleotides that are reversible terminators equipped with cleavable fluorescent “tags” on each base. The latter detects fluorescently labeled tags on polyphosphates released upon nucleotide incorporation.

The presently featured DNA Seq-by-Syn publication by Stranges et al., which builds upon two earlier reports cited therein, differs from the above approaches by using nanopore-based detection of mass tags rather than fluorescent tags. In principle, mass tags could afford higher accuracy compared to DNA Seq-by-Scan. However, as will now be explained, achieving improved accuracy is far easier said than done.

The general approach taken to demonstrate proof-of-concept for mass-tagged nanopore DNA Seq-by-Syn is depicted below in simplified cartoon form, but involves a true tour de force—in my opinion—of three key technologies. The first is design and synthesis of the nucleotides with appropriate mass tags, which involves very sophisticated chemistry that is best appreciated by reading detailed, extensive supporting information (SI) for Stranges et al. and SI for an earlier publication by Fuller et al. In a nutshell, these nucleotides have 5’-hexaphosphates linked to relatively large mass tags comprised of complex oligonucleotide structures.

Taken from Stranges et al. PNAS 2016

The second area of technical innovation involves attachment of a single molecule of ϕ29 DNA polymerase to each α-hemolysin (αHL) nanopore in such a manner as to retain its enzyme activity and be positioned such that every released mass tag transits through (i.e., is “captured” by) the nanopore leading to base identification by its current signature. As depicted below in two related representations, each of these heteroheptameric pores is comprised of one modified αHL subunit to which a peptidyl SpyTag moiety is attached, and six unmodified αHL subunits. This allows attachment of one ϕ29 DNA polymerase molecule modified with a cognate peptidyl SpyCatcher moiety at a predetermined, time-average distance from the pore.

Taken from Stranges et al. PNAS 2016.

The third key area of innovation deals with insertion of the enzyme-pore conjugate into a lipid bilayer residing on a silanized array (aka chip) of 256 Ag/AgCl electrodes such that there is one functional pore per electrode. Interested readers are encouraged to consult the publication for details, as well as check out related fabrication and methods patents that I found by searching Google Scholar.

Representative Results

The first image shown above depicts what base tag-specific detection would ideally look like if each of the four different bases had a characteristic current-blockage intensity and persistence. In addition, all pores would ideally function similarly. Not surprisingly, given the stochastic nature of single-molecule systems in general, Stranges et al. found less than ideal behavior.
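
To make the "ideal" case concrete, here is a minimal sketch of how tag calling could work if every tag really did have its own characteristic current level. The four levels, the tolerance, and the example trace are hypothetical values chosen for illustration; they are not taken from Stranges et al., and real signal processing would be far more involved.

```python
# Hypothetical mean blocked-current levels (arbitrary units) for the four tags;
# illustrative only, not measured values.
TAG_LEVELS = {"A": 0.20, "C": 0.35, "G": 0.50, "T": 0.65}

def call_tag(current, levels=TAG_LEVELS, tolerance=0.05):
    """Return the tag whose level is nearest to `current`, or None if nothing is close enough."""
    base, level = min(levels.items(), key=lambda item: abs(item[1] - current))
    return base if abs(level - current) <= tolerance else None

def call_trace(currents):
    """Call each capture event in order, dropping events that match no tag level."""
    calls = (call_tag(c) for c in currents)
    return "".join(b for b in calls if b is not None)

# Hypothetical series of capture-event currents; 0.90 matches no tag and is ignored.
trace = [0.21, 0.34, 0.36, 0.90, 0.49, 0.66]
print(call_trace(trace))
```

The less-than-ideal behavior described in the text corresponds to levels that overlap, drift, or differ from pore to pore, in which case a simple nearest-level rule like this one would start to fail.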

For example, out of 70 single pores obtained, 25 captured two or more tags, whereas only six of those pores showed detectable captures of all four tagged nucleotides. Data obtained for the pore with the most transitions between tag capture levels (i.e. the best results) is shown below, while results for the other five are given in the SI.

Taken from Stranges et al. PNAS 2016

To quote the authors:

“All four characteristic current levels for the tags and transitions between them can be readily distinguished…Homopolymer sequences in the template, and repeated, high-frequency tag capture events of the same nucleotide in the raw sequencing reads were considered a single base for sequence alignment. We recognized 12 clear sequence transitions in a 20-s period. Out of the 12 base transitions observed in the data, 85% match the template strand, showing that this method can produce results that closely align to the template sequence.” 

Interested readers need to consult and carefully read the SI for Stranges et al. regarding the interpretation of the “repeated, high frequency capture events,” such as that exhibited by C in the above current vs. time plot.
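
The rule quoted above, in which repeated captures of the same nucleotide and homopolymer stretches in the template are each counted as a single base, can be sketched in a few lines. The example sequences are hypothetical, and the simple position-by-position comparison is an assumption made for brevity; it is not the alignment procedure used by the authors and does not handle insertions or deletions.

```python
from itertools import groupby

def collapse_repeats(calls):
    """Merge each run of identical bases into one base, e.g. 'CCCC' -> 'C'."""
    return "".join(base for base, _ in groupby(calls))

def fraction_matching(read, template):
    """Fraction of positions that agree between two sequences (no gaps allowed)."""
    n = min(len(read), len(template))
    if n == 0:
        return 0.0
    return sum(r == t for r, t in zip(read, template)) / n

raw_read = "AACCCCGTTCG"   # hypothetical raw tag calls, with repeated captures
template = "ACCGGTAAG"     # hypothetical template containing homopolymers

read = collapse_repeats(raw_read)
reference = collapse_repeats(template)
print(read, reference, fraction_matching(read, reference))
```

Note that collapsing repeats trades away the ability to count bases within a homopolymer in exchange for a cleaner comparison of base-to-base transitions.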

All of the above snippets in aggregate suggest to me that, while this huge amount of work has made progress toward one approach to Seq-by-Syn, many improvements will need to be made before achieving a robust system to successfully compete in the commercial sector.

Authorship, Affiliations, and Acknowledgments

The relatively large team of 30 coauthors listed for Stranges et al. includes the following numbers of investigators and affiliations: 1 at Arizona State Univ., 4 at Harvard, 11 at Columbia University, and 14 at Genia Technologies, which is a Santa Clara, CA company that was acquired by Roche in 2014, and is part of Roche Sequencing.

Acknowledgments in Stranges et al. refer to support by Genia and NIH Grant R01 HG007415, which I found was awarded to coauthors George M. Church (Harvard), Jingyue Ju (Columbia), and James J. Russo (Columbia). The end of the abstract of this grant reads as follows:

“The nanopore chips will be enhanced and expanded from the current 260 nanopores to over 125,000 using advanced nanofabrication techniques. We will conduct real-time single molecule Nano-SBS on DNA templates with known sequences to test and optimize the overall system. These research and development efforts will lay the foundation for the production of a commercial single molecule electronic DNA sequencing platform, which will enable routine use of sequencing for medical diagnostics and personalized medicine.”

The conflict of interest statement in Stranges et al. indicates that the technology described therein (called “Nanopore SBS”) has been exclusively licensed by Genia, and that specified coauthors are entitled to royalties through this license. In addition, Church is a member of the Scientific Advisory Board of Genia.

Parting Comments

Long gone are the days when government-funded academic researchers thumbed their noses, if you will, at commercial development. Nowadays almost all academics parlay their government grants into university patents that get licensed to companies, usually with some type of corporate involvement of said academics.

I hasten to add that I’m not implying that NIH-funded academic research being a “seed” for corporate profitability is negative, especially in view of NIH’s Small Business Innovation Research (SBIR) program, but rather view it as a paradigm shift for the better, as it allows academic creativity to be harnessed into applications that can hopefully greatly benefit society.

In conclusion, and coming back to George Church, whom I highlighted in the introduction to this blog, I must say that he might very well be the academic researcher with the longest list of technology transfer, advisory roles, and founded companies (13 to date) according to a public list that is truly mind boggling, at least to me.

As usual, your comments are welcomed.

Postscript

After I wrote this blog, Roche announced on December 15, 2016 that “it has officially notified Pacific Biosciences (PacBio) of its intention to terminate its [2013] agreement and efforts to develop a sequencing instrument for use in the clinical research and clinical market using their Single Molecule, Real-Time (SMRT®) technology,” about which I have commented previously. The announcement went on to say Roche would instead “focus on internal development efforts” and “actively pursue multiple technologies and commercial strategies.” A GenomeWeb headline was more specific: “Roche Will Focus on Genia’s Nanopore Technology for Dx Market After Ending Deal With PacBio.”

On December 30, 2016 it was reported that the University of California (UC) filed a patent suit against the Chief Technology Officer (CTO) at Genia, and Genia Technologies, claiming the CTO produced key inventions during his time at UC that he later assigned to Genia, but which should have automatically been assigned to UC. Stay tuned…

Top 10 Innovations 2016

  • Sequencing with Sequel, but with a Shocking Surprise
  • Endonuclease CRISPR-Cas9 Makes the Cut Twice—Pun Intended
  • My Predicted 2016 Innovation Winners Made Lots of News

Welcome to my first blog of the New Year, 2017! My New Year’s resolution is to continue to do my best in providing interesting and informative content about what’s trending in nucleic acid research. As in the past, this first blog of the year comments on the Top 10 Innovations in 2016 that were selected by a panel of judges and published last month in The Scientist. Also as in the past, you can peruse TriLink’s top products by clicking here.

So, with an imaginary loud flourish of trumpets, read on to learn about the 10 winners, starting with 1st place.

  1. Milo from ProteinSimple enables single-cell Western blotting in a benchtop instrument that allows researchers to search for specific proteins in about 1,000 single cells simultaneously. A titered cell-suspension is pipetted on a 1-by-3-inch glass microscope slide covered by a 30-micron gel-layer dotted with 6,400 microwells. Some wells will remain empty, but ~1,000 will collect individual cells for antibody-based protein analysis, following lysis and other steps, all done automatically. Indeed simple!
  2. ExVive Human Kidney Tissue from Organovo is a replica of the kidney proximal tubule created using 3D bioprinting, which is like uber-trending 3D printing with plastic, but instead uses tiny aggregates of cells. This novel product offers drug developers a reliable means of testing for renal toxicity that is more predictive than traditional cell culture, and avoids animal testing.
  3. Sequel System from Pacific Biosciences is neither small (see below) nor inexpensive ($350,000), but is nevertheless a third the size and weight, and half the cost, of PacBio’s original long-read, single molecule, real-time (SMRT) sequencer, PacBio RS II, about which I’ve favorably blogged several times in the past. Moreover, Sequel has seven times the throughput of PacBio RS II.

Taken from fiercebiotech.com

Development of what would become Sequel was announced by PacBio in 2013 as part of a potentially $75M deal with Roche Diagnostics aimed at DNA sequencing-based products for clinical diagnostics. Surprisingly—if not shockingly—in December 2016, after Sequel was launched, Roche stated it was terminating this deal with PacBio in order to focus on its internal development efforts.

At this time, I can only speculate that Roche’s internal efforts include single-molecule DNA sequencing using nanopore technology developed by Genia Technologies, Inc., which Roche acquired in 2014 for $125M—plus even greater contingent payments. By remarkable coincidence, I had been drafting a blog about Genia from a purely technical perspective, but will now update that for posting on January 24th as a sequel to Sequel—pun intended.

  4. Lumos from Axion Biosystems is a 48-well light-delivery device allowing researchers to incorporate cutting-edge optical assay techniques, such as optogenetics, into their in vitro research. Lumos delivers user-specified intensity and duration of light with up to four wavelengths simultaneously for assay flexibility.
  5. LentiArray CRISPR Libraries from Thermo Fisher Scientific make applying lentivirus-encoded CRISPR for gene editing more accessible to researchers. Given the continuing explosive-like interest in DNA endonuclease CRISPR-Cas9, which in 2015 also made the Top 10 cut (pun intended), I was expecting to find this endonuclease system among the 2016 Top 10, too. What surprised me, however, was seeing this system split into two parts (pun intended), i.e. CRISPR per se and separately as Cas9, which you’ll find below.
  6. nCounter Vantage 3D Panels from NanoString utilize an automated microscope that counts color-coded barcodes conjugated to target molecules that are mRNAs, DNAs, proteins, and even phosphorylation status of proteins, all at the same time, which I think is quite a technical achievement. The nCounter analysis system ranges from $149,000 to $280,000, and the nCounter Vantage 3D Panels run from $275 per sample and upward.

Taken from nanostring.com

  7. ZipChip from 908 Devices is a cleverly designed microfluidic chip that radically speeds up sample prep for mass spectrometry, requires only a few microliters of sample volumes, and broadens the range of materials that a mass spectrometer can handle. In less than 3 minutes per sample, ZipChip can process cell growth media, cell lysates, blood, plasma, or urine. The 1-by-3-inch chip is in a small box, less than a foot long, which mounts directly onto a mass spec. The device costs $30,000 and an auto sampler adds another $20,000 to the price.
  8. HAP1 Cells from Horizon have earned a spot in The Scientist Top 10 Innovations for the third year in a row, this time as Turbo GFP Tagged HAP1 Cells, which were selected for their ability to tag proteins of interest with green fluorescent protein (GFP) without requiring that the gene be overexpressed. Turbo GFP cells, which are custom-made using CRISPR-Cas9 gene-editing technology, cost $3,400 and take about 16-18 weeks to develop.

Roger Y. Tsien (February 1, 1952 – August 24, 2016).
Taken from wikipedia.org

As a sad side note, Roger Y. Tsien, who shared the 2008 Nobel Prize in Chemistry with Osamu Shimomura and Martin Chalfie for the discovery and development of GFP, passed away in 2016 at the age of 64.

  9. Prime sCMOS Camera from Photometrics is a 4.2-megapixel camera that has a built-in algorithm to reduce what is called “shot noise” (the variation inherent in measurements taken using light microscopes) without having to acquire many extra images and then average across them, or increase the light intensity, which can damage samples. The Prime sCMOS camera costs $15,950.
  10. GeneArt Platinum Cas9 Nuclease from Thermo Fisher Scientific is a recombinant Streptococcus pyogenes Cas9 protein purified from E. coli that contains a nuclear localization signal that aids in delivery to target-cell nuclei, where Cas9 works in conjunction with CRISPR. I should add that CRISPR-based gene editing is alternatively achieved by transfection of Cas9 mRNA, which is offered by TriLink, and used as described in a recent exemplary publication by a team of international collaborators.

Revisiting Jerry’s Predictions for 2016 Top 10 Innovations

Devotees of my blog may recall the following predictions I offered in January 2016 as winning innovations-to-be:

  • Direct Genomics in China will resurrect and morph Helicos single-molecule sequencing into a diagnostics instrument.
  • GnuBIO—acquired by Bio-Rad—will offer its long-delayed next-generation sequencing system.

While none of these were selected by The Scientist, they did make news in various ways:

  • Oxford Nanopore’s VolTRAX system for automated sample preparation has very recently been launched. I should also note that its tiny MinION sequencer was rocket-launched (literally) to the International Space Station (ISS) for evaluation in rapid identification of pathogens that might infect astronauts on the ISS, as I’ve commented on in a previous blog. Launch puns intended.
  • Direct Genomics recently announced its GenoCare Analyzer, the world’s first single-molecule genome sequencer that is being engineered exclusively for the clinic, and promises to improve the cost, speed, and quality of clinical genome sequencing by directly reading a patient’s original DNA or RNA molecules without prior amplification.
  • GnuBIO now offers a fully integrated sequencing platform which allows users to simply load genomic DNA onto a cartridge, place the cartridge into the GnuBIO sequencer, and then press “run.” Within hours, results can be exported directly from the instrument with real-time informatics onboard.

Although my picks weren’t among those in The Scientist list for 2016, I take satisfaction in believing that choosing winning biotechnology products is like art appreciation or judging beauty, both of which are in the eye of the beholder, who can disagree on what their eyes behold.

As usual, your comments are welcomed.

New CRISPR System Reported for Targeting RNA Instead of DNA

  • Current “CRISPR Craze” for DNA Editing is Catalyzing Creativity
  • Early CRISPR Innovator Feng Zhang Now Reports Targeting RNA
  • This New “C2c2” System has Specificity Issues But is Nevertheless Promising

Just when you thought that the “CRISPR craze” would soon transition from the fundamental discovery phase to the improvements phase, something entirely new for CRISPR has come along. That something, recently published by Feng Zhang and others in venerable Science magazine, targets RNA instead of DNA. Consequently, this may lead to transient vs. permanent editing, as well as other RNA- vs. DNA-based applications.

Before commenting further on this exciting new RNA-targeting approach using CRISPR, here are a few snippets about the original DNA version of CRISPR to set the stage, and to substantiate my tongue-in-cheek reference to the craze about it.

Backstory

Taken from igtrcn.org

Editing with CRISPR, which is short-form for CRISPR-Cas9, uses sequence-specific guide RNA (gRNA) to target DNA for cutting by Cas9 nuclease, as depicted below. Guide RNA and Cas9 can be introduced into cells either encoded in a vector or as synthetic gRNA and synthetic Cas9 mRNA, which TriLink offers in either wild-type or base-modified forms of Cas9 mRNA. CRISPR for genome editing was publicly described in Science in 2012 by co-corresponding authors Jennifer A. Doudna, a biologist at the University of California, Berkeley, and the French microbiologist Emmanuelle Charpentier. But Feng Zhang, at the Broad Institute, was first to obtain a patent on the technique.

Not surprisingly, given the financial potential for DNA editing by CRISPR, which has been called the ‘biotech discovery of the century,’ there is ownership litigation. This dispute is getting rather ugly, if you will, according to an article in Science titled Accusations of errors and deception fly in CRISPR patent fight.

Potential financial gain aside, PubMed stats I found clearly substantiate the craze factor in numeric terms: >4,000 publications to date with a rapidly increasing trajectory, i.e. ~600 in 2014 and ~1,200 in 2015, which is an average of roughly 3 publications every day in that year!

By the way, in the chart above that I made for CRISPR publications in PubMed, there was only one report in 2002, which was the first publication to identify CRISPR. These Dutch investigators used computer analysis to find a novel family of repetitive DNA sequences that is present among both domains of the prokaryotes (Archaea and Bacteria), but absent from eukaryotes or viruses. They noted that “[t]his family is characterized by direct repeats, varying in size from 21 to 37 bp, interspaced by similarly sized non-repetitive sequences. To appreciate their characteristic structure, we will refer to this family as the clustered regularly interspaced short palindromic repeats (CRISPR).” 

Taken from youtube.com

CRISPR was also selected as 2015 Science Breakthrough of the Year, and is featured in an interesting YouTube video that is definitely worth watching, in my opinion.

Enough said about CRISPR editing of DNA. Let’s move on to RNA editing with CRISPR, which offers a fundamentally different approach: whereas DNA editing makes permanent changes to the genome of a cell, a CRISPR-based RNA-targeting approach may allow researchers to make temporary changes. Moreover, this can be adjusted up or down, and may one day provide greater specificity and functionality than existing methods for RNA interference (RNAi) using either siRNA or antisense oligos.

CRISPR Targeting RNA

Feng Zhang. Taken from mit.news.edu

At the risk of seeming to be too trendy, this section heading could have read “Feng Zhang 2.0” in that Zhang at the Broad Institute, along with co-corresponding author Eugene V. Koonin at NIH and uber-famous “Broadster” Eric Lander plus others on a large team, have characterized a new CRISPR system that targets RNA but not DNA. In their recent Science publication they demonstrated that this new system involves a Class 2 type VI-A CRISPR-Cas effector, aptly abbreviated C2c2 (pronounced “see too, see too”), that has RNA-guided RNase function.

The researchers originally identified C2c2 in the bacterium Leptotrichia shahii (L. shahii) in a systematic search for previously unidentified CRISPR systems within diverse bacterial genomes. They focused on C2c2 because its sequence contained two copies of a domain called higher eukaryotes and prokaryotes nucleotide-binding (HEPN) that has only been found in RNases. Mutating the putative catalytic site within either of C2c2’s HEPN domains abolished RNA cutting in vitro, suggesting that both HEPN domains are necessary for C2c2 to work.

CRISPR-C2c2 from L. shahii reconstituted in E. coli to mediate interference of the RNA phage MS2 via crRNA facilitated by the two HEPN nuclease domains. Taken from Abudayyeh et al.

But C2c2 Cleaves Collateral RNA—a Case for “Lemons into Lemonade”?

Unlike Cas9, which cuts DNA only within the sequence dictated by the CRISPR gRNA, C2c2 was found to make cuts within the target sequence and adjacent, nonspecific sequences. While this collateral cleavage obviously presents a specificity problem, Zhang and his colleagues were able to create a deactivated C2c2 (dC2c2) variant by alanine substitution of any of the four predicted HEPN domain catalytic residues. To me this clever trick is like converting “lemons into lemonade” in that undesired non-specific cleavage is transformed into a programmable RNA-binding protein having potential utility.

For example, the investigators speculate that the ability of dC2c2 to bind to specified sequences could be used in the following ways:

  • Bring effector modules to specific transcripts in order to modulate their function or translation, which could be used for large-scale screening, construction of synthetic regulatory circuits, and other purposes.
  • Fluorescently tag specific RNAs in order to visualize their trafficking and/or localization.
  • Alter RNA localization through domains with affinity for specific subcellular compartments.
  • Capture specific transcripts through direct pull-down of dC2c2 in order to enrich for proximal molecular partners including RNAs and proteins.

Listen to Zhang’s Grad Students 

While the details of this seminal work published in Science are not easily summarized, its practical implications have been concisely translated, if you will, by first coauthor Omar O. Abudayyeh and second coauthor Jonathan S. Gootenberg (both graduate student members of the Zhang lab) in three short videos that I encourage you to watch at this link.

Left: Omar Abudayyeh. Taken from zlab.mit.edu Right: Jonathan Gootenberg Taken from zlab.mit.edu

Publication protocol generally lists coauthors in order of contribution, so in this C2c2 publication that has many coauthors, these fellows know what they’re talking about because they did lots of the lab work. Congrats to them, Feng Zhang (again), and all of the other contributors.

As always, your comments are welcomed and encouraged.

RNA World Revisited

  • Scripps Researchers ‘Evolve’ an RNA-Amplifying RNA Polymerase 
  • It’s Used for First Ever All-RNA Amplification Called “riboPCR”
  • TriLink Reagent Plays a Role in this Remarkably Selective in Vitro Evolution Method 
Prof. Gerald Joyce & Dr. David Horning. Photo by Madeline McCurry-Schmidt. Taken from scripps.edu

Those of you who regularly read my blog will recall an earlier posting on “the RNA World,” which was envisioned by Prof. Walter Gilbert in the 1980s as a prebiotic place billions of years ago when life began without DNA. That post recommended reading more about this intriguing hypothesis by consulting a lengthy review by Prof. Gerald Joyce. Now, Prof. Joyce and postdoc David Horning have advanced the hypothesis one step further by reporting the first ever amplification of RNA by an in vitro-selected RNA polymerase, thus providing significant supportive evidence for the RNA World. Following are their key findings, which were enabled in part by a TriLink reagent—read on to find out which one and how!

In Vitro Evolution of an RNA Polymerase

Horning & Joyce designed an in vitro selection method to chemically “evolve” an RNA polymerase capable of copying a relatively long RNA template with relatively high fidelity. The double emphasis on “relatively” takes into account that the RNA World would have many millions of years to evolve functionally better RNA polymerases capable of copying increasingly longer RNA templates with increasingly higher fidelity.

As depicted below, they started with a synthetic, highly structured ribozyme (black) wherein random mutations were introduced throughout the molecule at a frequency of 10% per nucleotide position to generate a population of 10¹⁴ (100,000,000,000,000) distinct variants to initiate the in vitro evolution process. Step 1 involved 5’-5’ click-mediated 1,2,3-triazole (Ø) attachment of an 11-nt RNA primer (magenta) partially complementary to a synthetic 41-nt RNA template (brown) encoding an aptamer that binds guanosine triphosphate (GTP). In Steps 2 and 3, the primer hybridizes to template and is extended by polymerization of A, G, C and U triphosphates (cyan).

Taken from Horning & Joyce, Proc. Natl. Acad. Sci., 2016

GTP aptamer showing red and cyan sequences corresponding to above cartoon. Taken from Horning & Joyce, Proc. Natl. Acad. Sci., 2016

Step 4 involves binding of aptameric structures to immobilized GTP (green), then photocleavage of the 1,2,3-triazole linkage in Step 5, followed by reverse transcription to cDNA and conventional PCR in Step 6 for transcription into ribozymes in Step 7. Twenty-four rounds of this evolution by selection were carried out, progressively increasing the stringency by increasing the length of RNA to be synthesized and by decreasing the time allowed for polymerization. By the 24th round, the population could readily complete the GTP aptamer shown above. Subsequent cloning, sequencing and screening were then used to characterize the most active polymerase, which was designated “24-3.”

The TriLink “Connection”

2'-Azido-dUTP (aka 2'-azido-UTP)

The aforementioned in vitro evolution process actually involves tons of experimental details that interested readers will need to consult in the published paper, which is accompanied by an extensive Supporting Information section. In the latter, a subsection titled Primer Extension Reaction describes 3’ biotinylation of the template RNA strand (brown in above scheme) using TriLink “2’-azido-UTP” (more properly named 2’-azido-dUTP) and yeast poly(A) polymerase, followed by click connection of the RNA template’s 3’-terminal 2’-azido moiety to biotin-alkyne. This very clever functionalization of the RNA template strand allowed for subsequent capture of the double-stranded primer extension reaction products on streptavidin-coated beads, followed by elution of the desired nonbiotinylated strand for GTP aptamer selection (Step 4 above).

Properties of RNA Polymerase 24-3

Needless to say—but I will—enzymologists and RNA aficionados will undoubtedly be interested in musing over the kinetic and fidelity properties of RNA polymerase 24-3.

The rate of 24-3 polymerase catalyzed addition to a template-bound primer was measured using an 11-nt template that is cited extensively in the literature to evaluate various ribozymes. It was found that the average rate of primer extension by 24-3 is 1.2 nt/min, which is ∼100-fold faster than that of the starting ribozyme polymerase randomly mutagenized for in vitro selection.

The NTP incorporation fidelities of the starting and 24-3 ribozyme polymerases on this 11-nt test template, at comparable yields of product, are 96.6% and 92.0%, respectively. Horning & Joyce noted that the higher error rate of 24-3 is due primarily to an increased tendency for G•U wobble pairing.
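To get a feel for what those per-nucleotide fidelities mean for longer products, here is a minimal back-of-the-envelope sketch in Python. It assumes, as a simplification that is mine and not Horning & Joyce's, that every incorporation errs independently and with the same probability, so the chance of an error-free product is just the fidelity raised to the number of nucleotides added. The extension lengths (10, 30 and 60 nt) are illustrative values I picked, not measurements from the paper.

```python
def error_free_fraction(fidelity: float, n_added: int) -> float:
    """Fraction of products with no misincorporation after n_added nucleotides.

    Assumes independent, position-uniform errors (a simplification).
    """
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must be between 0 and 1")
    if n_added < 0:
        raise ValueError("n_added must be non-negative")
    return fidelity ** n_added


# Per-nucleotide fidelities quoted in the text for the 11-nt test template.
FIDELITY = {"starting ribozyme": 0.966, "24-3": 0.920}

# Illustrative extension lengths (assumed, not taken from the paper).
ILLUSTRATIVE_LENGTHS = (10, 30, 60)

for name, fidelity in FIDELITY.items():
    for n in ILLUSTRATIVE_LENGTHS:
        fraction = error_free_fraction(fidelity, n)
        print(f"{name:>17}: {n:>2} nt added -> {fraction:.1%} error-free")
```

Under these assumptions, the small-looking gap between 96.6% and 92.0% compounds quickly as the product gets longer, which is one way to appreciate why fidelity matters so much for copying long RNAs.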

Phenylalanyl tRNA. Taken from Horning & Joyce, Proc. Natl. Acad. Sci., 2016

Other longer RNA templates having various base compositions or intramolecular structures were also studied, with the stated “final test of polymerase generality” being use of 24-3 to synthesize yeast phenylalanyl tRNA from a 15-nt primer (in red, at right). The authors humorously describe the results as follows:

“Despite the stable and complex structure of the template, full-length tRNA was obtained in 0.07% yield after 72 h. This RNA product is close to the limit of what can be achieved with the polymerase, but is likely the first time a tRNA molecule has been synthesized by a ribozyme since the end of the RNA world, nearly four billion years ago.”

Exponential Amplification of RNA

PCR is the most widely used method for amplifying nucleic acids, and involves repeated cycles of heat denaturation and primer extension. The 24-3 RNA polymerase was used to carry out PCR-like amplification, but in an all-RNA system (named riboPCR by Horning & Joyce) using A, G, C, and U triphosphates and a 24-nt RNA template composed of two 10-nt primer-binding sites flanking the sequence AGAG. Somewhat special conditions were employed:

  • The concentration of Mg²⁺ was reduced to minimize spontaneous RNA cleavage
  • PEG8000 was used as a “molecular crowding” agent to improve ribozyme activity at the reduced Mg²⁺ concentration
  • Tetrapropylammonium chloride was added to lower the melting temperature of the duplex RNA

Under these conditions, 1 nM of the 24-nt RNA template was driven through >40 repeated thermal cycles, resulting in 98 nM newly synthesized template and 106 nM of its complement, corresponding to 100-fold amplification. Sequencing of the amplified products revealed that the central AGAG sequence was largely preserved, albeit with a propensity to mutate the third position from A to G, reflecting the low barrier to wobble pairing.
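For those who like to check the arithmetic, the end-point numbers above can be turned into a fold amplification and an average per-cycle gain with a few lines of Python. This is only a sketch: the text says ">40" cycles, so the value of 40 used here is my assumption, and the function names are mine.

```python
def fold_amplification(final_nM: float, initial_nM: float) -> float:
    """Ratio of final to initial concentration."""
    if initial_nM <= 0 or final_nM <= 0:
        raise ValueError("concentrations must be positive")
    return final_nM / initial_nM


def mean_per_cycle_gain(fold: float, cycles: int) -> float:
    """Geometric-mean gain per cycle that would give `fold` after `cycles`."""
    if fold <= 0 or cycles <= 0:
        raise ValueError("fold and cycles must be positive")
    return fold ** (1.0 / cycles)


INITIAL_TEMPLATE_NM = 1.0    # from the text
FINAL_TEMPLATE_NM = 98.0     # from the text
FINAL_COMPLEMENT_NM = 106.0  # from the text
ASSUMED_CYCLES = 40          # assumption: the text only says ">40"

for label, final in (("template", FINAL_TEMPLATE_NM),
                     ("complement", FINAL_COMPLEMENT_NM)):
    fold = fold_amplification(final, INITIAL_TEMPLATE_NM)
    gain = mean_per_cycle_gain(fold, ASSUMED_CYCLES)
    print(f"{label}: {fold:.0f}-fold overall, ~{gain:.3f}-fold per cycle on average")
```

Because the true cycle count was above 40, this average is, if anything, an overestimate. It is also an end-point average, so it cannot show whether amplification slowed down in later cycles.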

Amplification of a 20-nt template (without the central insert) was monitored in real time, using FRET from fluorescently labeled primers, and input template concentrations ranging from 10 nM to 1 pM. The resulting amplification profiles shown in the paper are typical for real-time PCR, shifted by a constant number of cycles per log-change in starting template concentration. A plot of cycle-to-threshold vs. logarithm of template concentration, also shown in the paper, was linear across the entire range of dilutions, indicating exponential amplification of the template RNA with a per-cycle amplification efficiency of 1.3-fold.
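The link between the slope of such a cycle-to-threshold plot and the per-cycle efficiency can be sketched in Python. The cycle-to-threshold values below are synthetic: I generate them from an assumed efficiency of 1.3 and a hypothetical detection threshold, simply to show that a straight-line fit recovers the efficiency. They are not data from the paper.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_from_slope(slope: float) -> float:
    """Per-cycle fold gain implied by the slope of Ct vs log10(concentration)."""
    return 10 ** (-1.0 / slope)


ASSUMED_EFFICIENCY = 1.3   # per-cycle gain quoted in the text
THRESHOLD_NM = 50.0        # hypothetical detection threshold
START_NM = [10.0, 1.0, 0.1, 0.01, 0.001]  # 10 nM down to 1 pM, as in the text

# Synthetic Ct: cycles needed to grow from the start concentration to threshold.
ct_values = [math.log(THRESHOLD_NM / c) / math.log(ASSUMED_EFFICIENCY)
             for c in START_NM]

slope, intercept = fit_line([math.log10(c) for c in START_NM], ct_values)
recovered = efficiency_from_slope(slope)
print(f"slope = {slope:.2f} cycles per 10-fold dilution")
print(f"recovered per-cycle efficiency = {recovered:.2f}-fold")
```

The point of the exercise is that a perfectly doubling reaction would shift by about 3.3 cycles per 10-fold dilution, whereas a lower efficiency stretches that spacing out, so the slope alone tells you how far from doubling the reaction is.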

Implications for the Ancient RNA World

It would be an injustice to Horning & Joyce if I tried to paraphrase their concluding discussion of this investigation, so here is what they say:

“The vestiges of the late RNA world appear to be shared by all extant life on Earth, most notably in the catalytic center of the ribosome, but most features of RNA-based life likely were lost in the Archaean era. Whatever forms of RNA life existed, they must have had the ability to replicate genetic information and express it as functional molecules. The 24-3 polymerase is the first known ribozyme that is able to amplify RNA and to synthesize complex functional RNAs. To achieve fully autonomous RNA replication, these two activities must be combined and further improved to provide a polymerase ribozyme that can replicate itself and other ribozymes of similar complexity. Such a system could, under appropriate conditions, be capable of self-sustained Darwinian evolution and would constitute a synthetic form of RNA life.”

Applications for Today’s World of Biotechnology

The aforementioned report by Horning & Joyce has received wide acclaim in the scientific press and world-wide public media as supporting the existence of a prebiotic RNA World, billions of years ago, from which life on Earth evolved.

While the academic part of my brain, if you will, fully appreciates the significance of these new insights on “living” RNA eons ago, the technical applications part of my brain is more piqued by possible practical uses of all-RNA copying or all-RNA riboPCR.

I, for one, plan to muse over possible applications of such all-RNA systems in today’s world of biotechnology, and hope that you do too, and are willing to share any ideas as comments here.

Virtual Reality for Graphene Nanopores and Space Station Sequencing 

  • Simulated Sequencing Takes Virtual Reality Way Beyond Games  
  • In Silico Simulations Suggest Possible 99.99% Accuracy for Graphene Nanopores 
  • minION Nanopore Sequencer is Sent to the International Space Station

Prelude

Taken from oculusvr.com

This blog is mostly about an international team of researchers who are using Virtual Reality (VR)—in the form of computational modeling—to simulate a new approach to DNA sequencing using nanopores made out of graphene. While VR is a hot trend in all sorts of so-called immersion media, such as those offered by Oculus (that was acquired by Facebook for $2 billion in 2014), computation-based VR has been used by scientists for simulating molecular interactions for a relatively long time. However, extending molecular simulations to complex (aka many-atom) systems like nanopores and DNA has had to wait for bigger, faster, cheaper computing.

In this blog, I’ll also discuss the recent launch, on a self-landing rocket operated by SpaceX (co-founded by uber-famous multi-billionaire entrepreneur Elon Musk), of a commercially available nanopore sequencer for the first ever DNA sequencing in space. It’s hard for me to even imagine what seemingly incongruent mix of topics could be more intriguing than these. As the now trendy saying goes, you can’t make this stuff up. But I digress…

Lift off! Taken from am1070theanswer.com. Self-landing! Taken from indianexpress.com

Nanopore Sequencing 

From an earlier blog you’ll know that I’m a huge fan of tiny nanopores for sequencing, which is a 20+ year old concept, as depicted below from a seminal patent wherein DNA was envisaged as moving through a pore-in-lipid bilayer, leading to base-dependent transient blockage of ionic current from which sequence is determined.

Taken from nature.com
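To make the idea of reading sequence from base-dependent current blockages a bit more concrete, here is a deliberately toy sketch in Python. Everything numeric in it is hypothetical: the blockade levels assigned to A, C, G and T, the open-pore threshold, and the example trace are invented for illustration, and real nanopore signals are far messier than this (bases do not arrive neatly separated by open-pore gaps).

```python
from typing import Dict, List

# HYPOTHETICAL residual-current levels (fraction of open-pore current) per base.
ASSUMED_LEVELS: Dict[str, float] = {"A": 0.40, "C": 0.55, "G": 0.25, "T": 0.70}
OPEN_PORE = 1.0         # normalized current with nothing in the pore
OPEN_THRESHOLD = 0.85   # hypothetical cutoff: below this, the pore is blocked


def call_bases(trace: List[float]) -> str:
    """Split a current trace into blockade events and call each event
    as the base whose assumed level is closest to the event's mean current."""
    calls = []
    event: List[float] = []
    for sample in trace + [OPEN_PORE]:  # sentinel closes a trailing event
        if sample < OPEN_THRESHOLD:
            event.append(sample)
        elif event:
            mean = sum(event) / len(event)
            calls.append(min(ASSUMED_LEVELS,
                             key=lambda base: abs(ASSUMED_LEVELS[base] - mean)))
            event = []
    return "".join(calls)


# Invented trace: four blockade events separated by open-pore samples.
toy_trace = [1.0, 0.26, 0.24, 1.0, 0.41, 0.39, 0.40, 1.0,
             0.69, 0.71, 1.0, 0.56, 1.0]
print(call_bases(toy_trace))  # GATC
```

The sketch only captures the logic of the original concept: each base perturbs the current in its own way, so a time series of blockages can, in principle, be decoded into a sequence.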

After two decades, this prophetic concept of nanopore sequencing has recently been realized and commercialized by Oxford Nanopore Technologies (ONT) using “bionanopore” technology. Comparing the images above and below, you’ll see that bionanopores are, in many respects, quite similar to the first described nanopores, wherein a pore-forming protein, α-hemolysin (gray), is embedded in a lipid bilayer (blue). On the other hand, there is an attached DNA-processive enzyme, φ29 DNA polymerase (brown), that feeds in the single strand of DNA for sequencing; details may be found elsewhere.

An alternative strategy for nanopore sequencing is to replace this type of bionanopore (composed of biological macromolecules) with a pore constructed of non-biological materials, notably silicon-based semiconductors that enable electrical signal generation and data processing. This would-be evolution of nanopore sequencing from biological constructs to various types of solid-state materials can be read about elsewhere.

Whither Goest Graphene?

It seems the next step in the progression of nanopore technology is nanopores made of graphene, the trivial name for a very special form of carbon that was long known but exceedingly difficult to make. In fact, the process is so difficult that Andre Geim and Kostya Novoselov at The University of Manchester were awarded the 2010 Nobel Prize in Physics for their work enabling the production and characterization of graphene.

Geim and Novoselov. Taken from rsc.org 

Graphene is a two-dimensional array or “sheet” of carbon atoms that is usually depicted by the ball-and-stick model (pictured at the left below) as a one-atom-thick sheet of otherwise infinite dimensions. Since nothing is infinite in the real world, sheets of graphene have edges to which hydrogen is bonded, although this is ignored here for simplicity. This carbon-carbon bonding with carbon-hydrogen edges is akin to that in polycyclic aromatic hydrocarbons familiar to readers who are chemists.

Taken from 3dprint.com  

Taken from Bayley (2010) in nature.com

Because of graphene’s unique electrical properties and single-atom-thin structure, the basic idea is that a nanometer-size hole in graphene might be made—somehow—to allow DNA and ions to pass through and thus generate electrical signals—somehow—that are accurately deciphered—somehow—into DNA sequence. Oh, and let’s not forget that this sequence information must differentiate—somehow—3’->5’ from 5’->3’ directional pass through. All these “somehows” are meant to indicate that it’s far easier to imagine the concept of graphene nanopore sequencing, as fancifully shown below, than to actually do it.

Taken from Merchant et al. Nano Letters (2010)

The most daunting practical problem deals with how to “drill” tiny holes in graphene. One approach has been to use controlled electron-beam exposure in a transmission electron microscope. Initial demonstration of this approach was published in 2010 by Merchant et al. in Nano Letters in a paper titled DNA Translocation through Graphene Nanopores, from which the schematic at left is taken.

In this device, a few-atoms-layer piece of graphene (1-5 nm thick) having an ~10 nm hole is suspended over a 1 μm diameter hole in a 40 nm thick silicon nitride (SiN) membrane, which is in turn suspended over an ~50 × 50 μm² aperture in a silicon chip coated with a 5 μm silicon oxide (SiO₂) layer. A bias voltage (VB) applied between the reservoirs drives DNA through the nanopore. Although DNA could be detected, the graphene pore size was too big to allow sequence detection.

Taken from Chang et al. Nano Letters (2010)

Similar studies by Schneider et al. were also reported in Nano Letters in 2010, which appears to be a watershed year for this journal inasmuch as another noteworthy nano-detection scheme for DNA was described therein by Chang et al., but with an important new feature. Namely, with gold electrodes (in yellow, below) separated by only 2 nm and conjugated to dC, a derivative of dG (blue balls) was apparently able to H-bond (magenta) to the electrode-bound dC, based on dC-dG complementarity, and this binding was detected as electron tunneling signals. This transient, base pair-specific H-bonding is what has now been further investigated by others, albeit in the following form of Virtual Reality.

Virtual Reality Nanopore Sequencing

In contrast to the above “real” experiments, others have simulated reality using mathematical calculations based on theoretical chemistry, which is Virtual Reality that has physical significance well beyond simply playing games. Mathematical modeling or computational simulations are phrases generally used to describe these so-called in silico “experiments” that serve as indications of what could be done, in theory, if this Virtual Reality is actually translatable to the real world. But I digress…

An international team of investigators in the U.S., Germany, and the Netherlands has recently reported studies titled Nucleobase-functionalized graphene nanoribbons for accurate high-speed DNA sequencing. Although this article is a dreaded “pay-to-read” article, there is a brief news piece about it at the website for the U.S. National Institute of Standards and Technology (NIST) where some of this work was conducted.

Taken from nist.gov

As is evident from the schematic shown below, these investigators borrowed from the aforementioned types of publications to imagine a graphene nanopore having its internal edges functionalized with nucleobase moieties that could potentially H-bond with DNA bases in a sequence specific manner—à la Chang et al. Under appropriate conditions, this could provide the basis for sequencing via measurement of induced current fluctuations.

More specifically, they imagined a sheet (aka ribbon) of graphene 4.5 x 5.5 nm with several nucleobase moieties attached to a 2.5 nm nanopore. In animated simulations (which are linked at the NIST website), you can watch how this sensing device would perform at room temperature in water with attached cytosine H-bonding to detect G in DNA.

When you watch this simulation, you’ll immediately notice how “wiggly” DNA is due to random motions of its constituent groups and atoms. You’ll also see detection of each translocating (i.e. passing) G as an increasing signal being recorded in real time. While time as a parameter in this simulation is real, the simulation itself is not real, but rather virtual reality based on state-of-the-art theoretical calculations by a computer.

With that caveat in mind, the performance was said to be 90% accurate (due to missed bases rather than wrongly detecting a base) at a rate of 66 million bases per second, which to me is mind-bogglingly fast. Moreover, if this device could be fabricated as four sequentially-located graphene pores each functionalized with either C, G, A, or T, the researchers estimate that “proofreading” would increase accuracy to 99.99%, as required for sequencing the human genome.
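The jump from 90% to 99.99% is consistent with a simple independence argument, which can be sketched in Python. I am assuming here that a base is lost only if every one of four independent measurements misses it; the researchers’ own estimate may rest on a more detailed model. The genome size used for the timing estimate (about 3.1 billion bases) is also my round-number assumption, not a figure from the study.

```python
def combined_accuracy(single_pass_accuracy: float, passes: int) -> float:
    """Accuracy after `passes` independent reads, if a base is missed
    only when every pass misses it (errors are misses, not miscalls)."""
    if not 0.0 <= single_pass_accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if passes < 1:
        raise ValueError("passes must be at least 1")
    miss_probability = 1.0 - single_pass_accuracy
    return 1.0 - miss_probability ** passes


SINGLE_PASS_ACCURACY = 0.90   # from the text
BASES_PER_SECOND = 66e6       # from the text
ASSUMED_GENOME_BASES = 3.1e9  # assumption: approximate human genome size

for passes in range(1, 5):
    accuracy = combined_accuracy(SINGLE_PASS_ACCURACY, passes)
    print(f"{passes} pass(es): {accuracy:.4%} accurate")

seconds_per_pass = ASSUMED_GENOME_BASES / BASES_PER_SECOND
print(f"~{seconds_per_pass:.0f} s to read {ASSUMED_GENOME_BASES:.1e} bases once")
```

Seen this way, each additional independent read cuts the miss rate by another factor of ten, and the simulated read rate would get through a genome-sized stretch of DNA in well under a minute per pass, at least on paper.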

Virtual reality is, well, not reality. And sometimes dreams and reality go in opposite directions. However, if the above imaginary device and simulations were to become reality, nanopore sequencing would indeed be advanced dramatically from today’s performance.

ONT’s minION Sequencer in Space

In transitioning back to reality, it’s almost unbelievable to me that ONT’s minION nanopore sequencer (which I’ve blogged about before) was sent to the International Space Station (ISS) in April 2016 to carry out the first ever sequencing of DNA in space. If that weren’t enough “buzz”, then the fact that this was achieved by uber-famous Elon Musk’s SpaceX company made it way more so, along with much ado in successfully landing the rocket’s first stage on a relatively tiny platform in the ocean. This is all amazing stuff. And to think that not so long ago, we were thinking how great it would be if self-landing rockets were really possible, not just a fun concept in video games and sci-fi movies! Maybe virtual reality isn’t that far from becoming reality, but I digress…

The first aim of putting the minION nanopore into space is to demonstrate the feasibility of nanopore sequencing in microgravity. That being done would then allow use of the minION to rapidly sequence astronaut samples in the ISS to diagnose, for example, an infectious disease or other health issue.

Interested readers can peruse elsewhere much more about this historic milestone, as well as watch and listen to a short (but very exciting) video titled Space Station Live: Big DNA Science in a Small Package. The video was posted on Twitter on July 21st and features the minION device (aka The Biomolecular Sequencer).

NASA minION flight hardware for the ISS experiments packaged for shipment to the ISS. Credit: NASA/Sarah Castro. Taken from spaceref.com

I look forward to learning the results of the minION’s research in space. I hope that it’s “mission accomplished!” and of great use in years to come. BTW, if you’re a “space buff” like me, you can watch and listen to ISS-Mission Control live streaming 24-7 at this website.

As usual, your thoughts about this blog are welcomed as comments.