Phoenix in Mythology and Sequencing

  • Like a Phoenix, Helicos Sequencing is Being Reborn
  • Direct Genomics in China to Launch the Genocare Clinical Sequencer
  • SeqLL in the USA to Launch Benchtop tSMS Sequencer

A phoenix as depicted by F.J. Bertuch (1747–1822). Taken from Wikipedia.org

In ancient Greek mythology, a phoenix is a bird that is cyclically regenerated or reborn by arising from the ashes of its predecessor, which dies in a show of flames and combustion. In contrast to a phoenix, modern biotech methods generally “die” in utility by being displaced with faster, better, and/or cheaper methods rather than undergoing “rebirth” in the context of a new application. However, a method developed by a company named Helicos (scarily close to Helios, who is associated with a phoenix) may prove to be a rare exception. Perhaps this is destiny, but I digress…

Helicos Sequencing

Successful Sanger sequencing of a human genome in the early 2000s spawned numerous efforts to develop faster, better, and/or cheaper methodology to enable genomic analysis on a routine basis. Among the early contenders was Helicos BioSciences, which was founded in 2003 by several principals including the then (and still) uber-famous Stephen Quake.

Helicos sequencing technology, which is depicted below and outlined elsewhere, was especially attractive because it was “true” single-molecule sequencing (i.e. sample prep did not require prior PCR or other amplification, thus greatly simplifying the workflow). Moreover, the technology uniquely allowed direct RNA sequencing, thus obviating the need to first convert RNA into cDNA.

Main steps for primer(P)-based, single-color (Cy3 dye) Helicos sequencing, in this example using two passes. Taken from Harris et al. Science (2008)

3’-Unblocked reversible terminator. Taken from Chen et al. (2013)

Details for how this sequencing-by-synthesis occurs can be read in various proof-of-concept publications. However, it’s worth noting here that the 3’-unblocked reversible terminator nucleotide triphosphate monomers have a cleavable linker attached to a detectable dye. Helicos referred to these as “Virtual Terminator” nucleotides since they are efficiently incorporated by a polymerase yet block incorporation of a second nucleotide on a homopolymer template.
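
For readers who like to see the logic spelled out, here is a minimal Python sketch of single-color, cyclic sequencing-by-synthesis with terminator nucleotides. It is only a toy model under stated assumptions: the flow order "CTAG", the function name, and the idea of treating the template as a simple string read in the direction of copying are my own illustrative choices, not details taken from the Helicos publications. The point it illustrates is that a terminator allows at most one incorporation per cycle, so a homopolymer stretch is read one base at a time over several rounds.

```python
def sequence_by_synthesis(template, flow_order="CTAG", n_cycles=40):
    """Toy model of single-color sequencing-by-synthesis on one strand.

    template   : template bases in the order the polymerase copies them
    flow_order : hypothetical order in which the four labeled nucleotides
                 are offered, one nucleotide type per cycle
    n_cycles   : number of single-nucleotide cycles to run
    """
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    read = []
    pos = 0  # index of the next template base to be copied
    for cycle in range(n_cycles):
        if pos >= len(template):
            break  # the whole template has been copied
        nucleotide = flow_order[cycle % len(flow_order)]
        # A signal is seen only if the offered nucleotide pairs with the
        # next template base. The terminator allows at most ONE
        # incorporation per cycle, even inside a homopolymer run.
        if complement[template[pos]] == nucleotide:
            read.append(nucleotide)
            pos += 1
    return "".join(read)


if __name__ == "__main__":
    # The TTT homopolymer needs three separate "A" cycles to be read.
    print(sequence_by_synthesis("TTTGCA"))
```

Running fewer cycles simply yields a shorter read, which is one way to picture why the number of cycles bounds read length in this kind of chemistry.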

So, with these methodological advantages going for it, why did Helicos file for bankruptcy in 2012? Press coverage at that time stated ‘rough financial sledding and tough competition from rival next-generation sequencing companies.’ In my humble opinion, this lack of commercial success was primarily due to the HeliScope Genetic Analysis System (pictured below) being way too big (think upright freezer-refrigerator), far too expensive ($1,350,000), and its ~35-base reads too short on performance—pun intended.

Two Phoenix-Like Versions of Helicos Sequencing

Fast forwarding about five years from the 2012 bankruptcy filing by Helicos brings us to recent reports of two independent efforts to bring back Helicos sequencing in commercially viable formats and contexts: think Phoenix rising from the ashes.

Jiankui He and the GenoCare sequencer (credit Xinjie Tian). Taken from Cyranoski Nature Biotechnology (2016)

The first of these is led by Jiankui He, Founder/CEO of Direct Genomics in Shenzhen, Guangdong, China, as well as Associate Professor at South University of Science and Technology of China in Shenzhen. He, coincidentally, was a postdoc with Helicos cofounder Stephen Quake, who is reported to lead the scientific advisory board for this new company.

The company’s website homepage states the following:

“Direct Genomics is providing physicians with the first single molecule sequencer built exclusively for the clinic. The technology simplifies genome sequencing by reading individual DNA and RNA molecules directly from patient’s blood or tissue samples, which delivers significant improvement in cost and speed. Together with clinicians, Direct Genomics is making genetics an affordable part of everyday patient care.”

Perusal of scant technical information on the company’s website suggests to me that a smaller sized, TIRF-optics-enabled instrument running Helicos-type sequencing has been developed. A story about Direct Genomics by David Cyranoski in Nature Biotechnology states that $100 “clinical sequencing” is being targeted, with a blood-draw-to-report turnaround time of 20 hours. A very recent publication I found provides details for resequencing the Escherichia coli genome by the Direct Genomics platform named GenoCare.

The company’s website lists the following clinical applications:

  • Non-invasive prenatal testing (NIPT)
  • Tumor diagnosis
  • Early-stage cancer prediction
  • Pre-implantation genetic diagnosis (PGD)

The second Phoenix-like rebirth of Helicos sequencing has been developed by SeqLL, which was co-founded in 2013 by William St. Laurent and Daniel Jones, who previously held various technical positions at Helicos. Statements and a video on SeqLL’s website indicate to me that the sequencing technology is essentially that originally developed and patented by Helicos, which is still trademarked as True Single Molecule Sequencing (tSMS™).

William St. Laurent / Daniel Jones. Taken from seqll.com

SeqLL has been operating as a tSMS™ service provider, but in October 2016 announced the launch of the tSMS™ System Early Access Program giving researchers access to its new benchtop system “designed to deliver unparalleled quantitative RNA and specialty DNA sequencing results to both academic and industry research partners.” I should add that a big, strong bench is needed given that the physical specs are 30 x 30 x 60 inches and 1,000 pounds! Nevertheless, SeqLL recently announced an SBIR grant for improving its direct RNA sequencing technology, which I think could prove to be a driver for adoption.

In conclusion, I think it’s very interesting to see Helicos sequencing coming back to life, if you will, in not one but two different commercial contexts, both of which will hopefully be successful. This is despite current ‘tough competition from rival next-generation sequencing companies,’ as observed in the 2012 bankruptcy story about Helicos mentioned above. First and foremost among that competition is Oxford Nanopore, which I’ve blogged about previously, and which offers single-molecule sequencing that seems to me to be faster, better, and cheaper for both DNA and RNA, directly.

As usual, your comments are welcomed.

Postscript

After this blog was written, it was reported in GenomeWeb that Direct Genomics plans to deliver 50 instruments this year to SinoTech Genomics, a startup based in Shanghai that offers both clinical and research sequencing services. Direct Genomics CEO Jiankui He is quoted as saying that ‘SinoTech Genomics [is] committed to ultimately purchasing 700 GenoCare platforms,’ and that Direct Genomics ‘has the capacity of producing around 1,000 GenoCare instruments per year,’ which would be very impressive based on past operational experience with manufacturing Sanger sequencers at ABI.

The piece went on to report that Direct Genomics also ‘aims to launch GenoCare in the US in September.’ Regarding what’s inside the box, so to speak, ABI veteran Bill Efcavitch, who previously served as chief technology officer of Helicos, is quoted as saying that ‘the main difference between the former Helicos technology and the GenoCare platform is in the hardware. It’s completely different engineering.’ He added, however, that it still uses Helicos’ virtual terminator chemistry.

Ocean ‘Dandruff’ DNA to Better Study Marine Biology

  • DNA Barcoding for all Organisms has Numerous Applications
  • DNA Barcodes from Water Samples Greatly Aide Marine Biologists
  • Aquatic Environmental DNA (eDNA) Proves to be Informative ‘Dandruff’

Human DNA identity analysis is now a commonplace methodology that’s frequently featured in newspaper stories, TV crime series, or “whodunit” movies. The same principle (i.e. using a characteristic DNA pattern or signature) applies to identification of all animals, birds, insects, and microbes. Actually, DNA barcoding extends to any organism, whether it is alive or has been dead for hundreds of thousands of years (so long as it’s preserved by fossilization).

Taken from gajitz.com

Marine biologists face a serious challenge in accounting for the very diverse forms of marine life that exist in a mind-bogglingly huge volume of water. Consequently, it’s not surprising that analysis of water-borne, marine DNA barcodes (as proxies for going to and counting fish) is rapidly trending in utility and importance. Known formally as environmental DNA (eDNA), the aquatic version has been humorously referred to as ocean ‘dandruff’ by Christopher Jerde of the University of Nevada in Reno (which, ironically, is landlocked and distant from any ocean). But I digress. Before diving further (pun intended) into ocean dandruff, let’s briefly review the background of DNA barcoding.

DNA Barcodes 101

Prof. Paul Hebert. Taken from uoguelph.ca

In 2003, Prof. Paul Hebert and coworkers in the Department of Zoology at the University of Guelph in Canada published a seminal study titled Barcoding animal life: cytochrome c oxidase subunit 1 (CO1) divergences among closely related species that fundamentally changed the field of taxonomy. In a nutshell, Hebert’s team showed it was feasible to classify millions of species based only on the DNA sequence of the mitochondrial gene CO1. In the intervening, relatively short amount of time, there have been thousands of publications dealing with applications and extensions of this concept, which is now recognized to be very powerful and promising albeit with some limitations.

Typically, DNA barcodes are identified by sequencing after PCR amplification of one or more specific genetic loci such as CO1. Following proof that a DNA barcode can differentiate the species of interest, single- or multiplex quantitative PCR (qPCR) can be used to enumerate relative amounts of sample from the field.
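
As a rough illustration of how qPCR turns a fluorescence threshold cycle (Ct) into an amount of target DNA, here is a minimal Python sketch of the commonly used standard-curve approach, in which Ct is fitted as a straight line against log10 of known template quantities. The function names and the example standards are my own assumptions for illustration; the numbers are made up and are not data from any study mentioned here.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting quantity."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency of 1.0 means the product doubles every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical ten-fold dilution series (copies per reaction) and Ct values.
    standards = [10, 100, 1000, 10000, 100000]
    cts = [36.68, 33.36, 30.04, 26.72, 23.40]
    slope, intercept = fit_standard_curve(standards, cts)
    print(round(slope, 2), round(intercept, 2))
    print(round(quantity_from_ct(28.0, slope, intercept)))
```

A slope near -3.32 corresponds to near-perfect doubling per cycle, which is why that value is often used as a sanity check on an assay.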

The advent of high-throughput sequencing technologies applicable to complex mixtures of individually tagged samples then gave rise to “metabarcoding,” about which interested readers can consult many publications for specific topics.
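
To make “individually tagged samples” a bit more concrete, here is a minimal Python sketch of demultiplexing, i.e. sorting pooled reads back into their samples by a short tag at the start of each read. The tags, sample names, and the assumptions that all tags have the same length and must match exactly are hypothetical simplifications of what real pipelines do.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample):
    """Group reads by a leading sample tag (exact match, equal-length tags).

    Reads whose tag is not recognized are collected under "unassigned".
    The tag itself is trimmed off before the read is stored.
    """
    tag_len = len(next(iter(tag_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


if __name__ == "__main__":
    # Hypothetical tags and reads, for illustration only.
    tags = {"ACGT": "site_A", "TGCA": "site_B"}
    pooled = ["ACGTGGGG", "TGCACCCC", "ACGTTTTT", "NNNNAAAA"]
    print(demultiplex(pooled, tags))
```

After this step, each sample’s reads can be compared against a barcode reference database to list the taxa present.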

Craig Venter steers his research yacht, Sorcerer II, under the Sydney Harbour Bridge in his quest to collect microbes from the world’s waters. Photo: Dallas Kilponen. Taken from smh.com.au

BTW, among the many pioneering scientific ventures by uber-famous Craig Venter is his Global Ocean Sampling Expedition aboard his research yacht, Sorcerer II. The expedition is a quest to unlock the secrets of the oceans by sampling, sequencing and metabarcoding DNA of all (or most) microorganisms living in these waters.

Lest you think this was a well-intended but unproductive journey—some say junket—by Venter and coworkers, here’s a link to peruse 16 resultant publications that I found by searching PubMed. To watch and listen to Venter talk about this work, you can click here for an educational and entertaining—as usual with Venter—TED Talk on Sampling the Ocean’s DNA that’s had over 550,000 views!

Ocean ‘Dandruff’

Now that we’ve covered the basics of DNA barcoding and metabarcoding, let’s turn back to ocean dandruff. Dandruff, simply put, is dead skin cells. Using dandruff as an intended witty metaphor for ocean eDNA is a bit misleading as marine eDNA is comprised of a complex mixture of cellular matter from scales, feces, decomposing tissue, etc. of fish and all other present or past sea creatures. Consequently, the design and specificity of primers for PCR is of paramount importance for obtaining—let alone interpreting—DNA barcodes based on fragment size or sequence.

As reported by Miya et al., monitoring the occurrence of fish species-specific eDNA PCR fragments (~70–300 bp) has traditionally used conventional electrophoretic gel separation and detection. More recently, qPCR using fluorogenic probes has been employed owing to the method’s sensitivity, specificity and potential to quantify the target DNA. For example, it has been possible to accurately estimate the biomass of common carp in a natural freshwater lagoon by relating qPCR-measured eDNA concentrations to biomass in aquaria and experimental ponds.

Miya et al. also describe the development of a set of PCR primers for metabarcoding mitochondrial DNA of 880 species of fish. They sampled eDNA from four tanks with known species compositions, prepared dual-indexed libraries and performed paired-end sequencing. Out of the 180 marine fish species contained in the four tanks, they detected 168 species (93.3%) distributed across 59 families and 123 genera. That’s quite an impressive accomplishment.

Ocean Dandruff Case Studies

Since there are so many fish-related applications of DNA barcodes, I’ve selected several recent examples that are indicative of the utility of ocean ‘dandruff’—and are quite interesting, in my opinion. The first case in point exemplifies how eDNA can be used to deal with rare and endangered species, which are either very hard to find or can be dangerously distressed by catching to obtain samples.

Green Sturgeon: Bergman et al. report that a decline in abundance of North American Green Sturgeon located in California’s Central Valley led to its listing as Threatened under the Federal Endangered Species Act in 2006. While visual surveys of spawning by these Green Sturgeon are effective at monitoring fish densities in concentrated pool habitats, results do not scale well (pun intended). By contrast, eDNA provides a relatively quick, inexpensive tool to efficiently identify and monitor Green Sturgeon DNA.

Taken from mthsecology.wikispaces.com

These investigators concluded that follow-on work based on this first-ever eDNA study of Green Sturgeon has the potential to provide better knowledge of the spatial extent of Green Sturgeon spawning that could help identify previously unknown spawning habitats and discover factors influencing habitat usage, guiding future conservation efforts.

Monterey Bay—The second case study, by Port et al., involves taking stock of the marine mammals and fish in Monterey Bay using eDNA and, importantly, comparing the results obtained to those from traditional dive surveys.

In brief, this team of researchers from several universities and the Monterey Bay Aquarium Research Institute found that eDNA assessments picked up almost all the organisms scuba divers spied underwater—plus many more that human eyes missed. Here’s some detail on how they did this.

At each scuba survey location as well as at sites offshore, ~1 gallon of water was sampled several feet above the bottom. Four types of habitats were sampled: sea grass beds, Monterey Bay’s unique “Kelp Forest,” sandy areas and rocky reefs. Onshore, in a “clean” (DNA-free) lab, these water samples were filtered to collect cells containing eDNA for storage at −80 °C until eDNA extraction at a university clean lab. A vertebrate‐specific primer set targeting a small region of the mitochondrial DNA 12S rRNA gene was used for PCR followed by gel purification.

Researchers collecting water in Monterey Bay for eDNA analysis. Courtesy Jesse Port. Taken from mercurynews.com

After quantification, pooled amplicons (each having a sample index sequence) were paired-end sequenced on the Illumina MiSeq platform using a 20% PhiX spike‐in control to improve the quality of low‐diversity samples. The conclusions are worth quoting because—in my opinion—the findings represent a new era in marine biology based on nucleic acid analysis:

“We find spatial concordance between individual species’ eDNA and visual survey trends, and that eDNA is able to distinguish vertebrate community assemblages from habitats separated by as little as ~60 meters. eDNA reliably detected vertebrates with low false‐negative error rates (1/12 taxa) when compared to the surveys, and revealed cryptic species known to occupy the habitats but overlooked by visual methods. This study also presents an explicit accounting of false negatives and positives in metabarcoding data, which illustrate the influence of gene marker selection, replication, contamination, biases impacting eDNA count data and ecology of target species on eDNA detection rates in an open ecosystem.”

Restated more simply, eDNA analysis of the water picked up 11 of the 12 fish and marine mammals that the divers observed, and—importantly—identified 18 additional animals the divers missed! The efficiency and improvement offered by eDNA analysis compared to traditional seek-and-count methods has been echoed in an editorial I found by Hoffmann et al. titled, tongue-in-cheek, Aquatic biodiversity assessment for the lazy.
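
The bookkeeping behind this kind of comparison is simple set arithmetic, as sketched below in Python. The function name and the placeholder taxon labels are mine; only the counts (12 taxa seen by divers, 11 of them also detected by eDNA, and 18 more detected by eDNA alone) come from the summary above.

```python
def compare_surveys(visual_taxa, edna_taxa):
    """Compare taxa seen in visual surveys with taxa detected by eDNA."""
    visual, edna = set(visual_taxa), set(edna_taxa)
    missed_by_edna = visual - edna  # false negatives relative to divers
    return {
        "shared": visual & edna,
        "missed_by_edna": missed_by_edna,
        "edna_only": edna - visual,
        "false_negative_rate": len(missed_by_edna) / len(visual) if visual else 0.0,
    }


if __name__ == "__main__":
    # Placeholder labels standing in for real species names.
    visual = [f"taxon_{i}" for i in range(1, 13)]
    edna = [f"taxon_{i}" for i in range(1, 12)] + [f"extra_{i}" for i in range(1, 19)]
    summary = compare_surveys(visual, edna)
    print(len(summary["shared"]), len(summary["edna_only"]))
    print(round(summary["false_negative_rate"], 3))
```

Note that this treats the dive survey as the reference, so the “eDNA only” taxa are not automatically true detections; the quoted abstract makes the same point by accounting for false positives separately.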

Invasive Gobies—The third and final case study deals with detection of invasive, non-native fish to assess whether eDNA can provide a better advanced warning system for detecting these unwanted creatures and implementing eradication steps.

Gobies include invasive fish species that have colonized fresh and brackish waters in Europe and North America. One of them, the round goby (Neogobius melanostomus), pictured below, is among the worst invaders in Europe. Current methods to detect the presence of these gobies are labor-intensive and not very sensitive. Consequently, populations are usually detected only when they have reached high densities and when management or containment efforts are futile.

Taken from animal.memozee.com

To improve monitoring, Swiss and Canadian collaborators developed an assay based on the detection of eDNA in river water, without detecting any native fish species, which is obviously an important assay criterion. The eDNA assay requires less time, equipment, manpower, skills, and financial resources than conventional monitoring methods such as electrofishing, angling or diving. Samples can be taken by novices and the assay can be performed by any molecular biologist on a conventional PCR machine. Therefore, this assay enables environment managers to map invaded areas independently of fishermen’s reports and fish community monitoring.

I could go on and on with examples of utility and the many advantages provided by eDNA for marine biology, but I’m sure you get the picture. I hope that you agree with me that eDNA analysis is a very valuable type of trending nucleic acid-based methodology.

As usual, your thoughts or comments are welcomed.

Curiously Circular RNA (circRNA) Gets Curiouser

  • circRNA Molecules Have, Oddly, No Beginning or End
  • circRNA Are Now Recognized as Regulators of Gene Expression 
  • A Flurry of New Findings Indicate circRNA Are Also Templates for Synthesis of Proteins Having As Yet Unknown Functions

Electron micrograph of ~3,000-nt circRNA. Taken from Matsumoto et al. PNAS (1990).

About a year ago, my blog titled Curiously Circular RNA pointed out that circular RNA (circRNA) in animals are odd molecules in that, unlike the vast majority of other RNA in animals, circRNA have no structural beginning (5’) or end (3’). This very curious feature has, not surprisingly, stimulated considerable scientific interest in knowing more about these molecules, which were serendipitously discovered some 30 years ago.

Application of next-generation sequencing has revealed that circRNA are actually relatively abundant and evolutionarily conserved, which implicates biological importance rather than inconsequential mistakes during RNA splicing mechanisms. Some circRNA have been shown to have function—circRNA can hybridize to complementary microRNA (miRNA), and thus serve as a kind of ‘sponge’ that influences miRNA-based gene expression. Evidence for circRNA involvement in gene expression continues to grow, as there are now >700 items on “circRNA [and] sponges” in Google Scholar.

Very recently published lines of research (that I’ll outline in what follows) implicate circRNA as coding templates for proteins, which heretofore has been exclusively associated with messenger RNA (mRNA). Current dogma holds that translation of mRNA into protein requires recognition of the 7-methylguanylated (m7G) 5’-cap structure to start ribosome binding, while the 3’-poly(A) tail protects the mRNA molecule from enzymatic degradation and aids in stopping translation, as depicted below.

Taken from Shoemaker & Green Nature Structural & Molecular Biology (2012).

Start and stop structural elements characteristic of mRNA are obviously not present in circRNA, which are literally just circles of RNA. Consequently, finding proteins encoded by circRNA has stirred up controversy about whether such proteins are a new and fundamentally important aspect of genetics or just inconsequential biochemical mistakes.

Translation of circRNA in Fly Head Neurons

Fruit fly. Taken from turbosquid.com

Researchers at The Hebrew University of Jerusalem in Israel, in collaboration with a team at the Max Delbrück Center for Molecular Medicine in Berlin, Germany, recently reported in Molecular Cell the first compelling evidence that a subset of circRNA is translated in vivo. The study by Kadener & coworkers was carried out using the common fruit fly (Drosophila melanogaster), which is known to have a number of features that lend themselves to investigations of circRNA: (1) >2,500 fruit fly circular RNAs have been rigorously annotated, (2) these mostly derive from back-splicing (pictured below) of protein-coding genes, (3) hundreds of them are conserved across multiple Drosophila species, and (4) they exhibit commonalities with mammalian circRNA.

Direct back-splicing: a branch point in the 5’ intron attacks the splice donor of the 3’ intron. The 3’ splice donor then completes the back-splice by attacking the 5’ splice acceptor forming a circRNA. Taken from Jeck & Sharpless Nature Biotechnol (2014).

This study by Kadener & coworkers involves a plethora of technically complex experimental procedures and associated jargon, from which I’ve extracted what I believe to be some key points to share. After annotating the Drosophila circRNA open reading frames (cORFs), which, by definition, have the potential for translation, they searched for evidence of their translation utilizing previously published ribosome footprinting (RFP). This led to identification of 37 circRNAs with at least one specific RFP read, referred to as ribo-circRNAs.
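
To give a feel for what makes an ORF on a circle special, here is a minimal Python sketch that scans a circular sequence for ORFs and flags those that cross the back-splice junction (the point where the end of the sequence joins its start). This is my own simplified illustration, not the annotation procedure used in the study: the function names, the DNA alphabet, the minimum length, and the two-lap scan limit are all assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_at(seq, pos):
    """Return the codon starting at pos, wrapping around the circle."""
    n = len(seq)
    return "".join(seq[(pos + j) % n] for j in range(3))


def circular_orfs(seq, min_codons=2, max_laps=2):
    """Find ATG-initiated ORFs on a circular sequence.

    The junction sits between the last and first characters of seq.
    An ORF is reported only if a stop codon is reached within max_laps
    trips around the circle; endless frames are skipped in this sketch.
    """
    seq = seq.upper()
    n = len(seq)
    found = []
    for start in range(n):
        if codon_at(seq, start) != "ATG":
            continue
        codons = []
        for offset in range(0, max_laps * n, 3):
            codon = codon_at(seq, start + offset)
            if codon in STOP_CODONS:
                if len(codons) >= min_codons:
                    found.append({
                        "start": start,
                        "codons": codons,
                        # True if the ORF (stop included) passes the junction
                        "spans_junction": start + offset + 3 > n,
                    })
                break
            codons.append(codon)
    return found


if __name__ == "__main__":
    # The ATG sits at the very end; the ORF only exists on the circle.
    print(circular_orfs("CCCTAAGGGATG"))
```

In the example, a linear reading of the same sequence would find nothing downstream of the ATG, whereas the circular reading continues across the junction until it meets a stop codon. A junction-spanning peptide is what makes a hit attributable to the circRNA rather than to the linear mRNA.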

Taken from Jeck & Sharpless Nature Biotechnology (2014)

Several representative ribo-circRNAs were then constructed to each have (pictured below) a metallothionein (MT) promoter and V5 tag to facilitate translation and anti-V5 antibody-based detection of the expected protein after transfection into cells.

To determine whether circRNAs are translated in a more relevant tissue, they set up the RFP methodology in fly heads. A genetic locus named mbl that is known to produce a circRNA (circMbl3) at high abundance was selected for targeted mass spectrometry of MBL immunoprecipitated from fly heads. They utilized synthetic peptides to determine characteristic spectra for which to search in the fly head immunoprecipitate and found a consistent and very high confidence hit for a peptide that can only be produced by circMbl3.

Kadener & coworkers extended these fly head findings to mammalian mouse and rat systems, but the most interesting part of this study—in my opinion—dealt with what signals ribosome binding and translation in the absence of the 5’ cap structure present in mRNA. They demonstrated circRNA translation under conditions intended to block normal 5’ cap-dependent translation of mRNA, and concluded that “[untranslated regions] of ribo-circRNAs (cUTRs) allow cap-independent translation [and that] further research is necessary to uncover how these sequences promote translation.”

Remarkably, as you’ll now read, another group of investigators has apparently found how such promotion of circRNA translation can occur.

Translation of circRNA is Driven by N6-Methyladenosine (m6A)

The most abundant modification of RNA in eukaryotes is m6A, which has been recently shown by Li et al. to recruit binding proteins that collectively facilitate the translation of specifically targeted mRNAs (i.e. those “marked” with m6A) through interactions with the 40S and 60S ribosome subunit “machinery” that actually carries out translation. Contemporaneously, Yang et al. found that m6A likewise promotes efficient initiation of protein translation from circRNAs in human cells. They discovered that consensus m6A motifs are enriched in circRNAs, and a single m6A site is sufficient to drive translation initiation.
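
For the computationally curious, here is a minimal Python sketch of scanning an RNA for candidate m6A sites. It assumes the widely cited DRACH consensus (D = A, G or U; R = A or G; H = A, C or U), with the methylated adenosine in the middle; whether this exact motif definition was the one used by Yang et al. is my assumption, and the function name and example sequences are hypothetical. Because circRNA has no ends, the sketch can also look for motifs that wrap across the back-splice junction.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach_sites(rna, circular=False):
    """Return (index of the central A, motif) for each DRACH match.

    With circular=True the sequence is treated as a closed circle, so
    motifs that wrap past the end are found and reported with the
    position of the A taken modulo the sequence length.
    """
    rna = rna.upper().replace("T", "U")
    n = len(rna)
    # Four extra bases are enough for a 5-nt motif to wrap the junction.
    scanned = rna + rna[:4] if circular else rna
    sites = []
    for match in DRACH.finditer(scanned):
        start = match.start(1)
        if start < n:
            sites.append(((start + 2) % n, match.group(1)))
    return sites


if __name__ == "__main__":
    print(find_drach_sites("CCGGACUCC"))
    print(find_drach_sites("ACUCCGG", circular=True))
```

A motif match is of course only a candidate: it says where methylation could occur, not that it does.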

As depicted below, this m6A-driven translation requires initiation factor eIF4G2 and m6A “reader” YTHDF3. Experiments showed that this translation is enhanced by methyltransferase METTL3/14 and inhibited by demethylase FTO, which enzymatically “add” and “subtract” methyl (Me) groups on specific adenosines (A) in circRNAs, respectively. This translation has also been shown to be upregulated upon heat shock, which is a commonly employed method to induce “stress” in cells.

Taken from Yang et al.

Further analyses through polysome profiling, computational prediction and mass spectrometry revealed that m6A-driven translation of circRNAs is widespread, with hundreds of endogenous circRNAs having translation potential. Yang et al. concluded by stating that their “study expands the coding landscape of [the] human transcriptome, and suggests a role of circRNA-derived proteins in cellular responses to environmental stress.”

Zinc Finger Protein in Muscle Cell Development

Finally, and essentially contemporaneously with the two above-mentioned publications, a third independent investigation reported by Legnini et al. demonstrated selective circRNA downregulation using short-interfering RNAs (siRNAs). These reagents for RNA interference (RNAi) were used in an image-based functional genetic screen of 25 circRNA species, conserved between mouse and human, that are differentially expressed during myogenesis (i.e. formation of muscular tissue) in Duchenne muscular dystrophy myoblasts.

This siRNA/RNAi-based functional analysis provided one interesting case related to zinc finger protein 609 (circ-ZNF609)—a reported miRNA sponge—the phenotype of which could be specifically attributed to the circular form and not to the linear mRNA counterpart. Consistent with the circ-ZNF609 sequence having an ORF, they found that a fraction of circ-ZNF609 RNA is loaded onto polysomes and that, upon puromycin treatment, it shifted to lighter fractions, similar to mRNAs. The coding ability of this circRNA was proved through use of artificial constructs expressing circular tagged transcripts, and by CRISPR/Cas9—the trendy gene editing method about which I’ve already commented multiple times.

Despite all this evidence, Legnini et al. stated that they “have no hints on the molecular activity of the proteins derived from circ-ZNF609 and as to whether they contribute to modulate or control the activity of the counterpart deriving from the linear mRNA.”

In thinking about closing comments about this update in circRNA, I decided to emphasize that investigations in the field of RNA continue to reveal complexities that will require many more years of global attention to unravel and understand. In just the past decade or so we’ve learned about gene regulation by miRNA/siRNA, reclassification of “junk DNA” as encoding a myriad of long noncoding RNA (lncRNA), mRNA regulation by base-modifications, and curious circRNAs that are more than sponges, and likely encode hundreds (if not thousands) of proteins whose functions have yet to be elucidated. Amazing!

What are your thoughts about all of this?

Your comments are welcomed.

Postscript

After this blog was written, Panda et al. at the National Institute on Aging-Intramural Research Program, National Institutes of Health published a paper titled High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs. Here’s a snippet of the abstract, which adds to the increasingly curious occurrence of circRNAs that begs, if you will, further research aimed at discovering functions of circRNA-derived proteins.

“Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions.”

Nanopore Sequencing by Synthesis (Seq-by-Syn)

  • Yet Another Notable Achievement Involving George Church, ‘The Most Interesting Scientist in the World’ 
  • Team of 30 Coauthors Reports Seq-by-Syn with DNA Polymerase-Nanopore Protein Construct on an Integrated Chip
  • Challenging Improvements Needed for Commercial Reality

Prof. George M. Church. Taken from evolutionnews.org

Devotees of my blog will know that I’m prone to word play such as calling myself a “huge” fan of “tiny” nanopores for DNA sequencing, about which I’ve previously opined. They will also recall that I’m an admitted scientific admirer of George Church, who I think is The Most Interesting Scientist in the World.

Having said this, it’s not surprising that I closely follow what’s trending in nanopore sequencing, and also make an attempt to read all of Church’s papers as they get published because they are almost invariably quite interesting, involve “big ideas,” and in some new way are very educational, at least for me. Following are my comments about a recently published paper on nanopore sequencing in venerable Proceedings of the National Academy of Sciences of the United States of America (aka PNAS) wherein Church is the designated corresponding author.

Backstory

The seminal origins and early history of nanopore sequencing have been recently chronicled and criticized, then clarified, in Nature Biotech in several “To the Editor” items, which collectively provide enlightening insights into who did what when, so to speak. Those of us who are ‘Nanoporati’ (a clever term tweeted by Nick Loman) should definitely read those Nature Biotech items. For now, however, I’ll set the stage, as it were, by echoing a bit of what I’ve posted in the past for nanopores.

Nanopore sequencing of DNA, in the form of patented but prophetic (i.e. no data) methods, is actually a relatively old (~20 year) idea posited by Church and other creative visionaries. On the other hand, nanopore sequencing was first reduced to practice commercially not too long ago by Oxford Nanopore Technologies (ONT). The many years of delay between concept and commercialization were due to the need for gradual evolution of lots of “nanopore-ology” and sequencing biochemistry, as well as developing highly sophisticated electronics and complex algorithms for data analysis.

Nanopore Sequencing-by-Scanning (Seq-by-Scan)

Taken from rsc.org

As depicted below, and as can be best seen in a video, ONT’s commercially available MinION Seq-by-Scan system essentially involves threading a strand of DNA through a protein-based nanopore and converting resultant ionic current fluctuations into nucleotide base sequence.

While there are issues with base-calling accuracy, the remarkably small and readily portable MinION provides fast, real-time sequencing results for a wide variety of applications. These include unique or otherwise compelling Point-of-Care analyses, such as pathogen surveillance, which has been achieved in remote geographical locations and even in outer space aboard the International Space Station, as I’ve previously posted.

Nanopore Seq-by-Syn

In contrast to DNA Seq-by-Scan using a nanopore, which is challenged by pore-based differentiation of similarly sized A, G, C, and T bases, DNA Seq-by-Syn has no such limitation as it uses the DNA as a template for base-by-base (i.e. stepwise) detection of enzymatic synthesis of complementary DNA. Various Seq-by-Syn methods and challenges have been discussed elsewhere, and currently available commercial systems include those from Illumina and PacBio. The former employs nucleotides that are reversible terminators equipped with cleavable fluorescent “tags” on each base. The latter detects fluorescently labeled tags on polyphosphates released upon nucleotide incorporation.

The presently featured DNA Seq-by-Syn publication by Stranges et al., which builds upon two earlier reports cited therein, differs from the above approaches by using nanopore-based detection of mass tags rather than fluorescent tags. In principle, mass tags could afford higher accuracy compared to DNA Seq-by-Scan. However, as will now be explained, achieving improved accuracy is far easier said than done.

The general approach taken to demonstrate proof-of-concept for mass-tagged nanopore DNA Seq-by-Syn is depicted below in simplified cartoon form, but involves a true tour de force—in my opinion—of three key technologies. The first is design and synthesis of the nucleotides with appropriate mass tags, which involves very sophisticated chemistry that is best appreciated by reading detailed, extensive supporting information (SI) for Stranges et al. and SI for an earlier publication by Fuller et al. In a nutshell, these nucleotides have 5’-hexaphosphates linked to relatively large mass tags comprised of complex oligonucleotide structures.

Taken from Stranges et al. PNAS 2016

The second area of technical innovation involves attachment of a single molecule of ϕ29 DNA polymerase to each α-hemolysin (αHL) nanopore in such a manner as to retain its enzyme activity and be positioned such that every released mass tag transits through (i.e., is “captured” by) the nanopore, leading to base identification by its current signature. As depicted below in two related representations, each of these heteroheptameric pores is comprised of one modified αHL subunit to which a peptidyl SpyTag moiety is attached, and six unmodified αHL subunits. This allows attachment of one ϕ29 DNA polymerase molecule modified with a cognate peptidyl SpyCatcher moiety at a predetermined, time-average distance from the pore.

Taken from Stranges et al. PNAS 2016.

The third key area of innovation deals with insertion of the enzyme-pore conjugate into a lipid bilayer residing on a silanized array (aka chip) of 256 Ag/AgCl electrodes such that there is one functional pore per electrode. Interested readers are encouraged to consult the publication for details, as well as check out related fabrication and methods patents that I found by searching Google Scholar.

Representative Results

The first image shown above depicts what base tag-specific detection would ideally look like if each of the four different bases would have a characteristic current-blockage intensity and persistence. In addition, all pores would ideally function similarly. Not surprisingly, given the stochastic nature of single-molecule systems in general, Stranges et al. found less than ideal behavior.
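To make that idealized picture concrete, here is a minimal sketch of how a base could be assigned from a measured current level if each tag really did have its own well-separated level. The level values, the tolerance, and the names `TAG_LEVELS`, `call_tag` and `call_trace` are all illustrative assumptions, not numbers or methods taken from Stranges et al.

```python
# Hypothetical characteristic current levels (as a fraction of open-pore
# current) for the four tagged nucleotides. Invented for illustration only.
TAG_LEVELS = {"A": 0.20, "C": 0.35, "G": 0.50, "T": 0.65}


def call_tag(level, max_distance=0.05):
    """Return the base whose characteristic level is nearest to `level`,
    or None if no level lies within `max_distance` (treated as no capture)."""
    base, ref = min(TAG_LEVELS.items(), key=lambda kv: abs(kv[1] - level))
    return base if abs(ref - level) <= max_distance else None


def call_trace(levels):
    """Call every measured level and drop the ones that cannot be called."""
    calls = (call_tag(x) for x in levels)
    return "".join(c for c in calls if c is not None)


print(call_trace([0.20, 0.36, 0.90, 0.64]))  # prints "ACT"
```

In this toy version a reading far from every level is simply discarded; a real system would also have to deal with capture duration, noise, and pore-to-pore variation.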

For example, out of 70 single pores obtained, 25 captured two or more tags, whereas only six of those pores showed detectable captures of all four tagged nucleotides. Data obtained for the pore with the most transitions between tag capture levels (i.e. the best results) is shown below, while results for the other five are given in the SI.
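As a rough sketch, the pore counts quoted above can be turned into percentages. The variable and function names are illustrative only, and the figures are just the counts from the preceding paragraph.

```python
# Counts quoted in the text for the single pores obtained.
total_pores = 70
multi_tag_pores = 25   # captured two or more tags
all_four_pores = 6     # captured all four tagged nucleotides


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


print(pct(multi_tag_pores, total_pores))     # 35.7
print(pct(all_four_pores, total_pores))      # 8.6
print(pct(all_four_pores, multi_tag_pores))  # 24.0
```

In other words, fewer than one in ten of the single pores obtained showed captures of all four tags.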

Taken from Stranges et al. PNAS 2016

To quote the authors:

“All four characteristic current levels for the tags and transitions between them can be readily distinguished…Homopolymer sequences in the template, and repeated, high-frequency tag capture events of the same nucleotide in the raw sequencing reads were considered a single base for sequence alignment. We recognized 12 clear sequence transitions in a 20-s period. Out of the 12 base transitions observed in the data, 85% match the template strand, showing that this method can produce results that closely align to the template sequence.” 

Interested readers need to consult and carefully read the SI for Stranges et al. regarding the interpretation of the “repeated, high frequency capture events,” such as that exhibited by C in the above current vs. time plot.
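The idea in the quote of treating repeated captures of the same nucleotide as a single base can be sketched as a run-length collapse of both the raw read and the template, followed by a comparison. This is a simplified illustration under my own assumptions: the function names are hypothetical, and the position-by-position comparison stands in for a proper alignment, which the authors describe in their SI.

```python
from itertools import groupby


def collapse_repeats(seq):
    """Collapse each run of identical letters to one letter, e.g. AACCC -> AC."""
    return "".join(base for base, _ in groupby(seq))


def transition_match_fraction(read, template):
    """Fraction of positions that agree after collapsing repeats in both
    sequences. A naive position-wise comparison, not a true alignment."""
    r = collapse_repeats(read)
    t = collapse_repeats(template)
    n = min(len(r), len(t))
    if n == 0:
        return 0.0
    matches = sum(a == b for a, b in zip(r, t))
    return matches / n


print(collapse_repeats("AACCCGTT"))                  # prints "ACGT"
print(transition_match_fraction("AACCGGT", "ACGT"))  # prints 1.0
```

A consequence of this collapse is that homopolymer lengths in the template are not recoverable from the read, so any accuracy figure computed this way refers to transitions rather than to individual bases.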

All of the above snippets in aggregate suggest to me that, while this huge amount of work has made progress toward one approach to Seq-by-Syn, many improvements will need to be made before achieving a robust system to successfully compete in the commercial sector.

Authorship, Affiliations, and Acknowledgments

The relatively large team of 30 coauthors listed for Stranges et al. includes the following numbers of investigators and affiliations: 1 at Arizona State Univ., 4 at Harvard, 11 at Columbia University, and 14 at Genia Technologies, which is a Santa Clara, CA company that was acquired by Roche in 2014, and is part of Roche Sequencing.

Acknowledgments in Stranges et al. refer to support by Genia and NIH Grant R01 HG007415, which I found was awarded to coauthors George M. Church (Harvard), Jingyue Ju (Columbia), and James J. Russo (Columbia). The end of the abstract of this grant reads as follows:

“The nanopore chips will be enhanced and expanded from the current 260 nanopores to over 125,000 using advanced nanofabrication techniques. We will conduct real-time single molecule Nano-SBS on DNA templates with known sequences to test and optimize the overall system. These research and development efforts will lay the foundation for the production of a commercial single molecule electronic DNA sequencing platform, which will enable routine use of sequencing for medical diagnostics and personalized medicine.”

The conflict of interest statement in Stranges et al. indicates that the technology described therein (called “Nanopore SBS”) has been exclusively licensed by Genia, and that specified coauthors are entitled to royalties through this license. In addition, Church is a member of the Scientific Advisory Board of Genia.

Parting Comments

Long gone are the days when government-funded academic researchers thumbed their noses, if you will, at commercial development. Nowadays almost all academics parlay their government grants into university patents that get licensed to companies, usually with some type of corporate involvement of said academics.

I hasten to add that I’m not implying that NIH-funded academic research being a “seed” for corporate profitability is negative, especially in view of its Small Business Innovation Research (SBIR) program, but rather view it as a paradigm shift for the better, as it allows academic creativity to be harnessed into applications that can hopefully greatly benefit society.

In conclusion, and coming back to George Church, who I highlighted in the introduction to this blog, I must say that he might very well be the academic researcher with the longest list of technology transfer, advisory roles, and founded companies—13 to date—according to a public list that is truly mind boggling, at least to me.

As usual, your comments are welcomed.

Postscript

After I wrote this blog, Roche announced on December 15, 2016 that “it has officially notified Pacific Biosciences (PacBio) of its intention to terminate its [2013] agreement and efforts to develop a sequencing instrument for use in the clinical research and clinical market using their Single Molecule, Real-Time (SMRT®) technology,” about which I have commented previously. The announcement went on to say Roche would instead “focus on internal development efforts” and “actively pursue multiple technologies and commercial strategies.” A GenomeWeb headline was more specific: “Roche Will Focus on Genia’s Nanopore Technology for Dx Market After Ending Deal With PacBio.”

On December 30, 2016 it was reported that the University of California (UC) filed a patent suit against the Chief Technology Officer (CTO) at Genia, and Genia Technologies, claiming the CTO produced key inventions during his time at UC that he later assigned to Genia, but which should have automatically been assigned to UC. Stay tuned…

Frightening Fungus Among Us

  • Clinical Alert for Candida auris (C. auris) Issued by CDC
  • US Concerned About C. auris Misidentification and Drug Resistance
  • Sequencing C. auris DNA in Clinical Samples is Preferred for Identification

Strain of C. auris cultured in a petri dish at CDC. Credit Shawn Lockhart, CDC. Taken from foxnews.com

When I was a kid and didn’t know better, there was a supposedly funny rhyme that “there’s fungus among us.” While this saying is thankfully passé nowadays, the growing number of infections by a formerly obscure but deadly fungus is frightening. This so-called “superbug” is a drug-resistant fungus called Candida auris (C. auris) that’s worth knowing about, and is the fungal focus of this blog.

First, Some Fungus Facts

Fungi are so distinct from plants and animals that they were allotted a biological ‘kingdom’ of their own in classification of life on earth, although that was only relatively recently, i.e. 1969. There are 99,000 known fungi, which exist in a wide diversity of sizes, shapes and complexity that extends from relatively simple unicellular microorganisms, such as yeasts and molds, to much more complex multicellular fungi, such as mushrooms and truffles.

It was previously thought that genomes of all fungi are derived from the genome of the model fungus Saccharomyces cerevisiae, which has been used in winemaking, baking and brewing since ancient times. However, genome sequencing of more than 170 fungal species has revealed that, while the genome size of S. cerevisiae is only ~12 Mb, seven species of fungus have genome sizes larger than 100 Mb. This is attributed to various evolutionary pressure-factors generating transposable elements, short sequence repeats, microsatellites, genome duplication, and noncoding DNA.

Fungal cell walls are made up of intertwined fibers mostly comprised of long chains of chitin, the same tough compound found in the exoskeletons of animals such as spiders, beetles and lobsters. The chitin in fungal cells is entangled with glucans and other wall components, such as proteins, forming a mass that protects the cell membrane behind it and poses a formidable barrier against antifungal drugs.

Taken from Wikipedia.org

In researching whether there are any nucleic acid drugs against fungi, I found one early patent by Isis (now Ionis) Pharmaceuticals for use of antisense phosphorothioate-modified oligonucleotides for the treatment of Candida infections, but virtually no other reports. I suspect that will change in the future as pathogenic fungi and other disease-causing microbes become more resistant to conventional drugs.

Fungal infections of the skin are very common and include athlete’s foot, jock itch, ringworm, and yeast infections. While these can usually be readily treated, infections caused by pathogenic fungi have reportedly risen drastically over the past few decades. Moreover, with the increase in the number of immunocompromised (burn, organ transplant, chemotherapy, HIV) patients, fungal infections have led to alarming mortality rates due to the ever-increasing phenomenon of multidrug resistance.

Segue to a Serious Situation

Emergence of drug-resistant fungi is, in part, the segue to the serious story of the present blog. The other part is the incorrect identification of a certain fungus as a common Candida yeast, which is not only scary but seemingly inexcusable in today’s era of highly accurate PCR-based assays for identifying microorganisms. Here’s the situation in a nutshell.

C. auris infection, which is associated with high mortality and is often resistant to multiple antifungal drugs, was first described in 2009 in Japan but has since been reported in countries throughout the world. Unlike many Candida infections, C. auris is a hospital-acquired infection that is contracted from the environment or staff of a healthcare facility, and it can spread very quickly.

To determine whether C. auris is present in the United States and to prepare for the possibility of transmission, the Centers for Disease Control and Prevention (CDC) issued a clinical alert in June 2016 requesting that C. auris cases be reported.

(A) MALDI-TOF schematic; (B) mass spectra from three C. parapsilosis; and (C) two C. bracarensis isolates. Taken from researchgate

This official alarm bell, if you will, was triggered by the following facts:

  • Many isolates are resistant to all three major classes of antifungal medications, a feature not found in other clinically relevant Candida species.
  • C. auris identification requires specialized methods such as MALDI-TOF mass spectrometry or sequencing the 28s ribosomal DNA, as pictured below.
  • Using common methods, C. auris is often misidentified as other yeasts, which could lead to inappropriate treatments.

The CDC subsequently found that seven cases had been identified in Illinois, Maryland, New York and New Jersey. Five of seven isolates were either misidentified initially as C. haemulonii or not identified beyond being Candida. Five of seven isolates were resistant to fluconazole; one of these isolates was resistant to amphotericin B, and another isolate was resistant to echinocandins. While no isolate was resistant to all three classes of antifungal medications, emergence of a new strain of C. auris that is resistant to all three would pose a serious public health issue.

Sequencing 28s ribosomal DNA. Taken from microbiologiaysalud.org

Based on currently available information, the CDC concluded that these cases of C. auris were acquired in the U.S., and several findings suggest that transmission occurred:

  • First, whole-genome sequencing results demonstrate that isolates from patients admitted to the same hospital in New Jersey were nearly identical, as were isolates from patients admitted to the same Illinois hospital.
  • Second, patients were colonized with C. auris on their skin and other body sites weeks to months after their initial infection, which could present opportunities for contamination of the health care environment.
  • Third, C. auris was isolated from samples taken from multiple surfaces in one patient’s health care environment, which further suggests that spread within health care settings is possible.

A related Fox News story adds that C. auris was found on a patient’s mattress, bedside table, bed rail, chair, and windowsill. Yikes!

While the above situation in the U.S. might not seem particularly worrisome to you, the potential for emergence of more infectious C. auris strains with higher lethality should be of concern. That has already reportedly occurred in several Asian countries and South Africa. Obviously, deployment of the best available methods for pathogen identification can, in principle, lessen the likelihood of the emergence and/or spread of C. auris in the U.S. and other countries.

Case for Point-of-Care C. auris Nanopore Sequencing?

Taken from extremtech.com 

Regular readers of my previous blogs know that I’m an enthusiastic fan of the Oxford Nanopore Technologies MinION sequencer, which is proving to be quite useful for characterizing pathogens in very remote regions on Earth, and even on the International Space Station to diagnose astronaut infections! Notwithstanding various current limitations for MinION sequencing of microbes, it seems to me that it would be relatively straightforward to generate MinION data for many available samples of pathogenic fungi and genetically related microbes to assess the feasibility of using MinION for faster, cheaper, better unambiguous identification of C. auris in centralized or Point-of-Care applications.

Taken from rnaseq.com

If you think this suggestion is farfetched, think again, after checking out these 2016 publications using MinION:

The 51.4-Mb genome sequence of Calonectria pseudonaviculata for fungal plant pathogen diagnosis was obtained using MinION.

The first report of the ~54 Mb eukaryotic genome sequence of Rhizoctonia solani, an important pathogenic fungal species of maize, was derived using MinION.

Sequence data is generated in ~3.5 hours, and bacteria, viruses and fungi present in the sample of marijuana are classified to subspecies and strain level in a quantitative manner, without prior knowledge of the sample composition.

CDC on C. auris Status and FAQs

In the interest of concluding this blog with the most up-to-date and authoritative information, I consulted the CDC website and found statements and replies to FAQs that are well worth reading at this link.

As a scientist, my overriding question concerns the lack of adoption of improved microbiological methods by hospitals and clinics. The above noted misidentifications of C. auris infections resulting from use of flawed lab analyses seem unacceptable. Although I don’t know all the facts or statistics to generalize, I suspect that there are other incorrect lab analyses due to use of outdated methods. On the other hand, I’m hopeful that, with the FDA’s widely touted Strategic Plan for Moving Regulatory Science into the 21st Century, the section entitled Ensure FDA Readiness to Evaluate Innovative Emerging Technologies (think nanopore sequencing) becomes actionable, sooner rather than later.

Changing established (dare I say entrenched) clinical lab tests is not simple or easy, but if it doesn’t begin it won’t happen, about which I’m quite certain. I can only wonder why development of infectious disease analytical methods and treatments seems to require a crisis. Sadly, I think it boils down to the complexities and socio-political dynamics of who pays.

Frankly, it’s my personal opinion that maybe it’s time Thomas Jefferson’s philosophy about hammering guns into plows is directed to health care.

Postscript

After writing this blog, I learned that T2 Biosystems has received FDA approval to market in the U.S. the first direct blood test for detection of five yeast pathogens that cause bloodstream infections: Candida albicans and/or Candida tropicalis, Candida parapsilosis, Candida glabrata and/or Candida krusei.

Yeast bloodstream infections are a type of fungal infection that can lead to severe complications and even death if not treated rapidly. Traditional methods of detecting yeast pathogens in the bloodstream can require up to six days, and even more time to identify the specific type of yeast present. The T2Candida Panel and T2Dx Instrument (T2Candida) can identify these five common yeast pathogens from a single blood specimen within 3-5 hours.

T2Candida incorporates technologies that break the yeast cells apart, releasing the DNA for PCR amplification for detection by greatly simplified, miniaturized nuclear magnetic resonance (NMR) technology, as can be seen in this video.

In my opinion, this fascinating new technology is another example of what could be rapidly deployed toward detecting C. auris.

Sequencing Trifecta for Top 10 Innovations of 2015

  • Sequencing Sweeps The Scientist’s Top 3
  • Diverse Array of Research and Diagnostic Products Round Out Top 10
  • I Predict 3 Winners for 2016. What Are Yours?

Taken from the-scientist.com.

Welcome to my first blog of the New Year, 2016! There is a trove of topics in my queue of blogs, and I invite you to check them out every other Tuesday throughout the year. As in the past, this first blog of the year comments on the Top 10 Innovations in 2015 that were picked by a panel of judges and published last month in The Scientist. As a side note, you can also peruse TriLink’s top products of 2015 and predictions for 2016 by clicking here.

When you read about these winners, you’ll find out that 1st, 2nd and 3rd place all involve sequencing (a trifecta, in the parlance of parimutuel betting on horse races), which was kind of a sure thing (to continue my analogy to betting) given that sequencing products were also in the top spots in the previous year’s picks. This preeminence of sequencing will likely continue, as I’ll explain at the end of this blog with my win, place and show bets for next year.

Taken from wikipedia.org.

Big, Bigger, Biggest—Genomics Projects Go Democratic

  • 1,000 Genomes Project is Big
  • 10,000 Genomes Project is Bigger
  • 100,000 Genomes Project is Biggest—so Far
  • Will 1,000,000 Genomes be Next?

This blog on genomics projects going democratic has—rest assured—nothing to do with US presidential election politics that are already receiving (too much) 24/7 coverage—but rather genomics going from singular to pluralistic. Let me frame this revolutionary change another way to clarify: the much heralded sequencing of “the human genome” (singular) announced in 2001—by competing public and private initiatives—used mixtures of DNA from multiple donors, i.e. “the genome” was actually “the genomes,” all of which are different—in some way. These differences are what make each of us genetically unique. Consequently—and enabled by ever faster and cheaper DNA sequencing—there are increasingly large projects aimed at identifying these genetic variations (aka genotypes or polymorphisms) for association with health or disease status (aka phenotypes). To me, this fundamentally important trending science is definitely blogworthy.

Populations are comprised of genetically unique individuals. Taken from my Wakulla.com.

Small RNA is Big Science

  • Most Top-5 Citations in Clinical Chemistry are MicroRNA (miRNA) Biomarkers
  • miRNA Biomarker Bonanza is Predicted by Panel of Experts, Although No miRNA Biomarkers Have Yet Been Approved by FDA
  • Plethora of Potential Short Regulatory RNA Exists Beyond the Typical miRNA Microcosm

I’m always looking for new and hopefully engaging topics to comment on, and a recent “Best of Clinical Chemistry” item featured in a special issue of Clinical Chemistry definitely caught my attention. I wasn’t surprised by MIQE Guidelines being at the top, given that these are the “bible” for doing accurate quantitative PCR (qPCR) that has become a seemingly ubiquitous molecular assay for clinical studies. However, I was totally surprised that the next four “best of” all involved microRNA (miRNA)! Hence today’s blog about these small RNA being big science, play on words intended (although properly speaking I should say short rather than small).

Profiling Pseudouridine

  • Two New Methods for Sequencing Pseudouridine Leverage Old Chemistry
  • New Methods Reveal ‘Rewiring’ of Genetic Code by Post-Transcriptional Pseudouridination
  • Exciting Future for New Analytical Methods for Modified mRNA
  • Be Sure to Read the Very End of the Blog for a Special Offer!

At the risk of seeming enamored with pseudouridine, which I previously proclaimed (with justifications) to be The 2014 Modified Nucleobase of the Year, recent reports about this fascinating base lead me now to feature it here once again. In that past post, it was pointed out that uridine, which is incorporated into all RNA during transcription of genomic DNA, differs from pseudouridine (historically abbreviated by the Greek symbol Ψ) by how one nitrogen (shown in red below) switches place with a carbon for bonding to the ribose ring. It was also noted that this switch has been long known to be carried out after transcription (aka post-transcriptionally) by an enzyme called, appropriately, pseudouridine synthase, the exact mechanistic details for which remain controversial. This post-transcriptional process that converts U to Ψ at specific positions in RNA is called pseudouridination.

The Most Interesting Scientist in the World: George M. Church

  • Mind Boggling Breadth and Significance of Scientific Publications
  • Serial Entrepreneur and Science Advisor to Many Companies
  • Radical Advocate of Total Openness for Personal Genomics

While seeing for the umpteenth time a Dos Equis beer commercial featuring The Most Interesting Man in the World, I was suddenly inspired to write a blog about The Most Interesting Scientist in the World. After scrolling and polling my memory to decide who that would be, it was an easy decision to pick George M. Church, professor of genetics at Harvard. As I’ll briefly highlight herein, Prof. Church’s contributions continually span a mind boggling spectrum of science that cuts across academic theory, ground breaking “how to” methods, serial entrepreneurship, and—perhaps most importantly—radical openness for personal genomics.

George M. Church and The Most Interesting Man in the World: ‘I don’t always read science, but when I do it’s by George M. Church.’ (taken from Bing Images)
