Phoenix in Mythology and Sequencing

  • Like a Phoenix, Helicos Sequencing is Being Reborn
  • Direct Genomics in China to Launch the Genocare Clinical Sequencer
  • SeqLL in the USA to Launch Benchtop tSMS Sequencer

A phoenix as depicted by F.J. Bertuch (1747–1822). Taken from Wikipedia.org

In ancient Greek mythology, a phoenix is a bird that is cyclically regenerated or reborn by arising from the ashes of its predecessor, which dies in a show of flames and combustion. In contrast to a phoenix, modern biotech methods generally “die” in utility by being displaced with faster, better, and/or cheaper methods rather than undergoing “rebirth” in the context of a new application. However, a method developed by a company named Helicos (scarily close to Helios, who is associated with the phoenix) may prove to be a rare exception. Perhaps this is destiny, but I digress…

Helicos Sequencing

Successful Sanger sequencing of a human genome in the early 2000s spawned numerous efforts to develop faster, better, and/or cheaper methodology to enable genomic analysis on a routine basis. Among the early contenders was Helicos BioSciences, which was founded in 2003 by several principals including the then (and still) uber-famous Stephen Quake.

Helicos sequencing technology, which is depicted below and outlined elsewhere, was especially attractive because it was “true” single-molecule sequencing (i.e. sample prep did not require prior PCR or other amplification, thus greatly simplifying the workflow). Moreover, the technology uniquely allowed direct RNA sequencing, thus obviating the need to first convert RNA into cDNA.

Main steps for primer(P)-based, single-color (Cy3 dye) Helicos sequencing, in this example using two passes. Taken from Harris et al. Science (2008)

3’-Unblocked reversible terminator. Taken from Chen et al. (2013)

Details for how this sequencing-by-synthesis occurs can be read in various proof-of-concept publications. However, it’s worth noting here that the 3’-unblocked reversible terminator nucleotide triphosphate monomers have a cleavable linker attached to a detectable dye. Helicos referred to these as “Virtual Terminator” nucleotides since they are efficiently incorporated by a polymerase yet block incorporation of a second nucleotide on a homopolymer template.
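To make the role of Virtual Terminator nucleotides more concrete, here is a minimal Python sketch of single-color sequencing-by-synthesis. It is a toy model, not Helicos software: the function name `sequence_by_synthesis`, the C/T/A/G flow order, and the example template are my own assumptions for illustration. It ignores strand orientation, errors, and dye cleavage, and simply shows that each flow adds at most one base, so a homopolymer run is read one base per round of flows.

```python
# Toy model of single-color sequencing-by-synthesis with reversible terminators.
# Assumptions (not from the post): flow order, template, and function name.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, flow_order="CTAG", cycles=12):
    """Return the bases read from `template`, one nucleotide flow per cycle.

    In each cycle a single labeled nucleotide type is offered. If it pairs
    with the next unread template base, exactly one base is incorporated and
    recorded; the terminator prevents a second incorporation in that cycle,
    even when the template contains a homopolymer run.
    """
    read = []
    position = 0  # index of the next unread template base
    for cycle in range(cycles):
        if position >= len(template):
            break  # template fully read
        nucleotide = flow_order[cycle % len(flow_order)]
        if COMPLEMENT[template[position]] == nucleotide:
            read.append(nucleotide)
            position += 1  # at most one incorporation per cycle
    return "".join(read)


if __name__ == "__main__":
    # "GG" is a homopolymer: its two complementary C's need two separate C flows.
    print(sequence_by_synthesis("GGA"))
    print(sequence_by_synthesis("GGA", cycles=4))
```

With the template GGA, the two G's are read as C in two separate rounds of flows rather than in a single flow, which is the behavior the terminator is meant to enforce.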

So, with these methodological advantages going for it, why did Helicos file for bankruptcy in 2012? Press coverage at that time stated ‘rough financial sledding and tough competition from rival next-generation sequencing companies.’ In my humble opinion, this lack of commercial success was primarily due to the HeliScope Genetic Analysis System (pictured below) being way too big (think upright freezer-refrigerator), far too expensive ($1,350,000), and its ~35-base reads too short on performance—pun intended.

Two Phoenix-Like Versions of Helicos Sequencing

Fast forwarding about five years from the 2012 bankruptcy filing by Helicos brings us to recent reports of two independent efforts to bring back Helicos sequencing in commercially viable formats and contexts: think Phoenix rising from the ashes.

Jiankui He and the GenoCare sequencer (credit Xinjie Tian). Taken from Cyranoski Nature Biotechnology (2016)

The first of these is led by Jiankui He, Founder/CEO of Direct Genomics in Shenzhen, Guangdong, China, as well as Associate Professor at South University of Science and Technology of China in Shenzhen. He, coincidentally, was a postdoc with Helicos cofounder Stephen Quake, who is reported to lead the scientific advisory board for this new company.

The company’s website homepage states the following:

“Direct Genomics is providing physicians with the first single molecule sequencer built exclusively for the clinic. The technology simplifies genome sequencing by reading individual DNA and RNA molecules directly from patient’s blood or tissue samples, which delivers significant improvement in cost and speed. Together with clinicians, Direct Genomics is making genetics an affordable part of everyday patient care.”

Perusal of scant technical information on the company’s website suggests to me that a smaller sized, TIRF-optics-enabled instrument running Helicos-type sequencing has been developed. A story about Direct Genomics by David Cyranoski in Nature Biotechnology states that $100 “clinical sequencing” is being targeted, with a blood-draw-to-report turnaround time of 20 hours. A very recent publication I found provides details for resequencing the Escherichia coli genome by the Direct Genomics platform named GenoCare.

The company’s website lists the following clinical applications:

  • Non-invasive prenatal testing (NIPT)
  • Tumor diagnosis
  • Early-stage cancer prediction
  • Pre-implantation genetic diagnosis (PGD)

The second Phoenix-like rebirth of Helicos sequencing has been developed by SeqLL, which was co-founded in 2013 by William St. Laurent and Daniel Jones, who previously held various technical positions at Helicos. Statements and a video on SeqLL’s website indicate to me that the sequencing technology is essentially that originally developed and patented by Helicos, which is still trademarked as True Single Molecule Sequencing (tSMS™).

William St. Laurent / Daniel Jones. Taken from seqll.com

SeqLL has been operating as a tSMS™ service provider, but in October 2016 announced the launch of the tSMS™ System Early Access Program giving researchers access to its new benchtop system “designed to deliver unparalleled quantitative RNA and specialty DNA sequencing results to both academic and industry research partners.” I should add that a big, strong bench is needed given that the physical specs are 30 x 30 x 60 inches and 1,000 pounds! Nevertheless, SeqLL recently announced an SBIR grant for improving its direct RNA sequencing technology, which I think could prove to be a driver for adoption.

In conclusion, I think it’s very interesting to see Helicos sequencing coming back to life, if you will, in not one but two different commercial contexts, both of which will hopefully be successful. This despite current ‘tough competition from rival next-generation sequencing companies,’ as observed in the 2012 bankruptcy story about Helicos mentioned above. First and foremost among that competition is Oxford Nanopore, which I’ve blogged about previously, and which offers single-molecule sequencing that seems to me to be faster, better, and cheaper for both DNA and RNA, directly.

As usual, your comments are welcomed.

Postscript

After this blog was written, it was reported in GenomeWeb that Direct Genomics plans to deliver 50 instruments this year to SinoTech Genomics, a startup based in Shanghai that offers both clinical and research sequencing services. Direct Genomics CEO Jiankui He is quoted as saying that ‘SinoTech Genomics [is] committed to ultimately purchasing 700 GenoCare platforms,’ and that Direct Genomics ‘has the capacity of producing around 1,000 GenoCare instruments per year,’ which would be very impressive based on past operational experience with manufacturing Sanger sequencers at ABI.

The piece went on to report that Direct Genomics also ‘aims to launch GenoCare in the US in September.’ Regarding what’s inside the box, so to speak, ABI veteran Bill Efcavitch, who previously served as chief technology officer of Helicos, is quoted as saying that ‘the main difference between the former Helicos technology and the GenoCare platform is in the hardware. It’s completely different engineering.’ He added, however, that it still uses Helicos’ virtual terminator chemistry.

Ocean ‘Dandruff’ DNA to Better Study Marine Biology

  • DNA Barcoding for all Organisms has Numerous Applications
  • DNA Barcodes from Water Samples Greatly Aid Marine Biologists
  • Aquatic Environmental DNA (eDNA) Proves to be Informative ‘Dandruff’

Human DNA identity analysis is now commonplace methodology that’s frequently featured in newspaper stories, TV crime series, or “whodunit” movies. The same principle (i.e. using a characteristic DNA pattern or signature) applies to identification of all animals, birds, insects, and microbes. Actually, DNA barcoding extends to any organism, whether it is alive or has been dead for hundreds of thousands of years (so long as it’s preserved by fossilization).

Taken from gajitz.com

Marine biologists face a serious challenge in accounting for very diverse forms of marine life that exist in a mind-bogglingly huge volume of water. Consequently, it’s not surprising that analysis of water-borne, marine DNA barcodes (as proxies for going out and counting fish) is rapidly trending in utility and importance. Known formally as environmental DNA (eDNA), the aquatic version has been humorously referred to as ocean ‘dandruff’ by Christopher Jerde of the University of Nevada in Reno (which, ironically, is landlocked and distant from any ocean). But I digress. Before diving further (pun intended) into ocean dandruff, let’s briefly review the background of DNA barcoding.

DNA Barcodes 101

Prof. Paul Hebert. Taken from uoguelph.ca

In 2003, Prof. Paul Hebert and coworkers in the Department of Zoology at the University of Guelph in Canada published a seminal study titled Barcoding animal life: cytochrome c oxidase subunit 1 (CO1) divergences among closely related species that fundamentally changed the field of taxonomy. In a nutshell, Hebert’s team showed it was feasible to classify millions of species based only on DNA sequence of the mitochondrial gene CO1. In the intervening, relatively short amount of time, there have been thousands of publications dealing with applications and extensions of this concept, which is now recognized to be very powerful and promising albeit with some limitations.

Typically, DNA barcodes are identified by sequencing after PCR amplification of one or more specific genetic loci such as CO1. Following proof that a DNA barcode can differentiate the species of interest, single- or multiplex quantitative PCR (qPCR) can be used to enumerate relative amounts of sample from the field.
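
The matching step behind barcode identification can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sequences are invented 10-base strings rather than real CO1 data, the names `percent_identity` and `best_match` are hypothetical, and real pipelines align much longer reads against curated reference databases instead of comparing equal-length strings position by position.

```python
# Toy barcode lookup: assign a query sequence to the closest reference barcode.
# All sequences below are made up for illustration; they are NOT real CO1 data.


def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def best_match(query, references):
    """Return (name, identity) for the reference most similar to `query`."""
    name = max(references, key=lambda n: percent_identity(query, references[n]))
    return name, percent_identity(query, references[name])


if __name__ == "__main__":
    references = {
        "species_A": "ACGTACGTAC",
        "species_B": "ACGTTCGTAA",
        "species_C": "TTGTACGGAC",
    }
    query = "ACGTTCGTTA"  # one mismatch relative to species_B
    print(best_match(query, references))
```

The point of the sketch is the design choice Hebert’s approach relies on: a single short, standardized locus is compared against a reference library, so identification reduces to a nearest-sequence lookup.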

The advent of high-throughput sequencing technologies applicable to complex mixtures of individually tagged samples then gave rise to “metabarcoding,” about which interested readers can consult many publications for specific topics.

Craig Venter steers his research yacht, Sorcerer II, under the Sydney Harbour Bridge in his quest to collect microbes from the world’s waters. Photo: Dallas Kilponen. Taken from smh.com.au

BTW, among the many pioneering scientific ventures by uber-famous Craig Venter is his Global Ocean Sampling Expedition aboard his research yacht, Sorcerer II. The expedition is a quest to unlock the secrets of the oceans by sampling, sequencing and metabarcoding DNA of all (or most) microorganisms living in these waters.

Lest you think this was a well-intended but unproductive journey—some say junket—by Venter and coworkers, here’s a link to peruse 16 resultant publications that I found by searching PubMed. To watch and listen to Venter talk about this work, you can click here for an educational and entertaining—as usual with Venter—TED Talk on Sampling the Ocean’s DNA that’s had over 550,000 views!

Ocean ‘Dandruff’

Now that we’ve covered the basics of DNA barcoding and metabarcoding, let’s turn back to ocean dandruff. Dandruff, simply put, is dead skin cells. Using dandruff as an intended witty metaphor for ocean eDNA is a bit misleading as marine eDNA is comprised of a complex mixture of cellular matter from scales, feces, decomposing tissue, etc. of fish and all other present or past sea creatures. Consequently, the design and specificity of primers for PCR is of paramount importance for obtaining—let alone interpreting—DNA barcodes based on fragment size or sequence.

As reported by Miya et al., monitoring the occurrence of fish species-specific eDNA PCR fragments (~70–300 bp) has traditionally used conventional electrophoretic gel separation and detection. More recently, qPCR using fluorogenic probes has been employed owing to the method’s sensitivity, specificity and potential to quantify the target DNA. For example, it has been possible to accurately estimate the biomass of common carp in a natural freshwater lagoon from qPCR-measured eDNA concentrations, using the relationship between eDNA concentration and biomass established in aquaria and experimental ponds.

Miya et al. also describe the development of a set of PCR primers for metabarcoding mitochondrial DNA of 880 species of fish. They sampled eDNA from four tanks with known species compositions, prepared dual-indexed libraries and performed paired-end sequencing. Out of the 180 marine fish species contained in the four tanks, they detected 168 species (93.3%) distributed across 59 families and 123 genera. That’s quite an impressive accomplishment.

Ocean Dandruff Case Studies

Since there are so many fish-related applications of DNA barcodes, I’ve selected several recent examples that are indicative of the utility of ocean ‘dandruff’—and are quite interesting, in my opinion. The first case in point exemplifies how eDNA can be used to deal with rare and endangered species, which are either very hard to find or can be dangerously distressed by catching to obtain samples.

Green Sturgeon: Bergman et al. report that a decline in abundance of North American Green Sturgeon located in California’s Central Valley led to its listing as Threatened under the Federal Endangered Species Act in 2006. While visual surveys of spawning by these Green Sturgeon are effective at monitoring fish densities in concentrated pool habitats, results do not scale well (pun intended). By contrast, eDNA provides a relatively quick, inexpensive tool to efficiently identify and monitor Green Sturgeon DNA.

Taken from mthsecology.wikispaces.com

These investigators concluded that follow-on work based on this first-ever eDNA study of Green Sturgeon has the potential to provide better knowledge of the spatial extent of Green Sturgeon spawning that could help identify previously unknown spawning habitats and discover factors influencing habitat usage, guiding future conservation efforts.

Monterey Bay—The second case study, by Port et al., involves taking stock of the marine mammals and fish in Monterey Bay using eDNA and, importantly, comparing the results obtained to those from traditional dive surveys.

In brief, this team of researchers from several universities and the Monterey Bay Aquarium Research Institute found that eDNA assessments picked up almost all the organisms scuba divers spied underwater—plus many more that human eyes missed. Here’s some detail on how they did this.

At each scuba survey location as well as at sites offshore, ~1 gallon of water was sampled several feet above the bottom. Four types of habitats were sampled: sea grass beds, Monterey Bay’s unique “Kelp Forest,” sandy areas and rocky reefs. Onshore, in a “clean” (DNA-free) lab, these water samples were filtered to collect cells containing eDNA for storage at −80 °C until eDNA extraction at a university clean lab. A vertebrate‐specific primer set targeting a small region of the mitochondrial DNA 12S rRNA gene was used for PCR followed by gel purification.

Researchers collecting water in Monterey Bay for eDNA analysis. Courtesy Jesse Port. Taken from mercurynews.com

After quantification, pooled amplicons (each having a sample index sequence) were paired-end sequenced on the Illumina MiSeq platform using a 20% PhiX spike‐in control to improve the quality of low‐diversity samples. The conclusions are worth quoting because—in my opinion—the findings represent a new era in marine biology based on nucleic acid analysis:

“We find spatial concordance between individual species’ eDNA and visual survey trends, and that eDNA is able to distinguish vertebrate community assemblages from habitats separated by as little as ~60 meters. eDNA reliably detected vertebrates with low false‐negative error rates (1/12 taxa) when compared to the surveys, and revealed cryptic species known to occupy the habitats but overlooked by visual methods. This study also presents an explicit accounting of false negatives and positives in metabarcoding data, which illustrate the influence of gene marker selection, replication, contamination, biases impacting eDNA count data and ecology of target species on eDNA detection rates in an open ecosystem.”

Restated more simply, eDNA analysis of the water picked up 11 of the 12 fish and marine mammals that the divers observed, and—importantly—identified 18 additional animals the divers missed! The efficiency and improvement offered by eDNA analysis compared to traditional seek-and-count methods has been echoed in an editorial I found by Hoffmann et al. titled, tongue-in-cheek, Aquatic biodiversity assessment for the lazy.
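
The bookkeeping behind these numbers is a simple set comparison, sketched below in Python. The taxon labels are placeholders, not the species from Port et al.; only the counts (12 taxa seen by divers, 11 of them detected by eDNA, 18 found by eDNA alone) come from the study as described above, and the function name `compare_surveys` is hypothetical.

```python
# Compare a visual dive survey with an eDNA survey using set operations.
# Taxon names are placeholders; only the counts mirror the text above.


def compare_surveys(visual, edna):
    """Summarize the overlap between two sets of detected taxa."""
    shared = visual & edna          # seen by divers and detected by eDNA
    missed_by_edna = visual - edna  # false negatives relative to the dive survey
    edna_only = edna - visual       # taxa the divers did not record
    return {
        "shared": len(shared),
        "missed_by_edna": len(missed_by_edna),
        "edna_only": len(edna_only),
        "false_negative_rate": len(missed_by_edna) / len(visual),
    }


if __name__ == "__main__":
    visual = {f"taxon_{i}" for i in range(1, 13)}    # 12 taxa seen by divers
    edna = {f"taxon_{i}" for i in range(1, 12)}      # 11 of those detected by eDNA
    edna |= {f"extra_{i}" for i in range(1, 19)}     # plus 18 additional taxa
    summary = compare_surveys(visual, edna)
    print(summary)
    print(f"false-negative rate: {summary['false_negative_rate']:.1%}")
```

Note that “false negative” here is defined relative to the dive survey, which is itself imperfect; the 18 eDNA-only taxa show that the visual method has misses of its own.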

Invasive Gobies—The third and final case study deals with detection of invasive, non-native fish to assess whether eDNA can provide a better advanced warning system for detecting these unwanted creatures and implementing eradication steps.

Gobies are an invasive fish species that has colonized freshwaters and brackish waters in Europe and North America. One of them, the round goby (Neogobius melanostomus), pictured below, is among the worst invaders in Europe. Current methods to detect the presence of these gobies are labor-intensive and not very sensitive. Consequently, populations are usually detected only when they have reached high densities and when management or containment efforts are futile.

Taken from animal.memozee.com

To improve monitoring, Swiss and Canadian collaborators developed an assay based on the detection of eDNA in river water, without detecting any native fish species, which is obviously an important assay criterion. The eDNA assay requires less time, equipment, manpower, skills, and financial resources than conventional monitoring methods such as electrofishing, angling or diving. Samples can be taken by novices and the assay can be performed by any molecular biologist on a conventional PCR machine. Therefore, this assay enables environment managers to map invaded areas independently of fishermen’s reports and fish community monitoring.

I could go on and on with examples of utility and the many advantages provided by eDNA for marine biology, but I’m sure you get the picture. I hope that you agree with me that eDNA analysis is a very valuable type of trending nucleic acid-based methodology.

As usual, your thoughts or comments are welcomed.

You and Your Microbiome – Part 3

  • Top 10 Cited Microbiome Publications are Summarized
  • Welcome to the New World-View of “Holobionts”
  • TriLink Products Cited in Numerous Microbiome Publications

It’s been almost two-and-a-half years since posting Part 2 in this series on microbiomes, which I first began in 2013, and the publication rate keeps accelerating, with about 7,000 articles indexed in PubMed in 2016—way more than the mere 35 in 1996. This vast amount of new microbiome information being published annually led me to use the following search strategy to guide my selection of what’s trending in importance for microbiomes.

Basically, I used Google Scholar to search for publications since 2015 that had the term “microbiome” in the title and, among those items found, used the number of citations as a quantitative indicator of interest, importance, and/or impact. But before summarizing my findings for these Top 10 Most Cited Microbiome articles, here’s what you can read in my previous two postings on microbiomes in case you missed them or want to refresh your memory:

Proportion of cells in the human body. You are comprised of much more than what you think you are! Taken from amnh.org

Meet Your Microbiome: The Other Part of You

  • What’s in your microbiome? Why does it matter?
  • Next-generation sequencing is revealing that you and your bacterial microbiome have a biological relationship.

You and Your Microbiome – Part 2

  • Global obesity epidemic is linked to gut microbiome.
  • Investments in microbiome-based therapies are increasing.

Top 10 Cited Microbiome Publications 

The following articles, which were all published in 2015, are listed in decreasing order of the number of citations in Google Scholar. Titles are linked to original documents for interested readers to consult, and synopses represent my attempt to capture essential findings.

1. Precision microbiome reconstitution restores bile acid mediated resistance to Clostridium difficile (369 citations)

C. difficile (From lactobacto.com)

Many antibiotics destroy intestinal microbial communities and increase susceptibility to intestinal pathogens such as Clostridium difficile, which is a major cause of antibiotic-induced diarrhea in hospitalized patients. It was found that Clostridium scindens, a bile acid 7-dehydroxylating intestinal bacterium, is associated with resistance to C. difficile infection and, upon administration as a probiotic, enhances resistance to C. difficile infection.

2. Dynamics and stabilization of the human gut microbiome during the first year of life (298 citations)

Applying metagenomic sequencing analysis on fecal samples from a large cohort of Swedish infants and their mothers, the gut microbiome during the first year of life was characterized to assess the impact of mode of delivery and feeding. In contrast to vaginally delivered infants, the gut microbiota of infants delivered by C-section showed significantly less resemblance to their mothers. Nutrition had a major impact on early microbiota composition and function, with cessation of breast-feeding, rather than introduction of solid food, being required for maturation into an adult-like microbiota.

Graphical abstract by Bäckhed et al. Cell Host & Microbe (2015)

3. Structure and function of the global ocean microbiome (238 citations)

Taken from Sunagawa et al. Science (2015)

Metagenomic sequencing data from 243 ocean samples from 68 locations across the globe was used to generate an ocean microbial reference gene catalog with >40 million novel sequences from viruses, prokaryotes, and picoeukaryotes. This ocean microbial core community has 73% of its abundance shared with the human gut microbiome despite the physicochemical differences between these two ecosystems.

4. Serotonin, tryptophan metabolism and the brain-gut-microbiome axis (200 citations)

Taken from factvsfitness.com

The brain-gut axis is a bidirectional communication system between the central nervous system and the gastrointestinal tract. Serotonin functions as a key neurotransmitter at both terminals of this network. Accumulating evidence points to a critical role for the gut microbiome in regulating normal functioning of this axis. The developing serotonergic system may be vulnerable to differential microbial colonization patterns prior to the emergence of a stable adult-like gut microbiota. At the other extreme of life, the decreased diversity and stability of the gut microbiota may dictate serotonin-related health problems in the elderly. Therapeutic targeting of the gut microbiota might be a viable treatment strategy for serotonin-related brain-gut axis disorders.

5. The dynamics of the human infant gut microbiome in development and in progression toward type 1 diabetes (184 citations)

Taken from dtc.ucsf.edu

Colonization of the fetal and infant gut microbiome results in dynamic changes in diversity, which can impact disease susceptibility. To examine the relationship between human gut microbiome dynamics throughout infancy and type 1 diabetes (T1D), a cohort of 33 infants genetically predisposed to T1D was examined to model trajectories of microbial abundances through infancy. A marked drop in diversity was observed in T1D progressors in the time window between seroconversion and T1D diagnosis, accompanied by spikes in inflammation-favoring organisms, gene functions, and serum and stool metabolites. These trends in the human infant gut microbiome thus distinguish T1D progressors from nonprogressors.

6. The microbiome of uncontacted Amerindians (150 citations)

Taken from robertharding.com

Sequencing of fecal, oral, and skin bacterial samples was used to characterize microbiomes and antibiotic resistance genes (resistome) of members of an isolated Yanomami Amerindian village in the Amazon with no documented previous contact with Western people. These Yanomami harbor a microbiome with the highest diversity of bacteria and genetic functions ever reported in a human group. Despite their isolation, presumably for >11,000 years since their ancestors arrived in South America, and no known exposure to antibiotics, they harbor bacteria that carry functional antibiotic resistance (AR) genes, including those that confer resistance to synthetic antibiotics. These results suggest that westernization significantly affects human microbiome diversity and that functional AR genes appear to be a feature of the human microbiome even in the absence of exposure to commercial antibiotics.

7. Akkermansia muciniphila and improved metabolic health during a dietary intervention in obesity: relationship with gut microbiome richness and ecology (136 citations)

Taken from dtc.ucsf.edu

Individuals with obesity and type 2 diabetes differ from lean and healthy individuals in their abundance of certain gut microbial species and microbial gene richness. This study in humans found that, at baseline, A. muciniphila was inversely related to fasting glucose, waist-to-hip ratio and subcutaneous adipocyte diameter. Subjects with higher gene richness and A. muciniphila abundance exhibited the healthiest metabolic status. Individuals with higher baseline A. muciniphila displayed greater improvement in insulin sensitivity markers and other clinical parameters. A. muciniphila is therefore associated with a healthier metabolic status and better clinical outcomes for overweight/obese adults.

8. Host biology in light of the microbiome: ten principles of holobionts and hologenomes (132 citations)

Today, animals and plants are no longer viewed as autonomous entities, but rather as “holobionts”, composed of the host plus all of its symbiotic microbes. The term “holobiont” refers to symbiotic associations throughout a significant portion of an organism’s lifetime, with the prefix holo- derived from the Greek word holos, meaning whole or entire. Holobiont is now generally used to mean every macrobe and its numerous microbial associates, and the term importantly fills the gap in what to call such assemblages. Symbiotic microbes are fundamental to nearly every aspect of host form, function, and fitness, including traits that once seemed intangible to microbiology: behavior, sociality, and the origin of species. Microbiology thus has a central role in the life sciences, as opposed to a “bit part.”

Taken from researchgate.net

9. The infant nasopharyngeal microbiome impacts severity of lower respiratory infection and risk of asthma development (131 citations)

The nasopharynx (NP) is a reservoir for microbes associated with acute respiratory infections (ARIs). Lung inflammation resulting from ARIs during infancy is linked to asthma development. Examination of the NP microbiome during the first year of life in a cohort of 234 children led to characterization of viral and bacterial communities and documentation of all incidents of ARIs. Most infants were initially colonized with Staphylococcus or Corynebacterium before stable colonization with Alloiococcus or Moraxella. Transient incursions of Streptococcus, Moraxella, or Haemophilus marked virus-associated ARIs. Early asymptomatic colonization with Streptococcus was a strong asthma predictor, and antibiotic usage disrupted asymptomatic colonization patterns.

10. Insights into the role of the microbiome in obesity and type 2 diabetes (128 citations)

Obesity and type 2 diabetes (T2D) are associated with changes in the composition of the intestinal microbiota, and the obese microbiome seems to be more efficient in harvesting energy from the diet. Lean male donor fecal microbiota transplantation (FMT) in males with metabolic syndrome resulted in a significant improvement in insulin sensitivity and increased intestinal microbial diversity, including a distinct increase in butyrate-producing bacterial strains. Such differences in gut microbiota composition might function as early diagnostic markers for the development of T2D. The rapid development of FMTs provides hope for novel therapies in the future.

TriLink Products Cited in Microbiome Publications

It always amazes me to learn about the many ways TriLink products are used in basic and applied science. When I searched Google Scholar for publications containing “TriLink [and] microbiome” I found 21 items, among which the following were selected to illustrate diversity of these product types and uses:

Takeaway Messages

In summary, several takeaways should now be apparent to you. The first takeaway is that there is continuing explosive growth of microbiome publications in all manner of life-related research, as evidenced by both the introductory PubMed graph and wide spectrum of subjects covered by the Top 10 Cited Publications mentioned above.

The second takeaway is best summarized in publication #8 above, “[t]oday, animals and plants are no longer viewed as autonomous entities, but rather as ‘holobionts’, composed of the host plus all of its symbiotic microbes.” Each of us is indeed inextricably comprised of our human cells and symbiotic microbiota in or on us—like it or not, and for better or worse.

The final takeaway is that TriLink products play a contributing role in elucidating and applying this new world-view of holobionts.

As usual, your comments are welcomed.

Impossible Foods and Other Achievements of Pat O. Brown

  • Brown’s Microarray Publications Started a Revolution in DNA/RNA Analysis
  • Open Access Publishing Was an Unintended Consequence of His Microarray Research
  • Brown’s Passion for Bettering Earth Led to Invention of the Plant-Based Impossible Burger

Impossible Foods is a company founded by Patrick “Pat” O. Brown that wants to transform the global food system by inventing foods we love, without compromise. Its first commercial product uses 0% meat and 100% plants to recreate everything (i.e. sights, sounds, aromas, textures and flavors) of a big, juicy burger, aptly named the Impossible Burger. “Impossible” because this was not thought to be doable, as many “veggie” burgers have fallen short on appearance, texture and, importantly, taste.

But before I tell you more about the circumstances and science of this game changer of a burger by Brown and company, let me start with his background as related to nucleic acid research, specifically microarrays.

Pat O. Brown and Microarrays

Pat O. Brown. Taken from Wikipedia.org

Brown received his BS, MD, and PhD degrees all from the University of Chicago, where he worked and published with Nicholas R. Cozzarelli on topoisomerases, which are enzymes that participate in the overwinding or underwinding of DNA. Brown did his postdoctoral research with uber-famous Nobel Laureates J. Michael Bishop and Harold Varmus at the University of California, San Francisco.

Brown went on to become a professor at Stanford University and in 1995 was the first to report (along with his colleagues) the use of microarrays for high-throughput analysis of nucleic acid. This seminal article published in venerable Science magazine and titled Quantitative monitoring of gene expression patterns with a complementary DNA microarray has now been cited more than 11,000 times in Google Scholar.

This publication described general methods for attaching cDNA probes for genes of interest onto glass microscope slides using a high-speed arraying machine (aka robotic printing). These were then hybridized to fluorescently labeled cDNA derived from mRNA by reverse transcription with dNTPs including labeled dCTP akin to dye labeled dNTPs offered by TriLink for such applications. Slides were then fluorescently scanned to obtain “spots” having pseudo-color intensities for quantitation relative to a “spike in” reference gene, as shown below.

Taken from Brown & coworkers Science (1995).

This paper triggered the genesis of what would become a highly competitive microarray industry, which I think of as going from “seeing spots to seeing dollars.” Interested readers can find much information about this in a review by pioneering experts during that time. A brief synopsis of this commercialization involving Brown is as follows.

From Microarrays to Open Access

Taken from plos.org

During his time at Stanford, Brown and his coworkers were using microarrays to generate huge amounts of data on gene expression profiling that required detailed analysis of even larger amounts of information previously published in many different journals. Although many of these journals were available online via a subscription, others were not, and almost all strictly forbade downloading and automated analysis. This thwarted Brown and others from compiling databases for anyone to use as needed. In other words, it prevented Open Access—allowing everyone, everywhere to have unrestricted, free access to this information.

Brown mulled over various ways for researchers to share their data, and in a coffee shop discussion with Harold Varmus, who was then Director of the NIH, they agreed on the possibility of a NIH-hosted computer server where scientists could post their work, and where it would be organized in a systematic way. Shortly thereafter in 1999, Varmus posted on the Director’s website a draft proposal for something that was dubbed e-Biomed.

In 2001, Brown helped lead the Public Library of Science (PLOS) initiative to make published scientific research open access and freely available to researchers in the scientific community. PLOS quickly grew in popularity, as have other Open Access journals, and PLOS now publishes roughly 20,000 papers per year. TriLink researchers—including yours truly—are pleased to be PLOS authors in a November 2016 report titled Small RNA Library Preparation Method for Next-Generation Sequencing Using Chemical Modifications to Prevent Adapter Dimer Formation, which has been viewed more than 2,700 times as of June 2017.

Impossible Foods

Taken from ajitvadakayil.blogspot.com

Apparently pioneering Open Access wasn’t enough for Brown and in 2009 he decided to devote his sabbatical to a daunting—if not impossible—challenge: eliminating conventional meat production from animals, which he estimated to be the world’s largest environmental problem, according to a reported interview. Reducing meat consumption, Brown reasoned, would free up vast amounts of land and water, as well as mitigate climate change due to methane emitted by animals (specifically, 8% of the world’s water and 15% of greenhouse gas emissions, according to one report). In addition, there would be elimination of enormous quantities of chemical fertilizers that are harmful to water systems.

Taken from impossiblefoods

‘All you have to do is make a product that the current consumers of meat and dairy prefer to what they’re getting now,’ Brown said and succeeded in raising $3 million in venture capital seed money. His startup company—aptly named Impossible Foods—then raised $108 million, in a whopper of a deal (pun intended) for development of its initial product—a plant-based burger, also aptly named the Impossible Burger.

The Impossible Burger is made from all-natural ingredients such as wheat, coconut oil and potatoes. What makes this burger unlike all other veggie burgers is an ingredient called heme. Heme is commonly associated with hemoglobin—the red pigment in blood—but is also found in other hemoproteins, including those in plants, albeit in low abundance compared to red meat. Therein lies the part of this story I decided to research: how does Impossible Foods obtain large amounts of plant heme having the desired properties for their burger?

Leghemoglobin taken from web.mst.edu

In an Impossible Foods patent by Brown et al., I found soy leghemoglobin identified as one such exemplary heme, which is pictured right. Plant cells within the nodule produce leghemoglobin to serve as an oxygen carrier to the bacteria within the nodule, similar to hemoglobin in blood. This enables the bacteria to obtain enough oxygen for respiration but ensures that the oxygen is in a bound form so that it cannot harm nitrogen fixing enzymes inside the bacteria. Cutting open a nodule reveals the red color typical of leghemoglobin when it binds oxygen, as seen below.

Impossible Foods biomanufacturing facility. Taken from psmag.com

According to the patent, biosynthetic leghemoglobin was expressed and purified using recombinant DNA technology for protein production, and then shown by SDS-PAGE gel and mass spectrometry to be identical to soybean leghemoglobin isoforms purified from soybean root nodules. Given the advanced state-of-the-art of industrial-scale recombinant production, I assume the Impossible Foods processes pictured right can be scaled-up to reduce cost.

If you think that Brown’s burger probably falls short of what you’d want for a meat substitute, think again. After watching an unbiased—I assume—and rather entertaining video of numerous taste testers (including meat eaters and a life-long vegan) give it positive reviews, I set out to sample an Impossible Burger. It was served as two sliders topped with sun-dried tomatoes, cavolo nero, vegan sun-dried tomato mayonnaise on a poppy seed bun served with chickpea panelle. I found the taste and texture nicely meat-like, but the $16 cost a bit tough to swallow—pun intended.

Meat-Substitutes are Ethically Compelling and Becoming Big Business

Readers interested in the ethically compelling case for developing meat substitutes like the Impossible Burger may be interested in a newspaper story about a for-credit curriculum now offered by the University of California, Berkeley. In that article, I was particularly impressed by the extent of other investments in commercialization of meat-substitutes:

  • In direct competition with Impossible Foods, which has raised a total of $183 million, Beyond Meat (which counts both multibillionaire Bill Gates and meats giant Tyson Foods as investors) sells The Beyond Burger™ as well as other meatless products.
  • In 2014, Pinnacle Foods (Vlasic® pickles, Birds Eye® vegetables) bought meatless food producer Gardein for $154 million.
  • Last year, Monde Nissin (instant noodles, etc.) of the Philippines purchased US-based Quorn for $831 million. Quorn’s fungus-derived mycoprotein can be processed to look and taste like chicken nuggets, sausage or patties.
  • Some startups such as Mosa Meat in Holland and Perfect Day in Berkeley are pushing the genetic engineering toward completely biosynthetic “meat” and “milk,” respectively, as recently reported in The Economist.

In conclusion, I think you’ll agree with me that the aforementioned accomplishments of Pat O. Brown give him good reasons to be smiling so broadly in his picture above. I certainly would be.

As usual, your comments are welcomed.

Postscript

After writing this blog, I read about Memphis Meats in San Leandro, California, which—like Mosa Meat and Perfect Day—has been developing cell culture-based technology to produce “meat.” Focusing on chicken, the company is quoted as saying that ‘the taste and texture is similar to that of the real thing, just a bit spongier.’ While this seems promising, it currently costs around $9,000 to produce a pound of Memphis Meats’ poultry, compared to a bit over $3 for a pound of chicken breast. However, the company hopes to reduce costs drastically and to launch a commercial product in 2021. I hope it does, but I think it won’t.

Curiously Circular RNA (circRNA) Gets Curiouser

  • circRNA Molecules Have, Oddly, No Beginning or End
  • circRNA Are Now Recognized as Regulators of Gene Expression 
  • A Flurry of New Findings Indicate circRNA Are Also Templates for Synthesis of Proteins Having As Yet Unknown Functions

Electron micrograph of ~3,000-nt circRNA. Taken from Matsumoto et al. PNAS (1990).

About a year ago, my blog titled Curiously Circular RNA pointed out that circular RNA (circRNA) in animals are odd molecules in that, unlike the vast majority of other RNA in animals, circRNA have no structural beginning (5’) or end (3’). This very curious feature has, not surprisingly, stimulated considerable scientific interest in knowing more about these molecules, which were serendipitously discovered some 30 years ago.

Application of next-generation sequencing has revealed that circRNA are actually relatively abundant and evolutionarily conserved, which implicates biological importance rather than inconsequential mistakes during RNA splicing mechanisms. Some circRNA have been shown to have function—circRNA can hybridize to complementary microRNA (miRNA), and thus serve as a kind of ‘sponge’ that influences miRNA-based gene expression. Evidence for circRNA involvement in gene expression continues to grow, as there are now >700 items on “circRNA [and] sponges” in Google Scholar.

Very recently published lines of research (that I’ll outline in what follows) implicate circRNA as coding templates for proteins, which heretofore has been exclusively associated with messenger RNA (mRNA). Current dogma holds that translation of mRNA into protein requires recognition of the 7-methylguanylated (m7G) 5’-cap structure to start ribosome binding, while the 3’-poly(A) tail protects the mRNA molecule from enzymatic degradation and aids in stopping translation, as depicted below.

Taken from Shoemaker & Green Nature Structural & Molecular Biology (2012).

Start and stop structural elements characteristic of mRNA are obviously not present in circRNA, which are literally just circles of RNA. Consequently, finding proteins encoded by circRNA has stirred up controversy about whether such proteins are a new and fundamentally important aspect of genetics or just inconsequential biochemical mistakes.

Translation of circRNA in Fly Head Neurons

Fruit fly. Taken from turbosquid.com

Researchers at The Hebrew University of Jerusalem in Israel in collaboration with a team at Max-Delbruck-Center for Molecular Medicine in Berlin, Germany recently reported in Molecular Cell the first compelling evidence that a subset of circRNA is translated in vivo. The study by Kadener & coworkers was carried out using the common fruit fly (Drosophila melanogaster), which is known to have a number of features that lend themselves to investigations of circRNA: (1) >2,500 fruit fly circular RNAs have been rigorously annotated, (2) these mostly derive from back-splicing (pictured below) of protein-coding genes, (3) hundreds of them are conserved across multiple Drosophila species, and (4) they exhibit commonalities with mammalian circRNA.

Direct back-splicing: a branch point in the 5’ intron attacks the splice donor of the 3’ intron. The 3’ splice donor then completes the back-splice by attacking the 5’ splice acceptor forming a circRNA. Taken from Jeck & Sharpless Nature Biotechnol (2014).

This study by Kadener & coworkers involves a plethora of technically complex experimental procedures and associated jargon, from which I’ve extracted what I believe to be some key points to share. After annotating the Drosophila circRNA open reading frames (cORFs), which, by definition, have the potential for translation, they searched for evidence of their translation utilizing previously published ribosome footprinting (RFP). This led to identification of 37 circRNAs with at least one specific RFP read, referred to as ribo-circRNAs.
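To make the idea of a cORF more concrete, here is a minimal Python sketch of how one might scan a circular sequence for open reading frames, including those that read through the back-splice junction. This is not the annotation pipeline used by Kadener & coworkers; the function name, the DNA alphabet, the ATG-only start codon, and the toy sequences are all assumptions made purely for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def circular_orfs(seq, min_codons=2):
    """Hypothetical helper: list ORFs on a circular sequence.

    Position 0 is taken to be the first base after the back-splice junction.
    Returns tuples (start, codons, crosses_junction), where `codons` counts
    the codons before the stop, or is None if no in-frame stop is found after
    a full tour of every reading frame (an "endless" ORF).
    """
    seq = seq.upper()
    n = len(seq)
    if n < 3:
        return []

    def codon_at(pos):
        # Read three bases, wrapping around the circle.
        return "".join(seq[(pos + k) % n] for k in range(3))

    orfs = []
    for start in range(n):
        if codon_at(start) != "ATG":
            continue
        codons = None
        # After n codons (3n bases) we are back at the start in the same frame.
        for i in range(1, n):
            if codon_at(start + 3 * i) in STOP_CODONS:
                codons = i
                break
        if codons is None:
            orfs.append((start, None, True))
        elif codons >= min_codons:
            # The ORF plus its stop codon runs past the last index of the circle.
            crosses = start + 3 * (codons + 1) > n
            orfs.append((start, codons, crosses))
    return orfs

print(circular_orfs("GCCTAAATG"))  # ORF that reads through the junction
print(circular_orfs("ATGGCCTAA"))  # ORF contained within the linear order
```

An ORF flagged as crossing the junction can only be read from the circular molecule, which is why junction-spanning peptides are such useful evidence for circRNA translation.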

Taken from Jeck & Sharpless Nature Biotechnology (2014)

Several representative ribo-circRNAs were then constructed to each have (pictured below) a metallothionein (MT) promoter and V5 tag to facilitate translation and anti-V5 antibody-based detection of the expected protein after transfection into cells.

To determine whether circRNAs are translated in a more relevant tissue, they set up the RFP methodology in fly heads. A genetic locus named mbl that is known to produce a circRNA (circMbl3) at high abundance was selected for targeted mass spectrometry of MBL immunoprecipitated from fly heads. They utilized synthetic peptides to determine characteristic spectra to search for in the fly head immunoprecipitate, and found a consistent and very high confidence hit for a peptide that can only be produced by circMbl3.

Kadener & coworkers extended these fly head findings to mammalian mouse and rat systems, but the most interesting part of this study—in my opinion—dealt with what signals ribosome binding and translation in the absence of the 5’ cap structure present in mRNA. They demonstrated circRNA translation under conditions intended to block normal 5’ cap-dependent translation of mRNA, and concluded that “[untranslated regions] of ribo-circRNAs (cUTRs) allow cap-independent translation [and that] further research is necessary to uncover how these sequences promote translation.”

Remarkably, as you’ll now read, another group of investigators have apparently found how such promotion of circRNA translation can occur.

Translation of circRNA is Driven by N6-Methyladenosine (m6A)

The most abundant modification of RNA in eukaryotes is m6A, which has been recently shown by Li et al. to recruit binding proteins that collectively facilitate the translation of specifically targeted mRNAs—i.e. those “marked” with m6A—through interactions with 40S and 60S ribosome subunit “machinery” that actually carry out translation. Contemporaneously, Yang et al. found that m6A likewise promotes efficient initiation of protein translation from circRNAs in human cells. They discovered that consensus m6A motifs are enriched in circRNAs, and a single m6A site is sufficient to drive translation initiation.

As depicted below, this m6A-driven translation requires initiation factor eIF4G2 and m6A “reader” YTHDF3. Experiments showed that this translation is enhanced by methyltransferase METTL3/14 and inhibited by demethylase FTO, which enzymatically “add” and “subtract” methyl (Me) groups on specific adenosines (A) in circRNAs, respectively. This translation has also been shown to be upregulated upon heat shock, which is a commonly employed method to induce “stress” in cells.

Taken from Yang et al.

Further analyses through polysome profiling, computational prediction and mass spectrometry revealed that m6A-driven translation of circRNAs is widespread, with hundreds of endogenous circRNAs having translation potential. Yang et al. concluded by stating that their “study expands the coding landscape of [the] human transcriptome, and suggests a role of circRNA-derived proteins in cellular responses to environmental stress.”

Zinc Finger Protein in Muscle Cell Development

Finally, and essentially contemporaneously with the two publications mentioned above, a third independent investigation reported by Legnini et al. demonstrated selective circRNA downregulation using short-interfering RNAs (siRNAs). These reagents for RNA interference (RNAi) were used in an image-based functional genetic screen of 25 circRNA species, conserved between mouse and human, which are differentially expressed during myogenesis (i.e. formation of muscular tissue) in Duchenne muscular dystrophy myoblasts.

This siRNA/RNAi-based functional analysis provided one interesting case related to zinc finger protein 609 (circ-ZNF609)—a reported miRNA sponge—the phenotype of which could be specifically attributed to the circular form and not to the linear mRNA counterpart. Consistent with the circ-ZNF609 sequence having an ORF, they found that a fraction of circ-ZNF609 RNA is loaded onto polysomes and that, upon puromycin treatment, it shifted to lighter fractions, similar to mRNAs. The coding ability of this circRNA was proved through use of artificial constructs expressing circular tagged transcripts, and by CRISPR/Cas9—the trendy gene editing method about which I’ve already commented multiple times.

Despite all this evidence, Legnini et al. stated that they “have no hints on the molecular activity of the proteins derived from circ-ZNF609 and as to whether they contribute to modulate or control the activity of the counterpart deriving from the linear mRNA.”

In thinking about closing comments about this update in circRNA, I decided to emphasize that investigations in the field of RNA continue to reveal complexities that will require many more years of global attention to unravel and understand. In just the past decade or so we’ve learned about gene regulation by miRNA/siRNA, reclassification of “junk DNA” as encoding a myriad of long noncoding RNA (lncRNA), mRNA regulation by base-modifications, and curious circRNAs that are more than sponges, and likely encode hundreds (if not thousands) of proteins whose functions have yet to be elucidated. Amazing!

What are your thoughts about all of this?

Your comments are welcomed.

Postscript

After I wrote this blog, Panda et al. at the National Institute on Aging-Intramural Research Program, National Institutes of Health published a paper titled High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs. Here’s a snippet of the abstract, which adds to the increasingly curious occurrence of circRNAs that begs, if you will, further research aimed at discovering functions of circRNA-derived proteins.

“Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions.”

DNA Day 2017

  • There are Now Millions of DNA-Related Publications
  • Some of the Top 5 Cited Papers on DNA Will Surprise You
  • You Probably Won’t Guess Top 5 Most Frequently Cited

Deciding what to post here in recognition of DNA Day 2017 was just as challenging as it has been in past years, primarily because there are so many different perspectives from which to choose. After much mulling, and several abandoned approaches, I settled on featuring DNA publications that have received the most citations, as an objective metric rather than just my subjective opinions about topics I think are significant or otherwise interesting.

Before getting to the numbers of DNA-related papers and some of the most cited papers, here’s a quick recap of what was posted here in the past, starting with the inaugural blog four years ago:

2013—60th Anniversary of the Discovery of DNA’s Double Helix Structure

2014—My Top 3 “Likes” for DNA Day

2015—Celebrating Click Chemistry in Honor of DNA Day

2016—DNA Dreams Do Come True!

Explosive Growth of DNA Publications

Regular readers of my blogs will know that I frequently use the NIH PubMed database of scientific articles to find publications by searching keywords, phrases, or authors. A convenient feature of these searches is providing “results per year” that can be exported into Excel for various purposes. Some preliminary searches indicated that DNA-related articles can be indexed by either DNA or PCR, or cloning, or other terms among which sequencing was notable. The majority, however, were indexed as either DNA or PCR, which together gave nearly 1.7 million items, an astounding number. The true number is even greater since PubMed excludes some important chemistry journals, as well as patents.

Diving deeper into these numbers, I thought it helpful to look at the publication volumes and rates for DNA, sequencing DNA, and PCR through 2015 starting from 1953, 1977, and 1986, respectively. These respective dates correspond to seminal publications by Watson & Crick, Maxam & Gilbert, and Mullis & coworkers. The results shown in the following graph attest to my often stated “power of PCR” as the premier method in nucleic acid research, which we’ll see again below in another numerical context.

Top 5 Cited Papers

During my perusal of the above literature in PubMed generally related to DNA, I thought it would be interesting to find, and share here, which specific papers have the distinction of being most frequently cited. Citations are not available in PubMed, but are compiled in Google Scholar, which led me to these Top 5 that are listed from first to fifth.

Frederick Sanger (1918-2013) Taken from newscientist.com

  1. DNA sequencing with chain-terminating inhibitors

Frederick Sanger, the eponymous father of the “Sanger sequencing” method published in 1977, received the 1980 Nobel Prize in chemistry for this contribution. He also received the 1958 Nobel Prize in chemistry for sequencing insulin, and is the only person to win two Nobel Prizes in chemistry. Uber-famous DNA expert Craig Venter is quoted as saying that ‘Fred Sanger was one of the most important scientists of the 20th century,’ [who] ‘twice changed the direction of the scientific world.’

  2. Analysis of relative gene expression data using real-time quantitative PCR and the 2^−ΔΔCT method

Kenneth J. Livak, PhD
Taken from archive.sciencewatch.com

The most commonly used method to analyze data from real-time, quantitative PCR (RT-qPCR) experiments is relative quantification, which relates the PCR signal of the transcript of interest to that of a control sample such as an untreated control. The derivation, assumptions, and applications of this method were published in 2001 by Livak & Schmittgen. I overlapped with Ken Livak at Applied Biosystems, which pioneered commercialization of RT-qPCR reagents and instrumentation at the time. He is currently Senior Scientific Fellow at Fluidigm Corp.
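For readers who have not used it, the 2^−ΔΔCT calculation itself is short. The sketch below is a minimal Python illustration under the method’s usual assumption that the target and reference amplicons both amplify with close to 100% efficiency; the function name and the CT values are hypothetical and chosen only to show the arithmetic.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCT method (illustrative helper).

    Each dCT normalizes the target gene to a reference gene within a sample;
    ddCT then compares the treated sample with the untreated control.
    Assumes roughly 100% amplification efficiency for both amplicons.
    """
    dct_treated = ct_target_treated - ct_ref_treated
    dct_control = ct_target_control - ct_ref_control
    ddct = dct_treated - dct_control
    return 2 ** (-ddct)

# Hypothetical CT values: the target crosses threshold 2 cycles earlier
# (relative to the reference gene) in the treated sample than in the control.
print(fold_change_ddct(22.0, 18.0, 24.0, 18.0))  # 4.0, i.e. 4-fold higher
```

A result above 1 indicates higher expression in the treated sample, and a result below 1 indicates lower expression.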

Sir Edwin M. Southern Taken from ogt.co.uk

  3. Detection of specific sequences among DNA fragments separated by gel electrophoresis

Sir Edwin Mellor Southern, FRS, the eponymous father of “Southern blotting” DNA fragments from agarose gels to cellulose nitrate filters published in 1975, is a Lasker Award-winning molecular biologist, Emeritus Professor of Biochemistry at the University of Oxford and a fellow of Trinity College. He is also Founder and Chief Scientific Advisor of Oxford Gene Technology.

Prof. Bert Vogelstein, MD Taken from hhmi.org

  4. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity

This paper by Feinberg & Vogelstein, published in 1983, describes how to conveniently radiolabel DNA restriction endonuclease fragments to high specific activity. DNA fragments are purified from agarose gels directly by ethanol precipitation and are then denatured and labeled with the large fragment of DNA polymerase I, using random oligonucleotides as primers. Over 70% of the precursor triphosphate is routinely incorporated into complementary DNA, and specific activities of over 10⁹ dpm/μg of DNA can be obtained using relatively small amounts of precursor. These “oligolabeled” DNA fragments serve as efficient probes in filter hybridization experiments. Vogelstein’s group pioneered the idea that somatic mutations represent uniquely specific biomarkers for cancer patients, leading to the first FDA-approved DNA mutation-based screening tests, and now “liquid biopsies” that evaluate blood samples to obtain information about underlying tumors and their responses to therapy (an area that I’ve touted in previous blogs).

Kary B. Mullis, PhD Taken from TED.com

  5. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase

In 1988, Kary B. Mullis and coworkers (then at Cetus Corp.) published in venerable Science a method using oligonucleotide primers and thermostable DNA polymerase from Thermus aquaticus to amplify genomic DNA segments up to 2000 base pairs to detect a target DNA molecule present only once in a sample of 10⁵ cells. Since that time, polymerase chain reaction (PCR)-related technology has evolved to now routinely enable a variety of single-cell analyses of DNA or RNA. Dr. Mullis received the 1993 Nobel Prize in chemistry for his 1983 invention of PCR, which his website says ‘is hailed as one of the monumental scientific techniques of the twentieth century.’

Top 5 Papers by Citation Frequency

While writing the above section, it occurred to me that ranking these five publications by total number of citations-to-date in Google Scholar doesn’t account for differences in the number of years between the year of publication and now. I did the math to calculate the average citation frequency per year, and here’s the totally surprising—to me—result: relative gene expression methodology published by Livak & Schmittgen is by far the most frequently cited of the Top 5, according to this way of ranking:

  1. 2001, relative gene expression, Cited by 69560 = 4,637 avg. citations per year
  2. 1977, Sanger sequencing, Cited by 32662 = 1,701
  3. 1975, Southern blotting, Cited by 21201 = 796
  4. 1988, PCR, Cited by 18785 = 671
  5. 1983, oligolabeled DNA, Cited by 21200 = 642
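The per-year figures are simply total citations divided by years since publication. The Python sketch below shows that arithmetic under one assumption of mine: that years are counted as 2016 minus the publication year, which reproduces the listed averages for the 2001, 1988, and 1983 entries. The 1977 and 1975 entries are left out because the same division applied to their citation counts as printed gives different averages from those listed. The function name is hypothetical.

```python
def avg_citations_per_year(total_citations, pub_year, as_of_year=2016):
    """Average citations per year since publication (illustrative helper).

    as_of_year=2016 is an assumption inferred from the listed averages.
    """
    years = as_of_year - pub_year
    if years <= 0:
        raise ValueError("as_of_year must be later than pub_year")
    return total_citations / years

# (label, publication year, Google Scholar citations) as listed in the ranking
papers = [
    ("relative gene expression", 2001, 69560),
    ("PCR", 1988, 18785),
    ("oligolabeled DNA", 1983, 21200),
]

for label, year, cites in papers:
    print(f"{year}, {label}: {avg_citations_per_year(cites, year):,.0f} per year")
```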

I should point out that, as transformative methods such as these gradually become widely recognized as “standard procedures,” researchers tend to feel it unnecessary to include a reference to the original publication. Consequently, citation frequency decreases with time even though cumulative usage increases. In other words, 25 years from now average citations per year for relative gene expression will have likely decreased, and be surpassed by a new “method of the decade,” so to speak.

Prediction for the Future

This line of reasoning leads me to close with some speculation about what DNA-related technique might emerge as the next “method of the decade” that tops the above ranking by citation frequency.

My guess is that it will be Multiplex genome engineering using CRISPR/Cas systems by Zhang & coworkers that has been cited by 4145 at the time I’m writing this piece, only four years from its publication in venerable Science in 2013. Some of my blogs have already commented on various aspects of CRISPR/Cas9, which is among genome editing tools offered by TriLink.

As usual, your comments are welcomed.

Autism Awareness Month – April 2017

  • Sequencing for Diagnosis of Autism Holds Promise
  • Several Genetic-Risk Testing Procedures are Available
  • More Than 40 Autism Publications Using TriLink Products

The first National Autism Awareness Month was declared by the Autism Society in April 1970 with the aim of educating the public about autism. Autism is a complex mental condition and developmental disability, characterized by difficulties in the way a person communicates and interacts with other people. Autism can be present from birth or form during early childhood, typically within the first three years. Autism is a lifelong developmental disability with no single known cause.

The puzzle pattern of this ribbon reflects the complexity of autism, while the colors and shapes represent the diversity of people and families living with this spectrum of disorders. Taken from drdiane.com

People with autism are classed as having Autism Spectrum Disorder (ASD) and the terms autism and ASD are often used interchangeably. The term “spectrum” refers to the wide range of symptoms, skills, and levels of disability in functioning that can occur in people with ASD, which includes Asperger syndrome. Some children and adults with ASD are fully able to perform all activities of daily living while others require substantial support to perform basic activities. ASD occurs in every racial and ethnic group, and across all socioeconomic levels. However, boys are significantly more likely to develop ASD than girls. The latest analysis from the U. S. Centers for Disease Control and Prevention (CDC) estimates that 1 in 68 children has ASD.

Taken from myaspergerschild.com

According to the CDC, “diagnosing ASD can be difficult, since there is no medical test, like a blood test, to diagnose the disorders. Doctors look at the child’s behavior and development to make a diagnosis.” More details from the CDC are provided at this link.

Notwithstanding this current difficulty for diagnosis of ASD, research has led to continuing progress toward possible blood tests for ASD, which is the focus of this blog and supplements an earlier posting here on treating autism with a broccoli nutraceutical.

ASD and Exome Sequencing

My Google Scholar search for “autism and sequencing” led to a mindboggling list of more than 47,000 items! When ordered by relevance rather than date of publication, two publications were each cited ~1,000-times following “back-to-back” appearance in venerable Nature magazine in 2012. This computes to a combined average of ~400 citations per year, or a fraction more than one citation per day on average, which to me signals significant attention by the ASD research community and thus worth commenting on herein.
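The back-of-envelope numbers above can be checked in a few lines of Python. The sketch assumes round figures of 1,000 citations per paper and counts five years from 2012 to 2017; both are my simplifying assumptions.

```python
# Assumed round numbers: two papers, ~1,000 citations each, 2012 through 2017.
papers = 2
citations_each = 1000
years = 2017 - 2012

combined_per_year = papers * citations_each / years  # about 400 per year
combined_per_day = combined_per_year / 365           # a bit more than 1 per day

print(f"{combined_per_year:.0f} citations/year, {combined_per_day:.2f} citations/day")
```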

Sanders et al., in the first of these two widely cited studies, carried out exome sequencing in 238 families wherein each pair of parents was unaffected by ASD but had a child who was affected (aka proband), and in 200 of these families there was an unaffected sibling. This study design feature is important in view of the widely held idea that complex personality traits are derived by a combination of “nature and nurture,” i.e. genetics inherited from parents and that which is learned or otherwise acquired by familial and all external events.

Before synopsizing what was found, I should note that germline single-base mutations spontaneously arise during mitosis in every generation, and are termed de novo single nucleotide variants (SNVs). Identifying SNVs remained refractory to analysis at the whole genome or exon level until the advent of next-generation sequencing (NGS) technologies.

Sanders et al. found that the total number of non-synonymous (i.e. changes in the amino acid sequence of proteins) de novo SNVs—particularly highly disruptive nonsense and splice-site de novo mutations—are associated with ASD. They concluded that their results “substantially clarify the genomic architecture of ASD, demonstrate significant association of three genes—SCN2A, KATNAL2 and CHD8—and predict that approximately 25–50 additional ASD-risk genes will be identified as sequencing [more] families is completed.”

Neale et al., in the second widely cited study, likewise conducted exome sequencing but on only 175 ASD probands and their parents. Nevertheless, they found that the proteins encoded by genes that harbored de novo non-synonymous or nonsense mutations showed a higher degree of connectivity among themselves and with previous ASD genes as indexed by protein-protein interaction screens. They concluded that their results “support polygenic models in which spontaneous coding mutations in any of a large number of genes increases risk by 5- to 20-fold,” but did acknowledge the strong evidence reported by Sanders et al. for individual genes as risk factors.

ASD Genetic-Risk Testing

The American Academy of Pediatrics (AAP) in 2013 issued a statement on ethical and policy issues for genetic screening of children for ASD that was prompted in part due to then recent progress by IntegraGen—a small French genomics company—on development of a gene test that uses a cheek swab to screen infants and toddlers for 65 genetic markers associated with autism. Highlights of the AAP’s statement include:

  • Genetic screening can be particularly useful for diagnosing older babies and children with developmental disorders such as autism.
  • Genetic screening should be made available for all newborns. However, parents should have the right to refuse screening after being informed of the benefits and risks.
  • The decision to offer testing or screening should be based primarily on the best interest of the child.

Taken from autismspeaks.org

By way of an update, I’m pleased to add that IntegraGen announced in a 2015 press release that its ARISk® Test became the first test marketed in the U.S. to assess the risk of autism spectrum disorder in children. Among the following IntegraGen statements about the ARISk® Test, I think the caveats are the most important to note:

  • The test does not confirm or rule out a diagnosis of ASD for the child tested.
  • The test is intended to be used together with a clinical evaluation and other developmental screening tools.
  • Intended for children with early signs of developmental delay or ASD and in children who have older siblings previously diagnosed with an autism spectrum disorder.
  • A genetic score, based on the total number of genetic markers associated with autism identified, is used to estimate the child’s risk of developing ASD.
  • Intended for use for children 48 months and younger. The ARISk® Test is not available for prenatal testing.

Taken from integragen.com

More recently, Courtagen—cofounded by my former Life Technologies colleague Kevin McKernan (coinventor of SOLiD® NGS)—has commercialized its sequencing analyses for ASD and other neurodevelopmental conditions. According to a Courtagen posting, “[i]n the absence of a known single-gene disorder, ASD likely involves a complex combination of both genetic and environmental factors that influence early brain development. Multi-gene panels, such as Courtagen’s devSEEK® panels, provide clinicians with information on a number of genes commonly associated with ASD and autistic features. Clinicians can then use information from multi-gene panels to tailor treatments that meet the patient’s unique genotype and symptoms.”

Some interesting—to me—logistical and operational information about devSEEK® (237 genes) is as follows:

  • Turn-around time for results is 4-6 weeks.
  • DNA for sequencing is extracted from a single saliva sample. No blood draw or muscle biopsy required; however, blood and muscle tissue are accepted.
  • Courtagen works with patients, physicians, and insurance carriers to pre-approve each test. Courtagen will bill the insurance company and is willing to handle an appeal process as needed.
  • A secure physician online portal is available for ordering genetic tests and accessing patient reports when completed. Genetic counselors are available to address questions regarding Courtagen test results.

ASD Research and TriLink

While mulling over how to conclude this Autism Awareness Month blog featuring genetic testing for ASD, I wondered about TriLink’s role in advancing autism research by virtue of its various nucleic acid-related products being used for autism investigations. I was pleased and proud to find more than 40 items by searching Google Scholar for articles with the words “autism and TriLink.”

Perusal of these items revealed that the most cited (450 times) report was a 2012 publication in the highly regarded journal Cell titled MeCP2 Binds to 5hmC Enriched within Active Genes and Accessible Chromatin in the Nervous System, which used TriLink 5-methyl-2′-deoxycytidine-5′-triphosphate (5m-dCTP). Given the apparent significance of this publication, I won’t try to give a short, simplified synopsis but rather quote the following part of the authors’ summary:

“We report that 5hmC [5-hydroxymethylcytosine] is enriched in active genes and that, surprisingly, strong depletion of 5mC [5-methylcytosine] is observed over these regions. The contribution of these epigenetic marks to gene expression depends critically on cell type. We identify methyl-CpG-binding protein 2 (MeCP2) as the major 5hmC-binding protein in the brain and demonstrate that MeCP2 binds 5hmC- and 5mC-containing DNA with similar high affinities. The Rett-syndrome-causing mutation R133C preferentially inhibits 5hmC binding. These findings support a model in which 5hmC and MeCP2 constitute a cell-specific epigenetic mechanism for regulation of chromatin structure and gene expression.”

I also noted a 2016 Cutting-Edge Review in Arteriosclerosis, Thrombosis, and Vascular Biology titled A CRISPR Path to Engineering New Genetic Mouse Models. These investigators utilized TriLink Cas9 mRNA for gene editing analogous to that reported by others for CRISPR/Cas9-mediated knockout of the autism gene CHD8 (see above). This led to transcriptomic profiling showing that CHD8 regulates multiple genes implicated in ASD pathogenesis and genes associated with brain volume.

In conclusion, I must say that I learned much new information about autism while researching this blog, which I hope you found informative as well as interesting. If so, I have achieved my goal of either increasing or reaffirming your awareness of autism, and the availability of genetic risk-assessment tests.

As usual, your comments here are welcomed.

Postscript

Recently, a team of academic researchers in Arizona made headlines with their publication in Microbiome reporting ties between autism symptoms and the composition and diversity of a person’s gut microbes, aka “gut microbiome,” about which I’ve commented in several previous blogs.

The participants, who were 18 children with ASD (ages 7–16 years), underwent a 10-week treatment program involving antibiotics, a bowel cleanse, and daily fecal microbial transplants over 8 weeks. Remarkably, the new therapy seemed to provide some long-term benefits, including an 80% improvement of gastrointestinal symptoms associated with ASD and roughly a 20–25% improvement in autism behaviors, including improved social skills and better sleep habits.

Click here for a simplified, educational video on this work by the principal investigator, Prof. James B. Adams at Arizona State University.

I should emphasize that this is a very small study, and much more research will be needed to verify and firmly establish possible benefits and risks. Interested readers should contact Prof. Adams regarding any questions they might have.

Evolving Polymerases to Do the Impossible

  • Polymerases Aren’t What They Used to Be! 
  • Scripps Team Evolves Polymerases That Read and Write With 2’-O-Methyl Ribonucleotides
  • Key Reagents for Romesberg’s “Molecular Moonshots” Are Supplied by TriLink BioTechnologies

Long-time devotees of these posts will likely remember a blog several years ago about Prof. Floyd Romesberg of the Department of Chemistry at The Scripps Research Institute, who achieved a seemingly impossible feat: designing a new pair of complementary bases such that DNA replicating in E. coli would comprise six bases, thereby creating a six-base genetic code expanded from Nature’s four-base code.

Floyd E. Romesberg. Taken from utsandiego.com

More recently, Romesberg has cleverly outfoxed Nature once again, this time by evolving nucleic acid polymerases into mutant polymerases that can do what heretofore seemed impossible. He and his research team’s publication (Chen et al.) is a tour de force of experimental methodology that is not easily read, and is even harder to simply summarize in a short space like this blog. Consequently, I’ll first tell you what was accomplished, then give a short synopsis of principal new methodology, and close by commenting on the significance of this fascinating work.

Doing the Impossible

Romesberg’s lab successfully achieved what I think of as “multiple molecular moonshots,” wherein a Taq polymerase (which normally reads and writes DNA during PCR), was evolved by novel selection (SELEX) methods into mutant polymerases that are able to transcribe DNA into 2’-O-methyl (2’-OMe) RNA, and reverse transcribe 2’-OMe RNA into DNA for PCR/sequencing.

As depicted below, this was exemplified using a 60-mer DNA template and 18-mer 2’-OMe RNA primer to produce a fully modified 48-mer 2’-OMe RNA by means of an evolved mPol and all four A, G, C and U 2’-OMe NTPs, which I’m proud to say were bought from TriLink BioTechnologies! This type of molecular evolution of a polymerase has no precedent.

DNA template   5’ ————————————- 3’

RNA primer                                 ←←← 3’ xxxxxx 5’

mPol ↓ 2’-OMe NTPs

The fidelity of this seemingly impossible molecular transformation was determined by achieving a feat of comparable impossibility! As depicted below, the aforementioned 48-mer 2’-OMe RNA product was hybridized to a DNA primer for reverse transcription into a 48-mer complementary DNA (cDNA) strand, using an evolved mPol, together with all four A, G, C and T unmodified dNTPs, which were also purchased from TriLink. This unprecedented conversion of 2’-OMe RNA into cDNA was followed by conventional PCR/sequencing, the results of which demonstrated relatively high fidelity.

2’-OMe template   5’ xxxxxxxxxxxxxxxxxxxxxxxxxxx 3’

DNA primer                                           ←←← 3’ —— 5’

mPol ↓ dNTPs

cDNA                        3’ ————————————– 5’

How They Did It

In the selection cycle shown below, (1) phage-display libraries were used to expose individual polymerases (Pol) on E. coli cells in proximity to chemically attached primer/template complexes of interest, which were mixed with natural or modified triphosphates including biotin (green; B)-labelled UTP to extend the primer. (2) Phage that display active mutant polymerases (mPols) are isolated with streptavidin (SA) beads. After washing to remove nonspecific binders, phage cleaved from the beads are used to re-infect E. coli. (3) Heat-treated lysates of E. coli that express the recovered mPols are next subjected to plate-based screening using 96-well plates coated with primer/template complex and extension buffer containing natural or modified triphosphates and B-UTP, incorporation of which is chromogenically detected. (4) Mutants that give rise to the most activity are selected for individual gel-based analysis, from which (5) promising candidates are selected for further diversification (e.g., by gene shuffling, as depicted) and then subjected to additional rounds of evolution.

Taken from Chen et al. Nature Chemistry (2017)
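
For readers who think in code, the select-and-diversify logic of this cycle can be caricatured in a few lines of Python. This is only a toy sketch under loud assumptions: each polymerase is reduced to a single made-up "activity" number, selection and screening are collapsed into ranking by that number, and diversification is modeled as random Gaussian noise rather than gene shuffling. None of the names or parameters come from Chen et al.

```python
import random

def evolve(population, rounds, keep=4, offspring_per_parent=5, seed=0):
    """Toy directed evolution: rank by activity, keep the best, diversify, repeat."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_per_round = []
    for _ in range(rounds):
        # Steps 1-4 of the cycle, reduced to ranking variants by activity score.
        parents = sorted(population, reverse=True)[:keep]
        best_per_round.append(parents[0])
        # Step 5: diversification, modeled here as small random changes (assumption).
        children = [
            max(0.0, parent + rng.gauss(0, 0.1))
            for parent in parents
            for _ in range(offspring_per_parent)
        ]
        # Parents are carried forward, so the best activity can never decrease.
        population = parents + children
    return best_per_round

start = [0.1, 0.2, 0.15, 0.05, 0.3, 0.25]  # hypothetical starting activities
history = evolve(start, rounds=5)
print(history)
```

Carrying the parents into the next round is a design choice made for this sketch; it guarantees that the best activity seen so far is never lost between rounds.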

What is the Significance?

In a previous blog, I’ve commented on increasing interest in the utility of aptamers, which are oligonucleotides that can specifically bind small molecules or motifs in proteins, and thus be used to build electronic sensors or studied as potential therapeutic agents rivaling antibodies. Therapeutic aptamers, like antisense oligonucleotides, require incorporation of chemical modifications to impart stability toward nucleases in blood or cellular targets.

Burmeister et al. have previously reported methods for mPol transcription of a DNA template into a fully modified, nuclease-resistant 23-mer 2’-OMe RNA aptamer—also using TriLink’s 2’-OMe NTPs! However, they encountered considerable experimental difficulties in generating this therapeutically promising 23-mer against vascular endothelial growth factor. These technical issues have now been surmounted by the mPol-evolution approaches in the present work by Romesberg’s team, which enabled improved access to longer 2’-OMe RNA aptamers with reasonable efficiency and fidelity.

Moreover, the present study is the first to evolve an mPol for reverse transcription of fully modified 2’-OMe RNA into DNA, which can then be amplified by PCR and/or sequenced, thereby opening the door for a variety of new analytical methods. Most importantly, the molecular mechanism by which these remarkable mPol activities were evolved, namely, the stabilization of an interaction between the “thumb and fingers domains,” may be general and thus useful for the optimization of other Pols. In that case, we can look forward to further advances in evolving other Pols to do the impossible, hopefully using modified nucleotide triphosphates from TriLink!

As usual, your comments are welcome.

CRISPR-Mediated Interference (CRISPRi) of Long Non-Coding RNA (lncRNA)

  • More Methodology from CRISPR Mania
  • lncRNA Function Blocked by CRISPRi
  • Mysteries of lncRNA Can Now be Deciphered by CRISPRi

This blog is about yet another example of a powerful new methodology spawned by intense scientific interest in using CRISPR-related technologies. This near mania for all things CRISPR is reflected by there being ~5,000 (!) publications already in PubMed only ~5 years after seminal papers appeared.

I chose the present blog topic because it involves use of CRISPR for genome-wide identification of functional long non-coding RNA (lncRNA) in human cells. In an earlier blog about lncRNA, which are now recognized to be regulators of gene expression encoded by what was originally defined as “junk” DNA, it was pointed out that it is inherently difficult to experimentally identify such regulation by lncRNA. Thanks to CRISPR this task is now much less daunting as you’ll learn below, following a couple of introductory sections to set the stage.

Repurposing CRISPR/Cas9 Using “Dead” Cas9

Qi et al. very cleverly (at least to me) recognized that the CRISPR/Cas9 system could be repurposed as an RNA-guided platform for sequence-specific control of gene expression by finding a catalytically inactive mutant Cas9 protein that lacked the endonuclease (i.e. cutting) activity of wild-type Cas9, and instead blocked transcription by RNA polymerase (RNAP), as depicted below. These researchers termed the overall process “CRISPR interference” (CRISPRi) and loosely referred to such a mutant Cas9 as “dead” Cas9 (dCas9).

Taken from Qi et al.

Interested readers are encouraged to consult this publication by Qi et al. to fully appreciate the extensive amount of work that went into translating the above concept into practice, and supporting the proposed mechanism of action. In my opinion, it’s a tour de force example of applying hypothesis-driven, state-of-the-art molecular biology to devise a new method, in this case specifically blocking transcription of a DNA region using CRISPRi in conjunction with target-specific short guide RNA (sgRNA).

Adding Functionality to Down-Regulate Transcription

Just as organic chemists can design and synthesize small molecules having desired functional properties, molecular biologists can design and produce complex macromolecules having desired functional elements. The latter is nicely exemplified by Gilbert et al., who demonstrated that fusion of dCas9 to transcription factor effector domains having repressive regulatory functions enables efficient transcriptional repression in human (or yeast) cells via sgRNA that target genes of interest.

Taken from Liu et al. (2017)

As depicted below, Gilbert et al. used dCas9 fused to the Krüppel associated box (KRAB) domain, which is a transcriptional repression domain, and Green Fluorescent Protein (GFP) as a reporter gene targeted by sgRNA. They employed RNA-sequencing to quantify the transcriptome of GFP-positive HEK293 cells expressing dCas9-KRAB or a negative control construct. It was shown that CRISPRi is highly specific, as GFP was the only gene that was significantly suppressed by GFP-targeting sgRNA. Averaged data from two independent biological replicates indicated that no gene other than GFP changes by >1.5-fold.

Genome-Scale CRISPRi to Identify Human lncRNA

According to Liu et al., it has not been possible to predict which lncRNA loci are functional or what function they perform. Consequently, there is a need for large-scale, systematic approaches to interrogating the functional contribution of lncRNA loci. This sizeable team of collaborators from various institutions in the San Francisco Bay area, therefore, developed a genome-scale screening platform using CRISPRi with dCas9-KRAB and a library of sgRNA.

Taken from Liu et al. (2017)

As depicted below for the overall approach, they first designed a CRISPRi Non-Coding Library, which targets 16,401 lncRNA genes each with 10 sgRNAs per transcription start site. The required 170,262 sgRNAs were not synthesized chemically, but rather produced intracellularly by first using array-based sgDNA synthesis followed by clonal (i.e. individual sgDNA sequence) incorporation into lentivirus, which in turn was used to transduce seven types of cells for screening. More detail on such lentiviral libraries is given in a Footnote at the end of this blog.

As indicated pictorially above, they applied this pooled screening approach to identify lncRNA genes that modify robust cell growth for induced pluripotent stem cells (iPSC) and six well-known, transformed human cell lines (K562, U87, etc.). This led to identification of 499 lncRNA loci that modified cell growth upon CRISPRi targeting.
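
The general logic of a pooled growth screen can be sketched in a few lines of Python. To be clear, this is my own simplified illustration and not the analysis pipeline of Liu et al.: the sgRNA names, read counts, pseudocount, and hit threshold are all hypothetical. The idea is that each sgRNA is counted by sequencing before and after cell growth, and a gene whose sgRNAs become depleted or enriched is a candidate growth modifier.

```python
import math
from statistics import median

def gene_scores(counts_start, counts_end, guide_to_gene, pseudocount=1):
    """Median log2 fold change in sgRNA abundance for each targeted gene."""
    total_start = sum(counts_start.values())
    total_end = sum(counts_end.values())
    per_gene = {}
    for guide, gene in guide_to_gene.items():
        # Convert counts to fractions of the pool so sequencing depth cancels out.
        f_start = (counts_start.get(guide, 0) + pseudocount) / total_start
        f_end = (counts_end.get(guide, 0) + pseudocount) / total_end
        per_gene.setdefault(gene, []).append(math.log2(f_end / f_start))
    return {gene: median(values) for gene, values in per_gene.items()}

# Made-up data: two sgRNAs per gene, counted before and after growth.
guide_to_gene = {"g1": "lncA", "g2": "lncA", "g3": "lncB", "g4": "lncB"}
counts_start = {"g1": 99, "g2": 99, "g3": 99, "g4": 99}
counts_end = {"g1": 24, "g2": 24, "g3": 99, "g4": 99}

scores = gene_scores(counts_start, counts_end, guide_to_gene)
# Arbitrary threshold (assumption): call a hit if abundance changes more than 2-fold.
hits = sorted(gene for gene, score in scores.items() if abs(score) > 1)
print(scores, hits)
```

Using the median across sgRNAs for a gene is one simple way to limit the influence of a single misbehaving sgRNA; real screens use more careful statistics and negative-control sgRNAs.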

Interestingly—at least to me—372 (~75%) and 299 (~60%) of these 499 growth-modifying lncRNA loci were distal to a protein coding gene (PCG) or mapped enhancer, respectively. The diagram below, taken from a review by Vance & Ponting, depicts “distal” effects of lncRNA away from PCG between two chromosomes (chr). What “triggers” transcription of the lncRNA from chr A and how it “finds” its cognate PCG on chr B are open and indeed intriguing questions.

Taken from Vance & Ponting (2014)

In addition to these high percentages of distal effects, Liu et al. found the following surprising results with regard to cell-type specificity of lncRNA function:

“Remarkably, 89% of the lncRNA gene hits modified growth in just one of the cell lines tested, and no hits were common to all seven cell lines. Although nearly all of the hit genes were expressed in the cell line in which they exhibited a growth phenotype, expression alone was insufficient to explain the cell type specificity of their function.”

“[Thus,] in contrast to recent studies that found that essential protein-coding genes typically are required across a broad range of cell types, we show that lncRNA function is highly cell type-specific, a finding that has important implications for their involvement in both normal biology and disease.”

Following are some of the major unanswered questions about lncRNA posed in a review I recommend reading for more background on lncRNA:

  • How does the manner in which lncRNAs are transcribed, processed, and regulated differ from that of other RNAs?
  • Are lncRNAs evolutionarily conserved, both in terms of their primary sequences and secondary structures?
  • Are all lncRNAs functional? Which ones have detectable biological functions in cells or in the whole organism?
  • Does the pervasive transcription that generates the lncRNA transcripts play a regulatory role distinct from the steady-state accumulation of the lncRNAs?
  • Can lncRNAs be exploited for clinical applications and therapeutics?

After reading this review, I thought to myself that there are many open questions about lncRNA but no comprehensive answers yet deciphered. When I then checked Google Scholar for items with both “deciphered” and “lncRNA” as terms, I found there were over 1,800 such items. Evidently, there are quite a few authors who, like me, view unknown functions of lncRNA as a cipher. I suspect that much of the now mysterious lncRNA function will eventually be deciphered thanks, in part, to the power of CRISPRi.

Your comments are welcomed.

Footnote

Readers interested in lentiviral sgRNA library construction and use for screening target cells can find general information at this website from which the following self-explanatory schematic provides a high-level overview of the workflow.

Taken from cellecta.com

Nanopore Sequencing by Synthesis (Seq-by-Syn)

  • Yet Another Notable Achievement Involving George Church, ‘The Most Interesting Scientist in the World’ 
  • Team of 30 Coauthors Reports Seq-by-Syn with DNA Polymerase-Nanopore Protein Construct on an Integrated Chip
  • Challenging Improvements Needed for Commercial Reality

Prof. George M. Church. Taken from evolutionnews.org

Devotees of my blog will know that I’m prone to word play such as calling myself a “huge” fan of “tiny” nanopores for DNA sequencing, about which I’ve previously opined. They will also recall that I’m an admitted scientific admirer of George Church, who I think is The Most Interesting Scientist in the World.

Having said this, it’s not surprising that I closely follow what’s trending in nanopore sequencing, and also make an attempt to read all of Church’s papers as they get published because they are almost invariably quite interesting, involve “big ideas,” and in some new way are very educational, at least for me. Following are my comments about a recently published paper on nanopore sequencing in the venerable Proceedings of the National Academy of Sciences of the United States of America (aka PNAS) wherein Church is the designated corresponding author.

Backstory

The seminal origins and early history of nanopore sequencing have been recently chronicled and criticized, then clarified, in Nature Biotech in several “To the Editor” items, which collectively provide enlightening insights into who did what when, so to speak. Those of us who are ‘Nanoporati’ (a clever term tweeted by Nick Loman) should definitely read those Nature Biotech items. For now, however, I’ll set the stage, as it were, by echoing a bit of what I’ve posted in the past for nanopores.

Nanopore sequencing of DNA is actually a relatively old (~20 year) idea, posited in patented but prophetic (i.e. no data) methods by Church and other creative visionaries. On the other hand, nanopore sequencing was first reduced to practice commercially not too long ago by Oxford Nanopore Technologies (ONT). The many years of delay between concept and commercialization were due to the need for gradual evolution of lots of “nanopore-ology” and sequencing biochemistry, as well as developing highly sophisticated electronics and complex algorithms for data analysis.

Nanopore Sequencing-by-Scanning (Seq-by-Scan)

Taken from rsc.org

As depicted below, and as can be best seen in a video, ONT’s commercially available MinION Seq-by-Scan system essentially involves threading a strand of DNA through a protein-based nanopore and converting resultant ionic current fluctuations into nucleotide base sequence.

While there are issues with base-calling accuracy, the remarkably small and readily portable MinION provides fast, real-time sequencing results for a wide variety of applications. These include unique or otherwise compelling Point-of-Care analyses, such as pathogen surveillance, which has been achieved in remote geographical locations and even in outer space aboard the International Space Station, as I’ve previously posted.

Nanopore Seq-by-Syn

In contrast to DNA Seq-by-Scan using a nanopore, which is challenged by pore-based differentiation of similarly sized A, G, C, and T bases, DNA Seq-by-Syn has no such limitation as it uses the DNA as a template for base-by-base (i.e. stepwise) detection of enzymatic synthesis of complementary DNA. Various Seq-by-Syn methods and challenges have been discussed elsewhere, and currently available commercial systems include those from Illumina and PacBio. The former employs nucleotides that are reversible terminators equipped with cleavable fluorescent “tags” on each base. The latter detects fluorescently labeled tags on polyphosphates released upon nucleotide incorporation.

The presently featured DNA Seq-by-Syn publication by Stranges et al., which builds upon two earlier reports cited therein, differs from the above approaches by using nanopore-based detection of mass tags rather than fluorescent tags. In principle, mass tags could afford higher accuracy compared to DNA Seq-by-Scan. However, as will now be explained, achieving improved accuracy is far easier said than done.

The general approach taken to demonstrate proof-of-concept for mass-tagged nanopore DNA Seq-by-Syn is depicted below in simplified cartoon form, but involves a true tour de force—in my opinion—of three key technologies. The first is design and synthesis of the nucleotides with appropriate mass tags, which involves very sophisticated chemistry that is best appreciated by reading detailed, extensive supporting information (SI) for Stranges et al. and SI for an earlier publication by Fuller et al. In a nutshell, these nucleotides have 5’-hexaphosphates linked to relatively large mass tags comprised of complex oligonucleotide structures.

Taken from Stranges et al. PNAS 2016

The second area of technical innovation involves attachment of a single molecule of ϕ29 DNA polymerase to each α-hemolysin (αHL) nanopore in such a manner as to retain its enzyme activity and be positioned such that every released mass tag transits through (i.e., is “captured” by) the nanopore leading to base identification by its current signature. As depicted below in two related representations, each of these heteroheptameric pores is comprised of one modified αHL subunit to which a peptidyl SpyTag moiety is attached, and six unmodified αHL subunits. This allows attachment of one ϕ29 DNA polymerase molecule modified with a cognate peptidyl SpyCatcher moiety at a predetermined, time-average distance from the pore.

Taken from Stranges et al. PNAS 2016.

The third key area of innovation deals with insertion of the enzyme-pore conjugate into a lipid bilayer residing on a silanized array (aka chip) of 256 Ag/AgCl electrodes such that there is one functional pore per electrode. Interested readers are encouraged to consult the publication for details, as well as check out related fabrication and methods patents that I found by searching Google Scholar.

Representative Results

The first image shown above depicts what base tag-specific detection would ideally look like if each of the four different bases would have a characteristic current-blockage intensity and persistence. In addition, all pores would ideally function similarly. Not surprisingly, given the stochastic nature of single-molecule systems in general, Stranges et al. found less than ideal behavior.

For example, out of 70 single pores obtained, 25 captured two or more tags, whereas only six of those pores showed detectable captures of all four tagged nucleotides. Data obtained for the pore with the most transitions between tag capture levels (i.e. the best results) is shown below, while results for the other five are given in the SI.

Taken from Stranges et al. PNAS 2016

To quote the authors:

“All four characteristic current levels for the tags and transitions between them can be readily distinguished…Homopolymer sequences in the template, and repeated, high-frequency tag capture events of the same nucleotide in the raw sequencing reads were considered a single base for sequence alignment. We recognized 12 clear sequence transitions in a 20-s period. Out of the 12 base transitions observed in the data, 85% match the template strand, showing that this method can produce results that closely align to the template sequence.” 

Interested readers need to consult and carefully read the SI for Stranges et al. regarding the interpretation of the “repeated, high frequency capture events,” such as that exhibited by C in the above current vs. time plot.
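
The collapsing step described in the quote above is easy to picture in code. The following Python sketch is my own illustration under stated assumptions, not the authors’ analysis: the sequences are made up, and a simple position-by-position comparison stands in for a real sequence alignment.

```python
from itertools import groupby

def collapse_repeats(seq):
    """Collapse runs of the same base into a single base, e.g. 'AACCCT' -> 'ACT'."""
    return "".join(base for base, _ in groupby(seq))

def fraction_matching(read, template):
    """Fraction of collapsed positions that agree, compared position by position."""
    r, t = collapse_repeats(read), collapse_repeats(template)
    n = min(len(r), len(t))
    if n == 0:
        return 0.0
    matches = sum(a == b for a, b in zip(r, t))
    return matches / n

template = "AAGGCTTTA"        # hypothetical template with homopolymer runs
clean_read = "AGGGCCCCTTA"    # repeated captures of the same tag, no errors
noisy_read = "AGCCGTA"        # contains one spurious G capture

print(collapse_repeats(template))               # AGCTA
print(fraction_matching(clean_read, template))  # 1.0
print(fraction_matching(noisy_read, template))  # 0.6
```

Note the price of this simplification: once runs are collapsed, the length of a homopolymer in the template can no longer be recovered from the read.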

All of the above snippets in aggregate suggest to me that, while this huge amount of work has made progress toward one approach to Seq-by-Syn, many improvements will need to be made before achieving a robust system to successfully compete in the commercial sector.

Authorship, Affiliations, and Acknowledgments

The relatively large team of 30 coauthors listed for Stranges et al. includes the following numbers of investigators and affiliations: 1 at Arizona State Univ., 4 at Harvard, 11 at Columbia University, and 14 at Genia Technologies, which is a Santa Clara, CA company that was acquired by Roche in 2014, and is part of Roche Sequencing.

Acknowledgments in Stranges et al. refer to support by Genia and NIH Grant R01 HG007415, which I found was awarded to coauthors George M. Church (Harvard), Jingyue Ju (Columbia), and James J. Russo (Columbia). The end of the abstract of this grant reads as follows:

“The nanopore chips will be enhanced and expanded from the current 260 nanopores to over 125,000 using advanced nanofabrication techniques. We will conduct real-time single molecule Nano-SBS on DNA templates with known sequences to test and optimize the overall system. These research and development efforts will lay the foundation for the production of a commercial single molecule electronic DNA sequencing platform, which will enable routine use of sequencing for medical diagnostics and personalized medicine.”

The conflict of interest statement in Stranges et al. indicates that the technology described therein (called “Nanopore SBS”) has been exclusively licensed by Genia, and that specified coauthors are entitled to royalties through this license. In addition, Church is a member of the Scientific Advisory Board of Genia.

Parting Comments

Long gone are the days when government-funded academic researchers thumbed their noses, if you will, at commercial development. Nowadays almost all academics parlay their government grants into university patents that get licensed to companies, usually with some type of corporate involvement of said academics.

I hasten to add that I’m not implying that NIH-funded academic research being a “seed” for corporate profitability is negative, especially in view of its Small Business Innovation Research (SBIR) program, but rather view it as a paradigm shift for the better, as it allows academic creativity to be harnessed into applications that can hopefully greatly benefit society.

In conclusion, and coming back to George Church, whom I highlighted in the introduction to this blog, I must say that he might very well be the academic researcher with the longest list of technology transfer, advisory roles, and founded companies (13 to date) according to a public list that is truly mind boggling, at least to me.

As usual, your comments are welcomed.

Postscript

After I wrote this blog, Roche announced on December 15, 2016 that “it has officially notified Pacific Bioscience (PacBio) of its intention to terminate its [2013] agreement and efforts to develop a sequencing instrument for use in the clinical research and clinical market using their Single Molecule, Real-Time (SMRT®) technology,” about which I have commented previously. The announcement went on to say Roche would instead “focus on internal development efforts” and “actively pursue multiple technologies and commercial strategies.” A GenomeWeb headline was more specific: “Roche Will Focus on Genia’s Nanopore Technology for Dx Market After Ending Deal With PacBio.”

On December 30, 2016 it was reported that the University of California (UC) filed a patent suit against the Chief Technology Officer (CTO) at Genia, and Genia Technologies, claiming the CTO produced key inventions during his time at UC that he later assigned to Genia, but which should have automatically been assigned to UC. Stay tuned…