- Search of Unusual Microbes Yields New CRISPR-Cas Systems
- Tiny Life Forms Have Smallest Working CRISPR-Cas Systems
- Novel CasX Structure and Mechanism Characterized by Cryo-Electron Microscopy
In 2012, a Science magazine publication by Doudna, Charpentier, and coworkers described Cas9, the CRISPR-associated (Cas) protein, as a programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. This work has already been cited ~6,500 times, alongside two other Cas9 studies listed in PubMed that year. They have been followed by a steadily increasing number of annual Cas9-related publications, as shown in the chart below. A large part of this growing interest is due to the proven utility of CRISPR-Cas9, and variants thereof, for gene editing, which I have previously blogged about.
Given the broad scientific, clinical, and commercial utility of CRISPR-Cas systems, it is not surprising that there has been considerable effort directed toward either engineering analogs of known Cas enzymes or discovering new homologs in unexplored organisms. With regard to the latter approach, Doudna, Banfield and coworkers noted in 2017 that the then-available CRISPR-Cas technologies were based solely on systems from isolated, cultured bacteria, leaving untapped the vast majority of enzymes from organisms that have not been cultured.
They added that metagenomics—sequencing DNA extracted directly from natural microbial communities—provides access to the genetic material of a huge array of uncultured organisms. For this reason, through use of metagenomics, the researchers were able to discover two previously unknown CRISPR-Cas systems. These new Cas proteins, named CasX and CasY to designate as yet unknown specifics, are said to be among the most compact systems yet discovered. In February 2019, as a follow-up to these discoveries, Doudna and collaborators published the mechanistic details for CRISPR-CasX in Nature magazine. This will be the focus of this blog, but before that story, here are some introductory comments about metagenomics, a transformative technology in its own right.
Putting Together All of the Pieces
In a review by Chen and Pachter, metagenomics is described as “the application of modern genomics techniques to the study of communities of microbial organisms directly in their natural environments, bypassing the need for isolation and lab cultivation of individual species.” They add that “metagenomics has revolutionized microbiology by shifting focus away from clonal isolates towards the estimated 99% of microbial species that cannot currently be cultivated.”
A typical metagenomics project begins with the construction of a DNA library derived from a minimally processed environmental sample that usually contains multiple different genomes with different copy numbers. The increasing capacity of factory-like sequencing centers has facilitated whole-genome shotgun sequencing and genome assembly of these complex mixtures. At the risk of oversimplification, this to me is conceptually akin to correctly putting together all of the pieces of multiple different jigsaw puzzles at the same time.
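To make the jigsaw analogy a little more concrete, here is a toy sketch of my own in Python. It is not the pipeline used by any sequencing center or by the researchers discussed here. It assumes error-free reads, two made-up "genomes," and a simple greedy rule (the function names `overlap` and `greedy_assemble` are hypothetical): repeatedly merge the two fragments that share the longest suffix-to-prefix overlap. Real assemblers must also cope with sequencing errors, repeats, and uneven coverage.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=4):
    """Greedily merge reads by their longest overlap until none remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap(a, b, min_len)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Two invented "genomes" whose shotgun reads are mixed in one sample.
genome_a = "ATGGCGTACGTTAGC"
genome_b = "TTTAAACCCGGGATA"
mixed_reads = ["CGTACGTTAGC", "TTTAAACCC", "ATGGCGTAC", "ACCCGGGATA"]

print(sorted(greedy_assemble(mixed_reads)))
# ['ATGGCGTACGTTAGC', 'TTTAAACCCGGGATA']
```

In this contrived case the four mixed reads sort themselves back into the two original sequences, which is the "multiple jigsaw puzzles at once" idea in miniature.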
There are many technical variations for these sequencing and bioinformatic procedures, but at a high level, these can be categorized as either using extracted DNA per se (metagenomics) or cDNA derived from reverse transcription of extracted RNA (metatranscriptomics). Both of these approaches were used in the aforementioned discovery of CasX and CasY, starting with quite unusual sample sources: (1) acid-mine drainage samples (from the Richmond Mine at Iron Mountain in California); (2) river water and sediment samples (from a site along the Colorado River in Colorado); and (3) cold, CO2-driven geyser water (from Crystal Geyser on the Colorado Plateau in Utah, pictured here). Presumably, these relatively unusual sample sources increased the probability of discovery, as scientists were able to examine the previously unknown organisms present in each sample.
Discovery of CasX and CasY
Using metagenomics, Doudna, Banfield and coworkers found a number of CRISPR-Cas systems, including what they believed to be the first Cas9 in the archaeal domain of life. Archaea constitute a domain of single-celled microorganisms. These microbes are prokaryotes, meaning they have no cell nucleus. Archaeal cells have unique properties that separate them from the other two domains of life, bacteria and eukarya, as depicted here. Archaea are further divided into multiple recognized phyla, but classification is difficult, as most have not been isolated in the laboratory.
This divergent Cas9 protein was found in little-studied nanoarchaea, as part of an active CRISPR-Cas system. Incidentally, nanoarchaea are “nano” indeed: only ~400 nm in diameter, or about 5% of the volume of your archetypical 1 μm³ prokaryote according to one estimate, and Nanoarchaeum equitans harbors a genome that is only 480 kb. Also discovered were two previously unknown Cas proteins unlike all the previous Cas proteins. These were named CasX and CasY, since it was not clear what they actually did. CasX and CasY are among the most compact systems yet discovered, according to these researchers, who concluded that “interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.”
Cryo-Electron Microscopy (cryo-EM) Characterization of CasX
In February 2019, a follow-up report in Nature by Doudna and collaborators focused on the mechanistic details for CRISPR-CasX. Although RNA-guided DNA binding and cutting proteins have proven to be transformative tools for genome editing across a wide range of cell types and organisms, only two kinds of CRISPR-Cas nucleases, Cas9 (depicted here) and Cas12a (aka Cpf1), provide the foundation for this revolutionary technology.
The only conserved part of CasX, the RuvC domain, shares less than 16% identity with RuvC domains in either Cas9 or Cas12a. This evolutionary ambiguity in CasX hinted that this enzyme may have a structure and molecular mechanism distinct from that of other CRISPR-Cas enzymes. These structural and mechanistic questions were investigated by use of cryo-EM, a specialized method recently catapulted into widespread view by the co-awarding of the 2017 Nobel Prize in Chemistry to its three pioneers.
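For readers unfamiliar with what a figure like "less than 16% identity" means, here is a minimal Python sketch of my own. It assumes the two sequences have already been aligned (gaps shown as `-`) and uses one common convention: identical residues divided by aligned columns. The function name `percent_identity` and the short sequences are invented for illustration and are not CasX, Cas9, or Cas12a data. I also do not know which alignment method or identity convention the authors used, so treat this only as an illustration of the concept.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences have the same residue.

    Assumes the inputs are two rows of an existing pairwise alignment,
    so they must have equal length. Columns where both rows are gaps
    are ignored; a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Invented toy alignment: 5 identical positions out of 8 columns.
print(percent_identity("MKT-LLVA", "MRTALL-A"))  # 62.5
```

By this kind of measure, a value below 16% means that fewer than about one in six aligned positions carry the same amino acid, which is why the relationship between CasX and the other two enzymes could not be read off from sequence alone.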
As discussed in an introductory YouTube video on cryo-EM, scientists traditionally used X-ray crystallography to obtain biomolecular structures, which requires growing suitable crystals that are oftentimes extremely difficult or not possible to obtain. However, as seen here, freezing a thin layer of a solution of the sample for cryo-EM enables the technique to handle structures for which crystallography is not a viable option. In addition, cryo-EM can visualize much larger structures than crystallography can—100-fold larger according to one cryo-EM expert. By way of example, a 1.8-Å-resolution structure of 334-kDa glutamate dehydrogenase, and 3.6-Å-resolution structure for 11,200-kDa Dengue virus have been reported.
Scientist preparing samples for cryo-EM under liquid nitrogen temperature
Doudna and collaborators took advantage of cryo-EM to obtain eight molecular structures of CasX in different states, which interested readers can view by consulting the 2019 Nature publication (unfortunately, copyright restrictions prevent reproduction here). The researchers’ verbal description of what was found highlights the following structural elements:
“An unanticipated quaternary structure in which the RNA scaffold dominates the architecture and organization of the enzyme. Phylogenetic, biochemical and structural data show that CasX contains domains distinct from—but analogous to—those found in Cas9 and Cas12a, as well as novel RNA and protein folds; thus establishing the CasX enzyme family as the third CRISPR-Cas platform that is effective for genetic manipulation. Finally, distinct conformational states observed for CasX suggest an ordered non-target- and target-strand cleavage mechanism that may explain how CRISPR–Cas enzymes with a single active site, such as Cas12a, achieve double-stranded DNA (dsDNA) cleavage. The small size of CasX (<1,000 amino acids), its DNA cleavage characteristics, and its derivation from non-pathogenic microorganisms offer important advantages over other CRISPR–Cas genome-editing enzymes.”
On the basis of their functional and structural data, Doudna and collaborators propose a model of CasX activation and DNA cleavage that includes the following steps: (1) guide RNA binding-induced CasX structural stabilization and DNA search; (2) non-target-strand binding-assisted DNA unwinding, R-loop formation and non-target-strand loading into the RuvC active site; (3) RNA-DNA hybrid duplex bending with the aid of the proposed target-strand loading (TSL) domain to position the target DNA strand for cleavage; and (4) product release after the cleavage of both DNA strands.
They added that two distinct target DNA-bound states indicate that CasX coordinates sequential dsDNA cleavage by its single RuvC nuclease, using the zinc-finger-containing TSL domain. Also, the TSL domain appears to confer a convergent mechanism of acute target-strand DNA bending that is central to all type V single-nuclease CRISPR-Cas enzymes.
Looking forward, they speculated that “[t]he compact size, dominant RNA content and minimal trans-cleavage activity of CasX differentiate this enzyme family from Cas9 and Cas12a, and provide opportunities for therapeutic delivery and safety that may offer important advantages relative to existing genome-editing technologies.”
In my opinion, it will likely take some time and considerable experimentation by the scientific community to assess whether any of these potential advantages offered by CasX will actually pan out and lead to widespread adoption. In the meantime, mRNA-encoding Cas9 has firmly established its utility and enjoys extensive adoption, as exemplified by many diverse applications that I found among the search results for “TriLink AND Cas9” in Google Scholar.
As usual, your comments are welcomed.