Hachimoji DNA and RNA: A Genetic System with Eight Building Blocks

  • Researchers Seek to Expand A, G, C, and T Genetic Coding to Additional Nucleobase “Letters”
  • Steven A. Benner Led Expansion from Four to Six Letter-Coding in 2015
  • In 2019, Coding Uses Eight (“Hachi”) Letters (“Moji”)

According to principles attributed to the early writings of Francis Crick in 1958, the Central Dogma of Molecular Biology states that H-bonding between A/T and C/G base pairs underlies the storage of genetic information. This information is stored in “Watson-Crick” DNA for replication, transcribed into RNA, and finally decoded into protein. Exploring the expansion of such H-bonding (shown here) to include synthetic analogs of these four natural nucleobases has been of interest for theoretical and evolutionary reasons, and could have utility for many hybridization-based applications, as well as for storage of information.

Readers interested in an overview of these subjects can consult a 2017 review in Acc. Chem. Res. by Richards and Georgiadis, titled Toward an Expanded Genome: Structural and Computational Characterization of an Artificially Expanded Genetic Information System (AEGIS). The pioneering work of Steven A. Benner on expanding the genetic code from four to six building-block letters is reviewed therein. This blog will highlight Benner’s 2019 Science publication (Hoshika et al.) on AEGIS, which reports further expansion to eight building-block letters. This recently expanded system is appropriately named “hachimoji” DNA and RNA: Hoshika et al. coined this term by combining the Japanese words for eight (“hachi”) and letters (“moji”).

Hachimoji DNA

At the outset, I should point out that there is a YouTube video lecture by Benner that is well worth watching to fully appreciate the rationale behind investigating AEGIS, and the experimental approaches explored.

Benner’s previously published work (Zhang et al.) on the evolution of a functional six-nucleotide genetic code included the two new nucleotides, Z and P, which are shown below for DNA; dR is replaced with R in RNA. The H-bonding in these structures is between oppositely positioned donor (red) and acceptor (blue) atoms. These Z and P nucleotides were shown to undergo enzymatic copying, PCR amplification, and successive transcriptions, first into six-letter RNA, and then back into six-letter DNA.

Taken from Hoshika et al. Science 363, 884-887. Copyright © 2019, American Association for the Advancement of Science, with permission.

Expansion from six letters to eight letters was investigated using two additional nucleotides, S and B, shown here. The nucleotides Z, P, S and B along with A, G, C and T were each incorporated into 94 different 8-mer sequences of hachimoji DNA oligonucleotides by use of otherwise conventional phosphoramidite chemistry for solid-phase chain-assembly. Duplexes of these GACTZPSB-containing hachimoji 8-mers were then used to measure melting temperature (Tm) values under a set of standard conditions. These experimental Tm values were then compared to predicted melting temperatures derived from state-of-the-art thermodynamic parameterization of nearest-neighbor base-pair dimers, as described by Hoshika et al.

Plots of experimental versus predicted free-energy change (ΔG°37) (A) and experimental versus predicted melting temperature (Tm) (B) shown here indicate that, on average, Tm is predicted to within 2.1°C for the 94 GACTZPSB hachimoji duplexes, and ΔG°37 is predicted to within 0.39 kcal/mol. These errors were said to be similar to those observed with nearest-neighbor parameters for standard DNA:DNA duplexes, which was interpreted as meaning that “GACTZPSB hachimoji DNA reproduces, in expanded form, the molecular recognition behavior of standard 4-letter DNA. It is an informational system.”

Taken from Hoshika et al. Science 363, 884-887. Copyright © 2019, American Association for the Advancement of Science, with permission.
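
The nearest-neighbor model discussed above assigns a thermodynamic increment to each pair of adjacent base pairs (a "dimer" step) in a duplex. To get a feel for how much larger the parameter set becomes with eight letters, the short Python sketch below counts the distinct dimer steps, treating a dimer and its reverse complement as the same step because they describe the same two stacked base pairs. The pairing table (A:T, G:C, Z:P, S:B) follows the pairs described in this post; the function name is my own, and this is only a counting exercise, not the parameter set derived by Hoshika et al.

```python
from itertools import product

# Pairing rules as described in this post: A:T, G:C, Z:P, S:B.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G",
         "Z": "P", "P": "Z", "S": "B", "B": "S"}


def unique_dimer_steps(alphabet):
    """Return the distinct nearest-neighbor dimer steps for an alphabet.

    A dimer 5'-XY-3' and its reverse complement describe the same pair of
    stacked base pairs, so only one representative of each is kept.
    """
    seen = set()
    for x, y in product(alphabet, repeat=2):
        dimer = x + y
        reverse_complement = PAIRS[y] + PAIRS[x]
        seen.add(min(dimer, reverse_complement))
    return sorted(seen)


standard = unique_dimer_steps("ACGT")
hachimoji = unique_dimer_steps("ACGTZPSB")
print(len(standard), len(hachimoji))  # prints: 10 36
```

The count of 10 for four letters matches the familiar number of nearest-neighbor dimers for standard DNA, and the same bookkeeping gives 36 for eight letters.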

High-resolution crystal structures were determined for three different hachimoji duplexes assembled from three self-complementary 16-mer sequences: 5’-CTTATPBTASZATAAG, 5’-CTTAPCBTASGZTAAG, and 5’-CTTATPPSBZZATAAG. These duplexes were crystallized with Moloney murine leukemia virus reverse transcriptase to give a “host-guest” complex with two protein molecules (host) bound to each end of a 16-mer duplex (guest). With interactions between the host and guest limited to the ends, the intervening 10 base pairs were free to adopt a sequence-dependent structure.
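
As a quick sanity check on what "self-complementary" means for an eight-letter alphabet, the following minimal Python sketch tests whether each 16-mer equals its own reverse complement under the pairing rules A:T, G:C, Z:P, and S:B described in this post. The helper names are hypothetical and chosen only for illustration.

```python
# Pairing rules as described in this post: A:T, G:C, Z:P, S:B.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G",
         "Z": "P", "P": "Z", "S": "B", "B": "S"}


def reverse_complement(seq):
    """Return the reverse complement of a hachimoji DNA sequence (5'->3')."""
    return "".join(PAIRS[base] for base in reversed(seq))


def is_self_complementary(seq):
    """True if two copies of the strand can pair with each other in a duplex."""
    return reverse_complement(seq) == seq


CRYSTALLIZED_16MERS = [
    "CTTATPBTASZATAAG",
    "CTTAPCBTASGZTAAG",
    "CTTATPPSBZZATAAG",
]

for seq in CRYSTALLIZED_16MERS:
    print(seq, is_self_complementary(seq))  # each line ends with True
```

Because each sequence is its own reverse complement, a single synthesized strand is enough to form the duplex.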

The hachimoji DNA in all three structures adopted a B-form with 10.2 to 10.4 base pairs per turn, similar to natural B-DNA shown here. The major and minor groove widths for hachimoji DNA were similar to one another and to the DNA duplex 5’-CTTATGGGCCCATAAG, but not to the DNA duplex 5’-CTTATAAATTTATAAG.

Despite these and other differences in structure (i.e. propeller and buckle angles), the structural parameters for the individual pairs and the dinucleotide steps of the hachimoji DNA were said to fall well within the ranges observed for natural 4-letter DNA, consistent with hachimoji DNA being a “mutable information storage system” like natural DNA, according to Hoshika et al. I should interject and state that these researchers use the term “mutable” with reference to Schrödinger, who theorized in 1943 that regularity in size was necessary for nucleobase pairs to fit into what he called an “aperiodic crystal,” which he proposed as necessary for reliable molecular information storage and faithful information transfer.

Hachimoji RNA

T7 RNA polymerase bound to DNA and RNA.

With the information storage and mutability properties shown for hachimoji DNA, Hoshika et al. then asked whether hachimoji DNA information could also be transmitted to give hachimoji RNA. To investigate whether native T7 RNA polymerase (pictured here) is capable of transcribing hachimoji DNA, they started with four model sequences that each contained a single nonstandard hachimoji component, B, P, S, or Z, each followed by a single cytidine. To analyze hachimoji RNA products, they labeled transcripts with [α-32P]cytidine 5′-triphosphate; digestion with ribonuclease T2 then generated the corresponding hachimoji 3′-phosphates. These were resolved in thin-layer chromatography (TLC) systems and compared with synthetic authentic nonstandard 3′-phosphates.

These experiments showed that native T7 RNA polymerase incorporates riboZTP opposite template dP, riboPTP opposite template dZ, and riboBTP opposite template dS. However, incorporation of riboSTP opposite template dB was not seen with native RNA polymerase. This observation was attributed to an absence of electron density in the minor groove from the aminopyridone heterocycle on riboSTP. Polymerases are believed to recognize such density, as it is presented by all other triphosphate substrates.
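
The incorporation pattern just described can be summarized as a toy lookup in Python. This is only a sketch under loud assumptions: the template is written 3′ to 5′ so the transcript can be read off position by position, the "native" and "FAL" labels and the function name are hypothetical, and treating the missing riboS incorporation as a hard failure is my simplification of "was not seen" rather than a model of polymerase kinetics.

```python
# Ribonucleotide placed opposite each template deoxynucleotide.
# Standard pairs plus the hachimoji pairs described in this post.
RNA_OPPOSITE = {"A": "U", "T": "A", "G": "C", "C": "G",
                "P": "Z", "Z": "P", "S": "B", "B": "S"}


def transcribe(template_3to5, polymerase="native"):
    """Toy transcription of a hachimoji DNA template written 3'->5'.

    Assumption: native T7 RNA polymerase is modeled as unable to place
    riboS opposite template dB; any other polymerase label (for example
    "FAL") is modeled as accepting all eight template letters.
    """
    transcript = []
    for base in template_3to5:
        if base == "B" and polymerase == "native":
            raise ValueError("riboS not incorporated opposite template dB")
        transcript.append(RNA_OPPOSITE[base])
    return "".join(transcript)  # RNA transcript, read 5'->3'


print(transcribe("TAPG"))          # prints: AUZC
print(transcribe("TABG", "FAL"))   # prints: AUSC
```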

A, G, C, and U 2′-O-Methyl-Nucleotides.

Hoshika et al. therefore searched for T7 RNA polymerase variants able to transcribe a complete set of hachimoji nucleotides. One variant (Y639F H784A P266L, “FAL”) was especially effective at incorporating riboSTP opposite template dB. Interestingly, FAL was originally developed as a thermostable polymerase to accept 2′-O-methyl triphosphates, pictured right.

High-performance liquid chromatography (HPLC) analysis of its transcripts showed that 1.2 ± 0.4 riboSTP nucleotides were incorporated opposite a single template dB. FAL also incorporated the other nonstandard components of the hachimoji system into transcripts.


The findings reported by Hoshika et al. have been lauded by experts, and they represent a significant advance in synthetic biology with the availability of a further expanded, mutable genetic system built from eight different building blocks: four natural (stars) and four synthetic (circles). With continued investigation, additional synthetic building blocks may lead to further expansion of genetic coding. Intrigued by this possibility, I found that the Japanese word for ten is “juu,” so “juumoji” DNA and RNA might be next.

In any event, with increased information density over natural DNA and predictable duplex stability across all 8^n sequences of length n, Hoshika et al. concluded that hachimoji DNA has potential applications in sequence-based bar-coding and combinatorial tagging, retrievable information storage, and self-assembling nanostructures. I have covered DNA-based information storage and self-assembling DNA nanostructures (aka origami) in some of my previous blogs.
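
To put the information-density point in numbers, here is a small Python sketch comparing a four-letter and an eight-letter alphabet. The function names are my own, and the "bits per position" figure is the idealized upper bound that assumes every letter is equally usable at every position.

```python
import math


def sequence_count(alphabet_size, length):
    """Number of distinct sequences of a given length."""
    return alphabet_size ** length


def bits_per_position(alphabet_size):
    """Idealized information content of one position, in bits."""
    return math.log2(alphabet_size)


for name, size in [("standard DNA", 4), ("hachimoji DNA", 8)]:
    print(name, sequence_count(size, 8), bits_per_position(size))
# standard DNA 65536 2.0
# hachimoji DNA 16777216 3.0
```

For the 8-mers used in the melting studies, this means 8^8 = 16,777,216 possible hachimoji sequences versus 4^8 = 65,536 for standard DNA, or 3 bits per position instead of 2.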

Hoshika et al. also concluded that structural differences among three different hachimoji duplexes are not larger than the differences between various standard DNA duplexes, making this system potentially able to support molecular evolution. Furthermore, the ability to have structural regularity independent of sequence shows the importance of inter-base H-bonding in such mutable informational systems. Thus, in addition to its technical applications, this work expands the scope of the structures that might be encountered in the search for life in the cosmos, which Benner has written about here.

As usual, your comments are welcomed.