The University of California, Berkeley has offered a glimpse into the way bacteria use CRISPR, the microbial immune system that inspired the invention of the method for genetic modification also known as CRISPR. The paper published in Science by Jennifer Doudna’s team is a fascinating piece of basic research, and scientists are hopeful they will be able to turn the discovery into a new biotech tool.
The battle between bacteria and viruses is an evolutionary arms race that has been going on since the dawn of life, mutation by mutation. As viruses refine their offensive tactics, bacteria upgrade their defensive weapons, learning how to better identify foreign DNA and destroy it. Warrior proteins (such as Cas9, on which the CRISPR technology platform is based) get their information through meticulous intelligence work. This is done by a complex called Cas1-Cas2, which keeps note of the “most wanted” viruses and inserts their genetic identikits into the bacterial genomic archive in a process called acquisition. We asked Addison Wright, first author of the Science paper, to explain how it works.
Cas1 and Cas2 are the most conserved elements of the CRISPR system. Does this mean they play a crucial role?
“Cas1 appears to have evolved from a transposase. The domestication of this protein, changing its activity to only integrate short pieces of DNA at a specific locus rather than moving a whole transposable element to essentially random sites in the genome, seems to have been the key step in the development of CRISPR immunity. This is what allows the systems to adapt to new threats by recording viral sequences. The co-opting of this transposase, with the corresponding change in activity, seems to have been so unlikely that it only occurred once, with everything else splitting off from this original event. Other elements involved in the interference stage (Cas9, Cascade, Cpf1/Cas12a, etc.) are evidently more easily acquired and adapted to usage in CRISPR systems. Some of this is also probably due to the evolutionary arms race with viruses – we’ve discovered many viral strategies to avoid CRISPR systems that block the interference stage, most notably anti-CRISPRs, but so far nothing that clearly blocks acquisition. The need to circumvent anti-CRISPRs and other strategies presumably drives the diversification of interference pathways, while acquisition is under less pressure”.
What about the target recognition mechanism of Cas1-Cas2? How does the complex bind to the DNA repeats in the bacterial genome to integrate the viral sequences?
“When we solved the structure of Cas1-Cas2 bound to the CRISPR repeat, we were surprised that there didn’t seem to be very much direct readout of the repeat sequence, where amino acids from the protein directly recognize nucleotides in the DNA. Instead, Cas1-Cas2 can only bind and integrate into DNA that can bend and twist in a certain way. The flexibility and helical pitch of DNA are dependent on the sequence, and the CRISPR repeat has a sequence that allows it to bend to fit the structure of Cas1-Cas2, whereas another sequence might be too rigid or have a hinge point in the wrong spot. Basically, the repeat has a hinge point in the middle that allows it to bend over Cas2, which forms a wedge of sorts, while other parts of the sequence allow the repeat to partially unwind. If the DNA maintains a rigid B-form helical pitch, the DNA backbone can’t be bound by both Cas1 active sites and the reaction stalls out half-way through and reverses”.
The Harvard group led by George Church recently used Cas1-Cas2 to integrate a short movie into a bacterial population. Is it going to become a versatile biotech tool?
“Cas1-Cas2 have a potential application as an information storage mechanism, doing things like adding barcode sequences to individual cells to allow us to track them. However, if we want to use the proteins outside their native context, we might want to target them to something other than the CRISPR locus (since human cells, for example, don’t have a CRISPR locus). Discovering how Cas1-Cas2 recognize their target changes how we would go about predicting potential recognition sites as well as engineering the proteins to recognize a new site. Instead of looking for individual nucleotides that are recognized, we want to consider the physical properties of a sequence, and instead of mutating amino acids that read out bases, a better strategy would be to subtly alter the spacing and orientation of the Cas1 active sites through a broader mutagenesis and selection strategy”.
CRISPR systems are extensively studied but still hide surprises for scientists. Borrowing from the title of John Maddox’s famous book, what remains to be discovered?
“There is so much still to be discovered! Just focusing on the acquisition side of things, we are only just beginning to understand how protospacers are generated from viral DNA. It seems like this step is almost as variable as the interference step, and relies on different combinations of Cas proteins and unrelated host proteins in different bacteria and archaea. How CRISPR systems selectively acquire foreign DNA, rather than their own DNA (which leads to autoimmunity and cell suicide), is only vaguely understood for some systems and a total mystery for others. There is also the problem that some CRISPR systems target RNA, rather than DNA, but Cas1-Cas2 can only integrate DNA. Some instances of Cas1-reverse transcriptase fusions have been discovered, but we’re still not totally sure how they work, and most of these RNA systems don’t have them, so they must generate DNA protospacers by some other means. To expand beyond acquisition, the field is currently discovering new CRISPR types with new interference proteins faster than we can figure out how they work. C2c2, or Cas13, is a particularly interesting area of study right now. It targets RNA, and once it’s activated by its target sequence it seems to degrade any RNA it sees, whether it’s from the virus or the host. But different groups have seen somewhat contradictory things in vivo versus in vitro, and no one really knows how it functions in the native context. Type IV systems are a total mystery, since they seem to lack a lot of the elements that we think are fundamental to CRISPR immunity (such as the CRISPR array itself). Two groups just discovered that Type III systems are capable of initiating a secondary messenger signaling cascade that leads to massive RNA degradation, which is something no one could have predicted a year ago. We have a lot of tricky problems left to deal with in the systems we already know a lot about, and we have no idea how much is left out there for us to discover.
I think the CRISPR field is going to be making surprising discoveries for a long time to come”.