Tuesday, April 14, 2015

Are there sequence preferences near the 3' end of the #CRISPR protospacer? Paper from the C. elegans field explores this.

When the first word in a paper title is "Dramatic", I certainly wonder if I will agree after reading it... It's worth a blog post at any rate. This paper by Farboud, B. and Meyer, B.J. is titled "Dramatic Enhancement of Genome Editing by CRISPR/Cas9 Through Improved Guide RNA Design" (Genetics, Vol. 199, 959–971, April 2015).

As is true for other model organisms, CRISPR is very useful in nematodes for performing mutagenesis. In this paper the authors were inspired by the previous observation that the Cas9 protein physically associates with PAM sequences (NGG) sort of promiscuously across DNA. This had also previously led to the discovery that - in vitro - a region that is generally rich in GG dinucleotides enables higher rates of cleavage at a unique CRISPR target within that region than a region with reduced GG content does. In other words, general GG density probably "attracts" Cas9 and keeps more of it around, which in turn may enable faster recognition of the actual target.

Using this idea, the authors tested whether simply keeping an extra GG motif near the actual PAM (NGG) would enhance CRISPR mutagenesis in worms. Turns out that if the PAM is followed (3') by another NGG, it doesn't help. However, if the 3 bases immediately 5' of the PAM are NGG - that is, the last 2 bases of the protospacer are GG - they saw a consistent, and yes, dramatic improvement in recovery rates of CRISPR mutants. This was a pretty striking finding and was validated across about 8 to 10 sites. As a control they tested "shifted" targets: the protospacer was moved 5' by 3 bases, so that the last 3 bases of the original protospacer (the NGG) served as the PAM. These were almost uniformly poor in absolute mutagenesis rates, usually zero - meaning that with their particular system of worm injections the baseline rate is pretty low. The targets that had the GG at the protospacer 3' end usually had mutation rates in the double-digit percentages.
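If you want to check your own target region for this kind of site, here is a minimal sketch in Python. It is my own illustration, not anything from the paper: the function name `find_targets`, the 20-nt protospacer length, and the forward-strand-only scan are all assumptions made for simplicity.

```python
import re

def find_targets(seq, spacer_len=20):
    """Scan the forward strand for protospacers followed by an NGG PAM.

    Returns a list of (start, protospacer, pam, ends_in_gg) tuples, where
    ends_in_gg is True when the last 2 bases of the protospacer are GG.
    Hypothetical helper: the reverse strand is not scanned.
    """
    seq = seq.upper()
    targets = []
    # A lookahead is used so that overlapping NGG motifs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence 5' of this PAM for a full protospacer
        protospacer = seq[pam_start - spacer_len:pam_start]
        targets.append((pam_start - spacer_len, protospacer,
                        match.group(1), protospacer.endswith("GG")))
    return targets

if __name__ == "__main__":
    for target in find_targets("A" * 18 + "GG" + "TGG"):
        print(target)
```

Note that a GG-ending protospacer always has its "shifted" counterpart 3 bases upstream, since its last 3 bases are themselves an NGG; the scan reports that one too whenever there is enough sequence 5' of it.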

So is this observed in other animals/cells/systems? Well, based on my own work and from what I see in the literature, in mice we definitely don't need to have a GG at the protospacer 3' end to get high efficiency mutagenesis. There are still no consistent rules here, but there are trends for sure. Cas9 does seem to prefer purines in the last two bases of the protospacer, as this seems to enhance gRNA loading (Wang et al 2014). GC richness across the protospacer is definitely good (Gagnon et al 2014), which must correlate with G's in the protospacer 3' end. Interestingly, Doench et al 2014 observed a preference for purine in the last base of the protospacer, but not much preference for the penultimate base (see their Fig 3a).
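For anyone wanting to tabulate these trends across their own guides, a rough sketch of the features mentioned above might look like the following. The function name and the dictionary keys are mine, and this only describes a protospacer; it is not a scoring model from any of the cited papers.

```python
def protospacer_features(protospacer):
    """Summarize the 3'-end and GC features discussed above.

    Hypothetical helper for illustration; expects a DNA string of
    at least 2 bases.
    """
    p = protospacer.upper()
    purines = set("AG")
    return {
        "gc_fraction": sum(base in "GC" for base in p) / len(p),
        "last_base_purine": p[-1] in purines,
        "penultimate_base_purine": p[-2] in purines,
        "ends_in_gg": p.endswith("GG"),
    }

if __name__ == "__main__":
    print(protospacer_features("ACGTACGTACGTACGTACGG"))
```

Collecting these per guide alongside measured mutation rates would make it easier to see whether any of the trends hold in a given system.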

On the other hand, I can't yet identify many published examples of mammalian CRISPR targets that had a GG in the last two bases of the protospacer, reported hard mutagenesis rates, and came with enough other targets in the same paper for a good comparison.


  1. Hey Doug,

    Across the 1000 plus haploid KOs we've generated, in the most active guides (>75% clones screened contain disruptions) we don't observe a GC bias in the seed region.

    Like you say - there don't seem to be consistent rules yet. I wonder if at this stage in the technology's evolution the benefits of base positioning are simply being masked by variability caused by other factors?

  2. That's a great data set. Very interesting that you don't observe a GC bias. Is there a chromatin state bias though? Perhaps that might be related to GC content but only in certain contexts/species.