Wednesday, January 21, 2015

TIDE: an online tool for evaluating #CRISPR gene editing in sequence trace files.

TIDE is a neat new web tool designed for a specific problem that I've definitely been dealing with. Following a CRISPR experiment, whether in cell lines or animals, it's not trivial to quantify how well editing/mutagenesis worked and what sorts of mutations were generated. This is well summarized in the introduction of the paper so I won't repeat it, but I have certainly been in both situations: first, staring at ABI chromatograms after sequencing PCR products from founder mice, and second, trying to quantify cleavage in pools of transfected cells. Of course, the target-site PCR products will usually contain mixtures of molecules with different mutations, and likely some amount of wild-type allele (for sure in pooled cells, often in founder animals). So direct sequencing is hard to interpret, because the chromatogram data past the cleavage site is usually a jumble of overlapping, staggered sequences.

What TIDE does is quantify the non-wild-type sequence signal in the chromatogram data 3' to the expected cleavage site, then estimate the apparent contribution of specific underlying mutant alleles, based on the pretty good assumption that most of the mutations generated by CRISPR will be short indels. This seems to be an extension of PolyPeakParser, which I blogged about previously, but it's able to deal with multiple mutant alleles.
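
To make the decomposition idea concrete, here is a toy sketch in Python. It is not TIDE's code. It assumes idealized traces (a clean one-hot peak per base), a made-up reference sequence, made-up allele fractions, and a hypothetical cut site, and it models each mutant allele as the wild-type signal shifted by the indel size. The mixed trace is then fitted as a non-negative combination of those shifted signals, using a small hand-rolled coordinate-descent solver. All function and variable names are illustrative.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Idealized trace: 1.0 in the channel of the called base, 0.0 elsewhere."""
    return [1.0 if base == b else 0.0 for b in seq for base in BASES]

def allele_window(ref, start, length, indel):
    """Signal at read positions [start, start + length) for an allele carrying
    a single indel of size `indel` (negative = deletion) upstream of `start`."""
    return one_hot(ref[start - indel : start - indel + length])

def nnls(columns, target, sweeps=300):
    """Non-negative least squares by cyclic coordinate descent."""
    weights = [0.0] * len(columns)
    residual = list(target)
    norms = [sum(v * v for v in col) for col in columns]
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            step = sum(c * r for c, r in zip(col, residual)) / norms[j]
            new_w = max(0.0, weights[j] + step)  # clamp to keep weights >= 0
            delta = new_w - weights[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_w
    return weights

def decompose(ref, mixed, start, length, max_indel):
    """Estimate the fraction of each indel size in a mixed trace."""
    sizes = list(range(-max_indel, max_indel + 1))
    columns = [allele_window(ref, start, length, s) for s in sizes]
    return dict(zip(sizes, nnls(columns, mixed)))

# Synthetic example: random reference, hypothetical cut near position 50,
# analysis window starting far enough downstream to cover +/- 5 bp indels.
rng = random.Random(0)
ref = "".join(rng.choice(BASES) for _ in range(200))
start, length, max_indel = 60, 100, 5

# Made-up mixture: wild type, a 1-bp deletion and a 5-bp insertion.
truth = {0: 0.25, -1: 0.60, 5: 0.15}
mixed = [0.0] * (4 * length)
for indel, frac in truth.items():
    for i, v in enumerate(allele_window(ref, start, length, indel)):
        mixed[i] += frac * v

estimate = decompose(ref, mixed, start, length, max_indel)
for size, w in estimate.items():
    if w > 0.01:
        print(f"{size:+d} bp: {100 * w:.1f}%")
```

Real chromatograms are much noisier than this (uneven peak heights, background signal, declining quality along the read), which is one reason to treat fitted fractions as estimates.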

I had a recent data set of sequence files from a mouse CRISPR experiment, so I thought I'd compare our independent analysis of the founder mice to TIDE's interpretation. The gene is anonymized, but I can state that it was a straightforward attempt to create indel mutations in a gene of interest. Here's what we did: about 25% of pups were positive for new mutations as revealed by Surveyor assays. We then sequenced PCR products from 8 founder littermates, of which 6 were known Surveyor-positive and 2 were of unknown status.

The last 2 mice (#19, #20) had normal, wild-type sequencing data. The other 6 mice had very jumbled sequences past the cleavage site. After some serious staring at the chromatograms, which took a while, I made some guesses that some of them had specific indel mutations. However, some of them were just too complex for me to figure out.

Then I analyzed all of them with TIDE, using the sequence file from wild-type mouse #20 as the control file (which TIDE requires). Here are the results:

| Pup # | Pre-TIDE manual interpretation | TIDE result |
|---|---|---|
| | WT allele and at least 2 different mutant alleles present. Could not interpret mutations at all. | No significant results, but the sequence quality was rather poor to begin with. |
| | WT allele and a 1-bp deletion allele. Germline transmission confirmed. | 66.5% WT, 24.8% 1-bp deletion. |
| | No WT allele; one 3-bp deletion; plus a complex (discontiguous) 4-bp deletion. Germline transmission confirmed of both alleles at essentially mendelian rates. | 10.9% WT, 44.9% 3-bp deletion, 33.7% 4-bp deletion. |
| | WT allele and 2 different mutant alleles present. Could not interpret mutations. | 22.5% WT, 58.5% 1-bp deletion, 9.7% 5-bp insertion. |
| | WT allele and a 1-bp deletion allele. Germline transmission confirmed. | 60% WT, 30.2% 1-bp deletion. |
| | No WT allele, but multiple (>3) mutant alleles. | At least 4 different deletions of -2, -12, -28, -29 bp, each at low levels. |
| | WT allele predominates. | 75% WT; 7.4% 2-bp insertion; 8% 8-bp deletion. |
| | WT allele predominates. | (Used #19 as control) 82.8% WT, 10.7% 4-bp deletion. |

I was fairly impressed by the TIDE results. First, it agreed with my specific interpretations for #6, #10 and #22, which were actually confirmed by germline transmission. Second, it was able to correctly call 2 mutations at the same time in mouse #10. Third, it made interpretations that made sense for founders #16 and #24, which I had given up on.

Finally, I didn't really give the algorithm the optimal control sequence. Instead I used the file for an apparently wild-type founder animal (#20). However, when the files from #19 and #20 were used as controls to analyze each other, low levels of mutant alleles were detected. And yes, if you go back to the chromatograms you can see a little underlying signal past the cleavage site that may be a bit more than "usual", but it's very easy to miss. This result is actually consistent with the experiment, since the embryos were injected with a PX330 plasmid, which may persist past the 1-cell stage and thus lead to low levels of mosaicism.

Based on the imperfect controls I used, I would not take the TIDE quantitation of allele fractions literally. However, the qualitative results were pretty good and I wasn't able to find anything manually that TIDE didn't. Also, this is a very fast analysis if you are performing sequencing on the PCRs anyway. Moreover, it's easy to apply this analysis to PCRs on transfected pools of cells to measure CRISPR mutagenesis. I'm looking forward to trying TIDE in this context as well.
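
As a quick arithmetic check on the quantitation, the short Python sketch below adds up the TIDE percentages from the results above. The numbers are copied from the table in table order; rows without numeric calls are left out. The "unassigned" remainder is simply 100% minus the wild-type and listed indel percentages. That label is an assumption made here for illustration, not something TIDE reports under that name.

```python
# TIDE percentages copied from the results table, in table order.
# Keys of "indels" are indel sizes in bp (negative = deletion).
tide_calls = [
    {"wt": 66.5, "indels": {-1: 24.8}},
    {"wt": 10.9, "indels": {-3: 44.9, -4: 33.7}},
    {"wt": 22.5, "indels": {-1: 58.5, +5: 9.7}},
    {"wt": 60.0, "indels": {-1: 30.2}},
    {"wt": 75.0, "indels": {+2: 7.4, -8: 8.0}},
    {"wt": 82.8, "indels": {-4: 10.7}},
]

def summarize(call):
    """Return (total % in listed indels, % not assigned to WT or a listed indel)."""
    edited = sum(call["indels"].values())
    unassigned = 100.0 - call["wt"] - edited
    return round(edited, 1), round(unassigned, 1)

for row, call in enumerate(tide_calls, start=1):
    edited, unassigned = summarize(call)
    print(f"row {row}: WT {call['wt']}%, indels {edited}%, unassigned {unassigned}%")
```

In every row the listed percentages fall short of 100%, which fits with reading the allele fractions as approximate rather than literal.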

Easy quantitative assessment of genome editing by sequence trace decomposition. Eva K. Brinkman, Tao Chen, Mario Amendola and Bas van Steensel. Nucleic Acids Research, 2014, Vol. 42, No. 22, e168.


  1. I am working with a nickase approach using two different pairs of sgRNAs to generate a deletion via homologous recombination. I am not getting good results, so I will evaluate the efficiency of my sgRNAs with TIDE, but I am a little confused about the optimal approach for TIDE analysis.

    First, since my final approach is a homologous recombination edit (which per se has a low efficiency), I am working with single cells. Is it possible to evaluate the efficiency of my sgRNAs in a bulk population?

    If yes, for my Sanger sequencing analysis, do I need to transfect both pairs of sgRNAs in the same reaction, or only one sgRNA per reaction? I am asking because the web tool asks for the guide sequence, but using two sgRNAs results in two guide sequences.

    Thanks for your useful comments

  2. TIDE may work for batch analysis, but if your mutation efficiencies are low, like less than a few percent overall or for your specific edit of interest, it may not accurately call your mutation rate. You might be better off trying a T7 mismatch endonuclease assay, because this may be better suited to summarize the total mutation rate even if it is in the few percent range. Of course, the nickase approach is intended to limit unwanted mutations in the first place, so I understand the challenge you face. If you are still trying to get a handle on whether your sgRNAs work at all, I would test them in your cells using a standard non-nickase Cas9, which should readily enable NHEJ mutations with either sgRNA. These should be created at a higher rate than homologous recombination, and should be readily detectable by mismatch cleavage assays. If you can't detect mutations easily with this approach, your sgRNAs may not work efficiently in the first place and I would expect difficulties in detecting mutations when you move to the nickase approaches.