Wednesday, January 21, 2015

TIDE: an online tool for evaluating #CRISPR gene editing in sequence trace files.

TIDE is a neat new web tool designed for a specific problem that I've definitely been dealing with. Following a CRISPR experiment, in either cell lines or animals, it's not trivial to quantify how well editing/mutagenesis worked and what sort of mutations were generated. This is well summarized in the introduction of the paper (cited at the end of this post), so I won't repeat it, but I have certainly been in both of these situations: first, staring at ABI chromatograms after sequencing PCR products from founder mice, and second, trying to quantify cleavage in pools of transfected cells. Of course, the target-site PCR products will usually contain mixtures of molecules with different mutations, and likely some amount of wild-type allele (for sure in pooled cells, often in founder animals). So direct sequencing is hard to interpret, because the chromatogram data past the cleavage site is usually a jumble of overlapping, staggered sequences.

What TIDE does is quantify the non-wild-type sequence signal in the chromatogram data 3' of the expected cleavage site, and then estimate the apparent contribution of specific underlying mutant alleles, based on the pretty good assumption that most of the mutations generated by CRISPR will be short indels. This seems to be an extension of PolyPeakParser, which I blogged about previously, but it's able to deal with multiple mutant alleles.
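
To make the decomposition idea concrete, here is a toy Python sketch. It is my own illustration, not TIDE's actual code. It assumes idealized traces in which every position is a clean one-hot base call, only deletions at the cut site, and a made-up random wild-type sequence. It then recovers the allele fractions with a small non-negative least-squares fit, done here by projected gradient descent. All names (`ideal_trace`, `allele_window`, `decompose`, `simulate_and_decompose`) are hypothetical.

```python
import random

BASES = "ACGT"

def ideal_trace(seq):
    """Flatten a sequence into an idealized 4-channel trace (1.0 at the called base)."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]

def allele_window(wild_type, cut, deletion, width):
    """Bases read downstream of the cut in an allele missing `deletion` bp at the cut."""
    start = cut + deletion
    return wild_type[start:start + width]

def decompose(columns, mixed, iterations=500):
    """Non-negative least squares by projected gradient descent (toy solver)."""
    # Step size 1/trace(A^T A) is a safe bound on 1/(largest eigenvalue).
    step = 1.0 / sum(v * v for col in columns for v in col)
    weights = [0.0] * len(columns)
    for _ in range(iterations):
        fitted = [sum(w * col[i] for w, col in zip(weights, columns))
                  for i in range(len(mixed))]
        residual = [f - m for f, m in zip(fitted, mixed)]
        gradient = [sum(r * c for r, c in zip(residual, col)) for col in columns]
        # Gradient step, then clip at zero: a fraction cannot be negative.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]
    return weights

def simulate_and_decompose():
    rng = random.Random(0)
    wild_type = "".join(rng.choice(BASES) for _ in range(80))  # made-up sequence
    cut, width = 20, 40
    deletions = [0, 1, 2, 3, 4, 5]  # 0 = wild type
    columns = [ideal_trace(allele_window(wild_type, cut, d, width))
               for d in deletions]
    # Illustrative mixture: 50% WT, 30% 1-bp deletion, 20% 3-bp deletion.
    true_fractions = {0: 0.5, 1: 0.3, 3: 0.2}
    mixed = [sum(true_fractions.get(d, 0.0) * col[i]
                 for d, col in zip(deletions, columns))
             for i in range(len(columns[0]))]
    return dict(zip(deletions, decompose(columns, mixed)))

estimated = simulate_and_decompose()
for deletion, fraction in estimated.items():
    print(f"{deletion}-bp deletion: {fraction:.3f}")
```

Real chromatograms are much noisier than this, with uneven peak heights and insertions whose inserted bases are unknown, so treat this only as a picture of why a jumbled trace can still be pulled apart into a few shifted copies of the wild-type signal.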

I had a recent data set of sequence files from a mouse CRISPR experiment, so I thought I'd compare our independent analysis of the founder mice to TIDE's interpretation. The gene is anonymized, but I can state that it was a straightforward attempt to create indel mutations in a gene of interest. Here's what we did: about 25% of pups were positive for new mutations as revealed by Surveyor assays. We then sequenced PCR products from 8 founder littermates, of which 6 were known Surveyor-positive and 2 were of unknown status.

The last 2 mice (#19, #20) had normal, wild-type sequencing data. The other 6 mice had very jumbled sequences past the cleavage site. After some serious staring at the chromatograms (which took a while), I made some guesses that some of them had specific indel mutations. However, some of them were just too complex for me to figure out.

Then I analyzed all of them with TIDE, using the sequence file from wild-type mouse #20 as the control file (which TIDE requires). Here are the results:


| Pup # | Pre-TIDE manual interpretation | TIDE result |
|---|---|---|
| 2 | WT allele and at least 2 different mutant alleles present. Could not interpret mutations at all. | No significant results, but the sequence quality was rather poor to begin with. |
| 6 | WT allele and a 1-bp deletion allele. Germline transmission confirmed. | 66.5% WT, 24.8% 1-bp deletion. |
| 10 | No WT allele; one 3-bp deletion; plus a complex (discontiguous) 4-bp deletion. Germline transmission confirmed of both alleles at essentially Mendelian rates. | 10.9% WT, 44.9% 3-bp deletion, 33.7% 4-bp deletion. |
| 16 | WT allele and 2 different mutant alleles present. Could not interpret mutations. | 22.5% WT, 58.5% 1-bp deletion, 9.7% 5-bp insertion. |
| 22 | WT allele and a 1-bp deletion allele. Germline transmission confirmed. | 60% WT, 30.2% 1-bp deletion. |
| 24 | No WT allele, but multiple (>3) mutant alleles. | At least 4 different deletions of -2, -12, -28, -29 bp, each at low levels. |
| 19 | WT allele predominates. | 75% WT; 7.4% 2-bp insertion; 8% 8-bp deletion. |
| 20 | WT allele predominates. | (Used #19 as control) 82.8% WT, 10.7% 4-bp deletion. |
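
One quick sanity check on these numbers is to add up the fractions for each pup. The short Python sketch below does that arithmetic for the pups with numeric calls. It assumes the percentages listed above are the complete set TIDE reported for each trace; the dictionary layout and the function name `summarize` are just for illustration.

```python
# TIDE percentages copied from the results above: (WT %, [mutant allele %, ...]).
tide_calls = {
    6: (66.5, [24.8]),
    10: (10.9, [44.9, 33.7]),
    16: (22.5, [58.5, 9.7]),
    22: (60.0, [30.2]),
    19: (75.0, [7.4, 8.0]),
    20: (82.8, [10.7]),
}

def summarize(wt, mutants):
    """Return (total mutant %, % of signal not assigned to any listed allele)."""
    mutant_total = sum(mutants)
    unassigned = 100.0 - wt - mutant_total
    return round(mutant_total, 1), round(unassigned, 1)

summary = {pup: summarize(wt, mutants) for pup, (wt, mutants) in tide_calls.items()}

for pup, (mutant_total, unassigned) in summary.items():
    print(f"Pup #{pup}: {mutant_total}% mutant, {unassigned}% unassigned")
```

Under that assumption, between 6.5% and 10.5% of the signal in each trace is not assigned to any listed allele, which is one more reason to read the allele fractions as approximate.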

I was fairly impressed by the TIDE results. First, it agreed with my specific interpretations for #6, #10 and #22, which were confirmed by germline transmission. Second, it was able to correctly call 2 mutations at the same time in mouse #10. Third, it made interpretations that made sense for founders #16 and #24, which I had given up on.

Finally, I didn't really give the algorithm the optimal control sequence. Instead, I used the file from an apparently wild-type founder animal (#20). However, when the files from #19 and #20 were used as controls to analyze each other, low levels of mutant alleles were detected. And yes, if you go back to the chromatograms you can see a little underlying signal past the cleavage site that may be a bit more than "usual", but it's very easy to miss. This result is consistent with the experiment, since the embryos were injected with a PX330 plasmid, which may persist past the 1-cell stage and thus lead to low levels of mosaicism.

Given the imperfect controls I used, I would not take the TIDE quantitation of allele fractions literally. However, the qualitative results were pretty good, and I wasn't able to find anything manually that TIDE didn't. Also, this is a very fast analysis if you are sequencing the PCR products anyway. Moreover, it's easy to apply this analysis to PCRs on transfected pools of cells to measure CRISPR mutagenesis. I'm looking forward to trying TIDE in this context as well.

Easy quantitative assessment of genome editing by sequence trace decomposition. Eva K. Brinkman, Tao Chen, Mario Amendola and Bas van Steensel. Nucleic Acids Research, 2014, Vol. 42, No. 22, e168.
