June 1, 2018 (Vol. 38, No. 11)
Synthego Developed a Tool Called ICE to Be More Efficient Than Other Methods
CRISPR-based genome engineering revolutionized the gene-editing field by making experimental workflows considerably easier, faster, and more efficient than previous methods. Still, generating reliable results from CRISPR-editing data requires the help of robust software tools.
Synthego has developed a new tool called ICE (short for Inference of CRISPR Edits) for analyzing CRISPR experiments. ICE was initially created to support the CRISPR analysis needs of Synthego’s scientists, who found that no suitable software tools were available. Synthego built the ICE analysis tool on what it learned from analyzing the many CRISPR experiments it performed. It’s now free for everyone to use at ice.synthego.com.
ICE was rigorously evaluated by analyzing thousands of edits performed over multiple experiments and comparing the robustness, accuracy, and speed with existing tools. For example, researchers examined ICE analysis of Sanger sequencing data alongside the analysis of next-generation sequencing (NGS)–based amplicon sequencing data and found that the accuracy of ICE analysis results was highly comparable (with R² = 0.96 or better) to that of NGS data.
Advantages of ICE
NGS-Quality Editing Analysis with Low-Cost Sanger Sequencing. ICE uses Sanger sequencing data to produce quantitative, NGS-quality analysis of CRISPR editing to enable a ~100-fold reduction in cost relative to NGS-based amplicon sequencing.
To use the tool, you simply upload your Sanger sequencing files and indicate the guide RNA sequence(s) you used. ICE will calculate overall editing efficiency and determine the profiles of all the different types of edits that are present and their relative abundances.
ICE also provides a knockout score (KO-Score), which represents the proportion of cells that have either an indel that causes a frameshift or an indel of 21 bp or more. This score is a useful measure for those who want to understand how many of the indels are likely to result in a functional KO of the targeted gene.
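As a rough illustration of the KO-Score definition above, the sketch below sums the abundance of every sequence whose indel is either a frameshift (length not divisible by 3) or at least 21 bp long. The function name, the input format (a list of indel size and percentage pairs), and the example values are all hypothetical; this is not ICE's own code.

```python
def ko_score(contributions):
    """Sum the percentages of sequences likely to knock out the gene.

    `contributions` is a hypothetical list of (indel_size, percent) pairs:
    negative sizes are deletions, positive sizes are insertions, and 0 is
    the wild-type sequence.
    """
    total = 0.0
    for indel_size, percent in contributions:
        size = abs(indel_size)
        is_frameshift = size % 3 != 0   # length not a multiple of a codon
        is_large = size >= 21           # "21+ bp" read here as 21 bp or more
        if is_frameshift or is_large:
            total += percent
    return total


# Made-up example profile: wild type, -1, +1, -3, -21, and -6 bp indels.
example = [(0, 10.0), (-1, 40.0), (1, 20.0), (-3, 5.0), (-21, 15.0), (-6, 10.0)]
print(ko_score(example))  # 75.0
```

In this example the in-frame 3 bp and 6 bp deletions are excluded, while the in-frame 21 bp deletion is counted because of its size.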
Analyze Complex Edits. Traditional Sanger sequencing–based analysis tools are unable to detect or analyze complex edits, such as those generated by delivering multiple sgRNAs to the cells at once, a strategy referred to as “multiplexing.” Multiplexing is commonly used to create functional knockouts or large deletions, so the ability to analyze and interpret data about these types of edits is critical to the CRISPR workflow.
We addressed this limitation in our ICE algorithm by enabling the analysis of edits resulting from multiplexed single-guide RNAs. ICE also includes visual representations of all detected edit types in the sample. The visualization helps to illustrate which of the multiplexed sgRNAs was involved in the particular edit and how it was involved.
How ICE Works
Following delivery of CRISPR components into target cells, genomic DNA from both edited and unedited (control) populations is subjected to PCR amplification and Sanger sequencing. ICE compares these sequence traces to give a detailed analysis of CRISPR editing. ICE software expresses as a percentage the number of genomes that have been successfully modified with insertions or deletions (indels) and then characterizes the sequence and abundance of each particular indel.
To use the ICE software tool, upload your Sanger sequencing files and provide basic information such as your sample names and guide sequences. ICE will do the rest. There are no parameters that need optimizing and no complicated steps to learn. For increased flexibility and scalability, the ICE software has two analysis formats: “sample by sample” analysis, which can compare up to five editing experiments at a time, and “batch” analysis, which compares hundreds of samples simultaneously.
Overview of ICE Editing Analysis
Once the analysis is complete, a new window appears with a graphical representation of the results and a list of the analyzed samples (Figure 1).
If the sample run had no issues, the analysis window shows a green checked circle in front of the sample name. If there was a minor error during processing, the window shows a yellow checked circle. If there are no results or there was a processing error, you will see a red exclamation point in front of that sample. You can hover over the yellow or red circles to gather details on the issues associated with each sample.
Successfully analyzed samples will display the following parameters:
• Sample—The unique label name for each sample.
• Guide Target—The 17–23-nucleotide sequence of the DNA-targeting region of the guide RNA, excluding the PAM sequence.
• PAM Sequence—The Protospacer Adjacent Motif (PAM) sequence for the nuclease used. Currently, ICE is configured for the Cas9 nuclease from Streptococcus pyogenes (SpCas9).
• ICE—The editing efficiency (percentage of the pool with non-wild-type sequence) as determined by comparing the edited trace to the control trace. In the ICE algorithm, potential editing outcomes are proposed and fitted to the observed data using linear regression.
• R²—When the linear regression is computed to generate the ICE score, the squared Pearson correlation coefficient (R²) is also computed and reported. The higher the R² value, the more confident you can be in the ICE score.
• KO-Score—Represents the proportion of cells that have either a frameshift indel or an indel of 21 bp or more, both of which are likely to result in a functional KO of the targeted gene.
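To make the ICE and R² entries above more concrete, here is a minimal sketch of how the two numbers could be derived once a regression has assigned a weight to each proposed editing outcome. It does not perform the fit itself, and everything in it is an assumption for illustration: the function names, the representation of each proposal as a list of expected signal values, and the reading of R² as the squared Pearson correlation between observed and fitted signal.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def summarize_fit(observed, proposals, weights, wild_type_index=0):
    """Return (ice_score, r_squared) for an already-fitted set of weights.

    observed  -- observed signal values from the edited trace (hypothetical)
    proposals -- one expected-signal list per proposed editing outcome
    weights   -- fitted weight of each proposal (assumed non-negative)
    """
    # Fitted signal is the weighted sum of the proposed outcomes.
    fitted = [
        sum(w * p[i] for w, p in zip(weights, proposals))
        for i in range(len(observed))
    ]
    r_squared = pearson_r(observed, fitted) ** 2
    # Editing efficiency: share of total weight not on the wild type.
    total = sum(weights)
    ice_score = 100.0 * (total - weights[wild_type_index]) / total
    return ice_score, r_squared


# Made-up data: one wild-type proposal and one edited proposal.
proposals = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
weights = [0.25, 0.75]
observed = [0.25, 0.75, 0.75, 0.25]
ice_score, r_squared = summarize_fit(observed, proposals, weights)
print(ice_score, r_squared)
```

Here the observed signal is reproduced exactly by the weighted proposals, so R² is essentially 1 and the ICE score is 75%. With real traces the fit is imperfect, and a lower R² signals that the proposed outcomes explain less of the observed data.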
You can perform more in-depth analyses on each sample by clicking on the sample name or on its corresponding bar graph entry. Doing so opens a new window with three tabs. Each of the three tabs (Contributions, Indel Distribution, and Traces) provides particular details about the indel profile of the edited sample.
ICE Analysis Details
The “Contributions” tab lists the sequences present in your edited population and their relative representation. The black vertical dotted line represents the cut site, and the “+” symbol on the far left marks the wild-type sequence. If you are viewing a multiplex sample, the cut site will be aligned to the most upstream cut site (Figure 2).
In the “Indel Distribution” tab, you’ll find an indel plot which displays the distribution of indel sizes in the entire edited population. Hovering over each bar of the Indel plot shows the size of the insertion or deletion (±1 or more nucleotides), along with the percentage of genomes that contain it.
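The indel plot described above amounts to grouping the individual sequence contributions by indel size. The sketch below shows one way to do that grouping; the function name, the input format, and the example values are hypothetical and are not taken from the ICE code.

```python
from collections import defaultdict


def indel_distribution(contributions):
    """Group sequence abundances by indel size.

    `contributions` is a hypothetical list of (indel_size, percent) pairs,
    where several distinct sequences may share the same indel size.
    Returns a dict mapping indel size to summed percentage, sorted by size.
    """
    totals = defaultdict(float)
    for indel_size, percent in contributions:
        totals[indel_size] += percent
    return dict(sorted(totals.items()))


# Two different 1 bp deletions, the wild type, and a 2 bp insertion.
example = [(-1, 30.0), (-1, 10.0), (0, 25.0), (2, 35.0)]
print(indel_distribution(example))  # {-1: 40.0, 0: 25.0, 2: 35.0}
```

Each key corresponds to one bar of the plot, and each value to the percentage shown when hovering over that bar.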
The discordance plot shows the level of disagreement between the non-edited wild-type (control) and the edited sample in the inference window (the region around the cut site). It shows, base-by-base, the average amount of signal that disagrees with the reference sequence derived from the control trace file.
On the plot, the green (edited sample) and orange (control sample) lines should be close together before the cut site. A typical CRISPR edit produces a jump in discordance near the cut site that persists downstream of it, reflecting a high level of sequence discordance (Figure 3).
Figure 3. The ICE tool’s “Indel Distribution” tab directs users to an indel plot that displays the inferred distribution of indel sizes in the entire edited population of genomes. Hovering over each bar of the indel plot shows the size of the insertion or deletion (±1 or more nucleotides), along with the percentage of genomes that contain it. The discordance plot shows the level of disagreement between the non-edited wild type (control) and the edited sample in the inference window (the region around the cut site).
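One simple way to picture the discordance calculation is as the fraction of signal at each base position that does not belong to the reference base. The sketch below assumes each trace position is available as four channel intensities (A, C, G, T) and that the reference is the base-call string from the control trace. Both the data layout and the function name are assumptions for illustration, and any smoothing or averaging ICE applies is not reproduced here.

```python
def discordance(trace, reference):
    """Per-base fraction of signal that disagrees with the reference.

    trace     -- list of dicts such as {"A": 90, "C": 5, "G": 3, "T": 2},
                 one per base position (hypothetical layout)
    reference -- string of reference base calls, same length as `trace`
    """
    values = []
    for signals, ref_base in zip(trace, reference):
        total = sum(signals.values())
        if total == 0:
            # No signal at this position; treat it as no disagreement.
            values.append(0.0)
            continue
        values.append(1.0 - signals[ref_base] / total)
    return values


# Made-up two-base example: a clean A, then a mixed A/C peak where C is expected.
trace = [
    {"A": 100, "C": 0, "G": 0, "T": 0},
    {"A": 50, "C": 50, "G": 0, "T": 0},
]
print(discordance(trace, "AC"))  # [0.0, 0.5]
```

Applied to both the control and the edited trace, values like these would stay low for the control throughout and rise for the edited sample from the cut site onward.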
The “Traces” tab shows the edited and control Sanger traces in the region around the guide binding site(s). The sequence base calls from the .ab1 file are also shown above each trace. The horizontal black underlined region represents the guide sequence, and the horizontal red underline is the PAM site. The vertical black dotted line represents the cut site. Cutting and error-prone repairs typically result in mixed sequencing bases downstream of the cut (Figure 4).
Figure 4. The ICE tool’s “Traces” tab shows the edited and control, nonedited Sanger traces in the region around the guide binding site(s). The sequence base calls from the .ab1 file are also shown above each trace. The horizontal black underlined region represents the guide sequence, and the horizontal red underline is the PAM site. The vertical black dotted line represents the cut site. Cutting and error-prone repair typically result in mixed sequencing bases downstream of the cut.
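The guide, PAM, and cut site markers in the traces view can be located from the sequence alone. The sketch below assumes SpCas9 with an NGG PAM immediately 3′ of the guide and a cut 3 bp upstream of the PAM. It searches only the forward strand and only the first match, and the function name and example sequences are hypothetical.

```python
def find_cut_site(amplicon, guide):
    """Return the 0-based index of the first base after the expected cut.

    Returns None if the guide is not found on the forward strand or is not
    followed by an NGG PAM. Reverse-strand guides are not handled.
    """
    amplicon = amplicon.upper()
    guide = guide.upper()
    start = amplicon.find(guide)
    if start == -1:
        return None
    pam_start = start + len(guide)
    pam = amplicon[pam_start:pam_start + 3]
    if len(pam) < 3 or pam[1:] != "GG":
        return None
    # SpCas9 typically cuts 3 bp upstream of the PAM.
    return pam_start - 3


# Made-up 20-nt guide embedded in a short amplicon, followed by a TGG PAM.
guide = "ACGTACGTACGTACGTACGT"
amplicon = "TTTT" + guide + "TGG" + "AAAA"
print(find_cut_site(amplicon, guide))  # 21
```

In this example the guide occupies positions 4 to 23, the PAM starts at position 24, and the cut falls between positions 20 and 21.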
In summary, ICE generates NGS-quality CRISPR editing analysis from Sanger sequencing data. Furthermore, ICE can analyze more types of editing experiments than other Sanger sequencing–based software tools and is faster and easier to use. ICE is also completely free for everyone to use, and we have made the algorithm code open source and free for all nonprofit uses.
Jessica Roginsky is scientific support lead at Synthego.