Understanding and quantifying the genetic heterogeneity of a cell population is essential to the study of biological systems. Although new technologies are being developed to do this, particularly in the field of early cancer detection, most current sequencing techniques lack the sensitivity to detect rare mutations in a pool of cells. Researchers have now developed an approach, targeted individual DNA molecule sequencing (IDMseq), that can accurately detect a single mutation in a pool of 10,000 cells.

The method labels individual DNA molecules for single-base-resolution, haplotype-resolved quantitative characterization of diverse types of rare variants, using either short- or long-read sequencing platforms. Importantly, the team successfully used IDMseq to determine the number and frequency of mutations caused by the gene editing tool CRISPR/Cas9 in human embryonic stem cells. The work provides the first quantitative evidence of persistent nonrandom large structural variants and an increase in single-nucleotide variants at the on-target locus following repair of double-strand breaks induced by CRISPR-Cas9 in human embryonic stem cells.

The work is published in Genome Biology in the paper “Long-read individual-molecule sequencing reveals CRISPR-induced genetic heterogeneity in human ESCs.”

Assistant professor Mo Li, PhD, works on sequencing library preparation.
[© 2020 KAUST Jinna Xu]

Clinical trials are underway to test the safety of CRISPR as a treatment for some genetic diseases. A sequencing approach that can home in on a rare mutation within a large number of cells would have implications for the safety of CRISPR genome editing.

“Our study revealed potential risks associated with CRISPR/Cas9 editing and provides tools to better study genome editing outcomes,” said Mo Li, PhD, assistant professor of bioscience at King Abdullah University of Science and Technology (KAUST) in the Kingdom of Saudi Arabia, who led the study.

IDMseq is a sequencing technique that involves attaching a unique barcode to every DNA molecule in a sample of cells and then making a large number of copies of each molecule using PCR. Copied molecules carry the same barcode as the original ones. The team compared the method using both Illumina and Oxford Nanopore Technologies (ONT) platforms.
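The labeling-then-copying idea can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the published protocol: the function names `label_molecules` and `amplify`, the 12-base barcode length, and the error-free PCR model are all hypothetical choices made for illustration.

```python
import random

def label_molecules(molecules, umi_length=12, seed=0):
    """Attach a distinct random barcode to each original molecule.

    Assumption: barcodes are random ACGT strings; the length is arbitrary.
    """
    rng = random.Random(seed)
    used = set()
    labeled = []
    for seq in molecules:
        umi = "".join(rng.choice("ACGT") for _ in range(umi_length))
        while umi in used:  # redraw on a rare collision so barcodes stay unique
            umi = "".join(rng.choice("ACGT") for _ in range(umi_length))
        used.add(umi)
        labeled.append((umi, seq))
    return labeled

def amplify(labeled, copies=4):
    """Idealized PCR: every copy keeps the barcode of its parent molecule."""
    return [(umi, seq) for umi, seq in labeled for _ in range(copies)]

# Three original molecules; the third differs from the others at one base.
molecules = ["ACGTACGT", "ACGTACGT", "ACGAACGT"]
labeled = label_molecules(molecules)
reads = amplify(labeled, copies=4)

print(len(reads), "reads from", len({umi for umi, _ in reads}), "original molecules")
```

Because the two identical molecules receive different barcodes, they remain countable as two separate originals after amplification, which is what makes the later frequency estimate possible.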

A bioinformatics tool kit, called variant analysis with unique molecular identifier for long-read technology (VAULT), then decodes the barcodes and places similar molecules into their own “bins,” with every bin representing one of the original DNA molecules. VAULT uses a combination of algorithms to detect mutations in the bins. The process works especially well with third-generation long-read sequencing technologies and helps scientists detect and determine the frequency of all types of mutations, from changes in single DNA letters to large deletions and insertions in the original DNA molecules.
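A minimal sketch of the binning step follows. It is not VAULT's implementation: the article says only that VAULT combines several algorithms, so the per-position majority vote used here is a stand-in, and the helper names (`bin_reads`, `consensus`, `call_variants`) are hypothetical. The sketch also assumes reads in a bin are already aligned and of equal length, which real long reads are not.

```python
from collections import Counter, defaultdict

def bin_reads(reads):
    """Group reads by barcode; each bin stands for one original molecule."""
    bins = defaultdict(list)
    for umi, seq in reads:
        bins[umi].append(seq)
    return dict(bins)

def consensus(seqs):
    """Per-position majority vote (assumes equal-length, pre-aligned reads)."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))

def call_variants(bins, reference):
    """Report (position, reference base, observed base) for each bin's consensus."""
    calls = {}
    for umi, seqs in bins.items():
        cons = consensus(seqs)
        calls[umi] = [
            (i, ref, alt)
            for i, (ref, alt) in enumerate(zip(reference, cons))
            if ref != alt
        ]
    return calls

reference = "ACGT"
reads = [
    ("AAAA", "ACGT"), ("AAAA", "ACGT"), ("AAAA", "ACTT"),  # one read has a sequencing error
    ("CCCC", "AGGT"), ("CCCC", "AGGT"), ("CCCC", "AGGT"),  # a true variant in the molecule
]
calls = call_variants(bin_reads(reads), reference)
print(calls)
```

The isolated error in the first bin is outvoted by the other copies of the same molecule, while the change shared by every read in the second bin survives as a variant call.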

The approach successfully detected a deliberately introduced gene mutation in cells that were mixed with wild-type cells at ratios of 1:100, 1:1,000, and 1:10,000. IDMseq also correctly reported the mutation’s frequency in each mixture.
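The arithmetic behind these dilutions can be sketched as follows. The functions are hypothetical illustrations, not part of the published tool kit. They assume that frequency is estimated as the fraction of barcode bins carrying the variant, that molecules are sampled independently, and that a 1:10,000 mixture corresponds to a variant frequency of roughly 1 in 10,000.

```python
def variant_frequency(n_variant_bins, n_total_bins):
    """Fraction of original molecules (bins) that carry the variant."""
    return n_variant_bins / n_total_bins

def prob_detect(freq, n_molecules):
    """Chance that at least one of n independently sampled molecules carries the variant."""
    return 1.0 - (1.0 - freq) ** n_molecules

for ratio in (100, 1_000, 10_000):
    freq = variant_frequency(1, ratio)
    print(f"1:{ratio:,}  frequency={freq:.4%}  "
          f"P(seen in {ratio:,} molecules)={prob_detect(freq, ratio):.2f}  "
          f"P(seen in {3 * ratio:,} molecules)={prob_detect(freq, 3 * ratio):.2f}")
```

Under these assumptions, sampling only as many molecules as the dilution factor gives roughly a 63% chance of capturing the variant at all, so rarer variants require proportionally more barcoded molecules to be sequenced.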

The sequencing setup for the study: an Oxford Nanopore sequencer and a laptop computer. The screen in the background shows the DNA strand fed through the sequencer. [© 2020 KAUST Mo Li.]

The researchers also used IDMseq to look for mutations caused by CRISPR/Cas9 genome editing. “Several recent studies have reported that Cas9 introduces unexpected, large DNA deletions around the edited genes, leading to safety concerns. These deletions are difficult to detect and quantitate using current DNA sequencing strategies. But our approach, in combination with various sequencing platforms, can analyze these large DNA mutations with high accuracy and sensitivity,” said Chongwei Bi, a graduate student and author on the paper.

The authors noted that IDMseq provides an “unbiased single-base-resolution characterization of on-target mutagenesis induced by CRISPR-Cas9,” which could facilitate the experimental design and safe use of the CRISPR technology in the clinic. IDMseq can currently sequence only one DNA strand, but work to enable double-strand sequencing could further improve performance, the researchers said.

The tests found that large deletions accounted for 2.8–5.4% of Cas9 editing outcomes. They also discovered a three-fold rise in single-base DNA variants in the edited region. “This shows that there is a lot that we need to learn about CRISPR/Cas9 before it can be safely used in the clinic,” said Yanyi Huang, ScD, associate director of Peking University.
