Genome sequencers have been tweaking polymerase chain reaction (PCR) amplification to avoid introducing artifacts into sequencing libraries, ranging from modifications in chemistries to the introduction of novel sequencing technologies that could obviate the need for PCR altogether.
PCR-related problems have included uneven amplification, causing overrepresentation of some sequence species and nucleotide misincorporation. In particular, sequencing genomes or genomic regions with extremely biased base composition remains a challenge to the currently available next-generation sequencing (NGS) platforms. These include, for example, the genomes of important pathogenic organisms like Plasmodium falciparum with a high adenine–thymine (AT) content and Mycobacterium tuberculosis with high guanine–cytosine (GC) content.
These genomes have proven difficult for sequencers because the standard library preparation procedures that employ PCR amplification cause uneven read coverage across these regions, leading to problems in genome assembly and variation analyses.
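To make "biased base composition" concrete, the following is a minimal sketch of how G+C content might be computed for a whole sequence and in fixed-size windows along it. The function names `gc_fraction` and `windowed_gc`, and the default window size, are illustrative assumptions and are not taken from any of the studies discussed here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous bases such as N are ignored; an empty sequence yields 0.0.
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int = 100, step: int = 100):
    """Return (start, gc_fraction) pairs for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "GGCCGGCC" + "AATTAATT"  # a G+C-rich stretch followed by an A+T-rich one
    print(gc_fraction(toy))
    print(windowed_gc(toy, window=8, step=8))
```

Plotting read depth against windowed values like these is one common way to visualize the uneven coverage described above.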
Mapping and Assembly of GC-Biased Genomes
In 2009, Iwanka Kozarewa, Ph.D., and colleagues (“Amplification-Free Illumina Sequencing Library Preparation Facilitates Improved Mapping and Assembly of (G+C)-Biased Sequences,” Nature Methods 2009;6:291–295, doi:10.1038/nmeth.1311) described an amplification-free Illumina sequencing library preparation that they said allowed improved mapping and assembly of GC-biased genomes. Their method, in which the cluster amplification step rather than the PCR enriches for fully ligated template strands, reduced the incidence of duplicate sequences, improved read mapping and single-nucleotide polymorphism calling, and aided de novo assembly.
The scientists provided a proof of principle by generating and analyzing DNA sequences from extremely (G+C)-poor (Plasmodium falciparum), (G+C)-neutral (Escherichia coli), and (G+C)-rich (Bordetella pertussis) genomes. At the heart of this method are no-PCR adapters that contain additional sequences, allowing hybridization of templates directly to a flow-cell surface. Incompletely ligated fragments are inert in the cluster amplification step. Thus, they reported, it is not necessary to retain the PCR step to enrich for properly ligated fragments. They noted, however, that obtaining an optimal cluster density required accurately quantifying only those template fragments with an adapter at both ends, which they achieved by quantitative PCR using primers that target the adapter regions.
By ligating adapters that consist of all sections required for sequencing primer annealing and attachment to the flow-cell surface, the investigators say they can avoid the requirement for a PCR step. The authors say that the quantity of template DNA generated in this way is lower than when PCR is employed, but library quantification by quantitative PCR (qPCR) showed that 5 μg of starting DNA yields enough 200-bp no-PCR library for more than 400 high-density Genome Analyzer (GA) lanes, more than enough for most sequencing purposes.
Because of the absence of the PCR step, the authors said, the method “is quicker to perform” than the standard Illumina library prep, and they felt “it should be employed routinely in the preparation of libraries for Illumina sequencing.”
Illumina’s Rooz Golshani, Ph.D., product manager, DNA Library Prep, told GEN that the company’s PCR-free product, the TruSeq® DNA PCR-Free Library Preparation Kit, is “our gold standard and is the optimal solution for generating the most complete genome for complex large genomes, such as human whole genomes.”
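The "incidence of duplicate sequences" that PCR-free preparation is reported to reduce can be illustrated with a deliberately simplified sketch. Here a duplicate is assumed to be any read whose sequence exactly matches an earlier read. That is an assumption made for illustration only: production pipelines typically identify duplicates from mapped read coordinates rather than raw sequence identity. The function name `duplication_rate` is hypothetical.

```python
from collections import Counter


def duplication_rate(reads) -> float:
    """Fraction of reads that are exact copies of an earlier read.

    Simplifying assumption: duplicates are detected by identical sequence,
    not by mapping position as real duplicate-marking tools do.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every distinct sequence counts once as an original; the rest are duplicates.
    duplicates = total - len(counts)
    return duplicates / total


if __name__ == "__main__":
    toy_reads = ["ACGT", "ACGT", "TTTT", "GGGG"]  # one duplicate among four reads
    print(duplication_rate(toy_reads))
```

Comparing this fraction between a PCR-amplified and a PCR-free library made from the same sample is one simple way to see the effect the authors describe.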
Dr. Golshani noted that investigators most frequently use the kit in preparing genomic DNA from complex eukaryotic genomes in which a user needs to remove bias associated with PCR. “Specifically, it provides greater insight into coding, regulatory, and intronic regions. The library prep process generates blunt end fragments from mechanically sheared DNA which are then size selected and biochemically prepared for a subsequent ligation process with Illumina adaptor indexes without the need for any PCR amplification.”
PCR-Free–Based Approaches in Whole-Genome Sequencing
In a 2015 article (“Library Preparation Methodology Can Influence Genomic And Functional Predictions In Human Microbiome Research,” Proc Natl Acad Sci USA 2015;10;112:14024-9, doi: 10.1073/pnas.1519288112), Marcus B. Jones, Ph.D., at the J. Craig Venter Institute and Human Longevity, and colleagues noted that as whole-genome sequencing (WGS) was widely adopted into research on the human microbiome (the microbes living in the human body), study results often prove “conflicting or inconclusive.”
And, they said, while new library chemistries and approaches provide novel low-cycle PCR and PCR-free tools, these tools may introduce unanticipated artifacts in the data.
Focusing on library preparation, these investigators performed quantitative and qualitative analyses comparing WGS metagenomic data, or direct genetic analysis of genomes contained within an environmental sample, from human stool specimens using the Illumina Nextera XT and Illumina TruSeq DNA PCR-Free kits, and the KAPA Biosystems Hyper Prep PCR and PCR-Free systems. They observed significant differences in taxonomy among the four different NGS library preparations using a DNA mock community and a cell control of known concentration. They also observed biases in error profiles, duplication rates, and loss of reads representing organisms that have a high %G+C content that can significantly impact results.
On the basis of their analyses, they proposed that scientists working in the microbiome community consider adoption of PCR-free–based approaches (such as Kapa Hyper Prep PCR-Free and TruSeq DNA PCR-Free) to reduce PCR bias in calculations of abundance and to improve assemblies for accurate taxonomic assignment. Their study results, they said, highlight “a critical need for consistency in protocols and data analysis procedures, especially when attempting to interpret human microbiome data for human health.”
All PCR-free techniques were developed on Illumina sequencing machines. The company clearly controls the DNA sequencing market, as its machines generate more than 90% of all DNA sequence data.
Nanopore Technology-Based Sequencing
But “disruptive” sequencing based on nanopore technology has the potential to change completely the way DNA sequencing is done and could change Illumina’s dominant market position. And other big companies, like Roche, which failed in its 2012 attempt to acquire Illumina, are betting that nanopores could be the technology to break the company’s monopoly.
Developed by Oxford Nanopore Technologies, the cell-phone sized MinION sequencing device may eventually obviate the need for PCR altogether. The MinION reads out long DNA stretches, although currently the samples still require library preparation prior to sequencing, a process that has yet to be optimized. This approach has applications in scaffolding genome sequences assembled from short reads and resolving repeat sequences or haplotypes because it is able to span ambiguous regions in a single read. Future developments may include use in real-time medical diagnostics and forensics, as well as prospective applications as an environmental DNA sensor.
But U.K. investigators (“Assessing the performance of the Oxford Nanopore Technologies MinION,” Biomol Detect Quantif 2015;3: 1–8, doi: 10.1016/j.bdq.2015.02.001) say that the MinION’s performance and limitations with regard to throughput and accuracy require evaluation prior to its wide adoption. Tom Laver, Ph.D., and colleagues at the University of Exeter and the Wellcome Trust Biomedical Informatics Hub reported they had assessed the MinION’s performance by resequencing three bacterial genomes with very different nucleotide compositions, ranging from 28.6% to 70.7% G+C content; the high-G+C strain was underrepresented in the sequencing reads.
The investigators estimated the MinION’s error rate after base calling to be 38.2%. Mean and median read lengths were 2 kb and 1 kb, respectively, whereas the longest single read was 98 kb. The whole length of a 5 kb rRNA operon was covered by a single read.
The authors concluded that the current error rate limits the ability of the MinION to compete with existing sequencing technologies. They did show, however, that MinION sequence reads can enhance contiguity of de novo assembly when used in conjunction with Illumina MiSeq data.
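As a rough illustration of what a per-read error rate means, the sketch below counts mismatches, insertions, and deletions in a gapped pairwise alignment and divides by the number of alignment columns. This definition and the function name `alignment_error_rate` are assumptions chosen for illustration; the published studies may define and measure error differently.

```python
def alignment_error_rate(aligned_ref: str, aligned_read: str) -> float:
    """Errors per alignment column for two gapped, equal-length strings.

    A column is an error if the reference and read characters differ,
    which covers mismatches, insertions ('-' in the reference) and
    deletions ('-' in the read). Columns gapped in both strings are skipped.
    """
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    errors = 0
    for ref_base, read_base in zip(aligned_ref.upper(), aligned_read.upper()):
        if ref_base == "-" and read_base == "-":
            continue
        columns += 1
        if ref_base != read_base:
            errors += 1
    return errors / columns if columns else 0.0


if __name__ == "__main__":
    # Toy alignment: one deletion and one insertion across six columns.
    print(alignment_error_rate("ACGT-A", "AC-TTA"))
```

Under this definition, a read that differs from the reference in roughly one of every three aligned columns would have an error rate in the range discussed above.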
Emerging publications by pilot users have shown that a MinION can do quite a lot on its own. It can reliably sequence small genomes, such as those of bacteria and yeast. It can discriminate between closely related bacteria and viruses, read complex portions of the human genome, and differentiate between the two versions of a gene that are carried on each chromosome pair.
On February 23, Illumina announced it had filed lawsuits against Oxford Nanopore Technologies Ltd. and Oxford Nanopore Technologies, Inc., claiming that it owns exclusive licenses to the enabling technology in Oxford’s MinION and PromethION devices. The lawsuits are based on U.S. Patent Nos. 8,673,550 and 9,170,230, which are entitled “MSP NANOPORES AND RELATED METHODS.” Illumina says it had exclusively licensed the patents in the field of nucleic acid sequencing from the UAB Research Foundation and the University of Washington.
But as knowledgeable sequencers have noted, “Illumina sequencing benefits from a very strong data analysis ecosystem, with megaprojects such as 1000 Genomes, that developed full bioinformatics suites tailored to the specification of this technology. To pick a fight with Illumina’s home field advantage of medical sequencing is going to be extremely hard.”
But despite the cell-phone sized instrument’s current shortcomings, infectious disease scientists have already demonstrated some attention-getting results using the MinION. In the December 15 issue of Genome Medicine (“Rapid metagenomic identification of viral pathogens in clinical samples by real-time nanopore sequencing analysis,” Genome Med 2015;7:99, doi: 10.1186/s13073-015-0220-9), investigators at the Department of Laboratory Medicine at the University of California, San Francisco, and colleagues reported unbiased metagenomic detection of chikungunya virus (CHIKV), Ebola virus (EBOV), and hepatitis C virus (HCV) from four human blood samples by MinION nanopore sequencing coupled to a newly developed, web-based pipeline for real-time bioinformatics analysis on a computational server or laptop (MetaPORE).
The authors said that at titers ranging from 10⁷ to 10⁸ copies per milliliter, reads to EBOV from two patients with acute hemorrhagic fever and CHIKV from an asymptomatic blood donor were detected within 4–10 minutes of data acquisition, whereas the lower-titer HCV (1×10⁵ copies per milliliter) was detected within 40 minutes. Analysis of mapped nanopore reads alone, despite an average individual error rate of 24% (range 8–49%), permitted identification of the correct viral strain in all four isolates, and 90% of the genome of CHIKV was recovered with 97–99% accuracy.
Illumina may have to pick another fight with Roche to defend its turf. Roche spent $125 million to acquire Genia Technologies in June of 2014, a company developing a single-molecule, semiconductor-based DNA sequencing platform using its own nanopore technology. Developed in collaboration with Columbia University and Harvard University, the method uses a DNA replication enzyme to sequence a template strand with single-base precision as base-specific engineered tags cleaved by the enzyme are captured by the nanopore. When the cleaved tags travel through the pore, they attenuate the current flow across the membrane in a sequence-dependent manner.
And infectious disease scientists would really like to have a hand-held sequencer that works reliably. Charles Chiu, M.D., Ph.D., associate professor of laboratory medicine, division of infectious diseases at the University of California, San Francisco, and a co-author of the 2015 Genome Medicine paper, says, “To our knowledge, this is the first time that nanopore sequencing has been used for real-time metagenomic detection of pathogens in complex clinical samples in the setting of human infections. Unbiased point-of-care testing for pathogens by rapid metagenomic sequencing has the potential to radically transform infectious disease diagnosis in both clinical and public health settings.”