The Genomic Genius of SMRT Sequencing

Pacific Biosciences' Single-Molecule Real-Time Technology Enables Long Reads and High Throughput

Single-molecule real-time (SMRT®) sequencing allows researchers to explore regions of the genome that would be difficult or impossible to access through any other sequencing method. This bold assertion is made by SMRT sequencing’s developer, Pacific Biosciences (PacBio), which adds that SMRT technology can be used to generate data for full-length messenger RNA (mRNA) transcripts in transcriptome-wide or targeted studies. PacBio emphasizes that its technology is unique in providing long reads. Because long-read sequences are easier to map and assemble than short-read sequences, they can catch structural variations that elude short reads.

SMRT sequencing harnesses the natural process of DNA replication, explains Jonas Korlach, Ph.D., PacBio’s CSO: “It is based on following DNA synthesis by the enzyme DNA polymerase in real time (currently, the sequencing speed is one to three bases per second), using nanostructure arrays and specially labeled nucleotides.”

PacBio’s sequencing method achieves very long reads that average more than 10,000 base pairs. That’s one to two orders of magnitude longer than the reads produced by traditional sequencing methods, and some SMRT reads have been as long as 80,000 base pairs. SMRT sequencing also eliminates DNA sequence context bias, allowing even difficult regions to be sequenced with high consensus accuracy (greater than 99.999%).
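To put the quoted figures side by side, the short Python sketch below estimates how long a polymerase would need to synthesize a read of a given length. It assumes the polymerase runs at a constant speed within the one-to-three-bases-per-second range quoted above, which is a simplification. The helper name synthesis_hours is illustrative and not part of any PacBio software.

```python
# Figures quoted in the article; the helper name is illustrative only.
SPEED_RANGE_BASES_PER_SEC = (1, 3)   # "one to three bases per second"
AVERAGE_READ_BASES = 10_000          # average read length
LONGEST_READ_BASES = 80_000          # longest reads mentioned


def synthesis_hours(read_length_bases, bases_per_second):
    """Hours of continuous synthesis needed to produce one read.

    Assumes a constant polymerase speed, which is a simplification.
    """
    return read_length_bases / bases_per_second / 3600


slow, fast = SPEED_RANGE_BASES_PER_SEC
for label, length in [("average", AVERAGE_READ_BASES),
                      ("longest", LONGEST_READ_BASES)]:
    quickest = synthesis_hours(length, fast)
    slowest = synthesis_hours(length, slow)
    print(f"{label} read: {quickest:.1f} to {slowest:.1f} hours")
```

Under these assumptions, an average 10,000-base read corresponds to roughly 0.9 to 2.8 hours of synthesis, and an 80,000-base read to roughly 7.4 to 22.2 hours.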

SMRT Sequencing Applications

With such long reads, researchers are relying on SMRT technology to access highly repetitive elements that cannot be sequenced with short-read technologies, Dr. Korlach says. These include low-complexity sequences or regions with extreme sequence context bias such as those rich in GC or AT content; tri-, tetra-, and pentanucleotide repeats; short tandem repeats (STRs); and variable number tandem repeats (VNTRs)—as well as microsatellites, telomeres, centromeric regions, and more.

“Many of these are important to understand biology and the underlying causes for many diseases,” notes Dr. Korlach. “Now they can be studied efficiently, often for the first time.”

According to Dr. Korlach, SMRT sequencing has led to “a new gold standard” in many research areas, including microbial genome assembly, microbial epigenetics, plant and animal genome assemblies, high-quality human reference genome assemblies, human leukocyte antigen (HLA) typing, metagenomic analysis (for full-length 16S sequencing), mRNA transcript isoform sequencing, and multilocus sequence typing. Transcriptomics analysis also may be conducted using Pacific Biosciences’ Iso-Seq™ protocol for RNA analysis.

This technology automatically detects DNA modifications by measuring the rate of DNA base incorporation during sequencing, and special software uses those measurements to characterize epigenetic methylation markers. Consequently, scientists can use the technology to find locations of adenine and cytosine methylation in a DNA strand-specific and hypothesis-free manner, and to identify methyltransferase recognition motifs.
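The idea of reading modifications from incorporation rate can be illustrated with a toy Python sketch. It assumes that a modified base slows the polymerase, so the interval before incorporation at that position is longer than at the same position in an unmodified control. The function name, the ratio threshold, and the interval values are all hypothetical. This is not PacBio's software or its statistical model.

```python
def flag_slow_positions(observed_intervals, control_intervals, min_ratio=2.0):
    """Return 0-based positions where incorporation was markedly slower.

    observed_intervals: seconds between incorporations in the sample.
    control_intervals:  the same measurement for an unmodified control.
    min_ratio:          hypothetical cutoff for calling a position slow.
    """
    if len(observed_intervals) != len(control_intervals):
        raise ValueError("interval lists must have the same length")

    flagged = []
    pairs = zip(observed_intervals, control_intervals)
    for position, (observed, control) in enumerate(pairs):
        # Skip positions without a usable control measurement.
        if control > 0 and observed / control >= min_ratio:
            flagged.append(position)
    return flagged


# Made-up intervals in seconds, for illustration only.
observed = [0.4, 0.5, 2.1, 0.4, 0.6]
control = [0.4, 0.5, 0.5, 0.5, 0.5]
print(flag_slow_positions(observed, control))  # -> [2]
```

Only the third position, where the observed interval is about four times the control, is flagged. A real analysis would pool many molecules and apply a statistical test rather than a fixed cutoff.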

Figure: Pacific Biosciences’ SMRT sequencing follows the activity of individual polymerase molecules in real time, using fluorescently tagged nucleotides (molecular structure). Sequencing occurs in arrays called SMRT cells (inset, right). The information from single polymerase molecules is extracted in temporal order to generate the individual long sequence reads, with average read lengths of greater than 10,000 bases (main image).

The Benefit of Long Reads

“The ability to generate long reads is a key and uniquely enabling aspect of our technology,” states Dr. Korlach. For example, the 10,000 base pair average read length “enables the creation of better reference genomes,” which are vital in addressing today’s environmental, agricultural, and evolutionary challenges.

“Short reads don’t allow resolution of important regions of the genome that contain repetitive or heterozygous sequence,” he continues. “Long reads are easier to map and to assemble than short-read sequences, and can reveal structural variations that might be missed by short reads. This greatly simplifies de novo assembly. Additionally, the data can be more easily ‘phased’ (assembled in a way in which the maternal and paternal chromosomes are distinguishable).”
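A small Python sketch can make the point about repetitive sequence concrete. It treats a read as resolving a repeat only when it covers the entire repeat plus some unique sequence on both sides. The function name, the 100-base flank requirement, and the coordinates are hypothetical choices for illustration, not parameters of any real assembler.

```python
def spans_repeat(read_start, read_end, repeat_start, repeat_end, min_flank=100):
    """True if the read covers the whole repeat plus unique flanks.

    Coordinates are positions on a reference. min_flank is the number of
    unique bases required on each side (an assumed value).
    """
    covers_left = read_start <= repeat_start - min_flank
    covers_right = read_end >= repeat_end + min_flank
    return covers_left and covers_right


# Hypothetical 3,000-base repeat and two reads overlapping it.
repeat = (5_000, 8_000)
short_read = (6_000, 6_150)    # 150 bases, entirely inside the repeat
long_read = (1_000, 11_000)    # 10,000 bases, anchored on both sides

print(spans_repeat(*short_read, *repeat))  # -> False
print(spans_repeat(*long_read, *repeat))   # -> True
```

The short read lies wholly within the repeat and cannot be placed unambiguously, while the long read carries unique sequence on both sides and can be.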

With SMRT technology, researchers can directly sequence DNA without amplification and achieve uniform coverage. Other benefits, Dr. Korlach says, include the ability to “unambiguously align sequences, observe fully phased alleles, span repetitive elements and complex regions, resolve structural genetic variants, and catalog full-length mRNA isoforms,” all of which support the ultimate goal of uncovering the functional biology of organisms.

Sequel Sequencing Upgrades

In 2015, PacBio launched the Sequel System, a product representing a new generation of technology. Since then, the company has extended the Sequel System’s capabilities to give scientists the ability to generate whole-genome de novo assemblies rapidly and cost-effectively.

The latest improvements include updated sequencing chemistry; optimized sample clean-up protocols; and enhanced software to improve read lengths, throughput, and accuracy. “The Sequel System’s increased throughput should also facilitate applications of SMRT technology in metagenomics and targeted gene applications for which interrogation of larger numbers of individual DNA molecules is important,” Dr. Korlach tells GEN.

Easier Access to SMRT Technology

To support the increasing demand for SMRT sequencing and project funding, PacBio launched the Genome Galaxy Initiative, which facilitates the crowdsourced funding of sequencing projects of all sizes. The initiative uses the Experiment.com platform to connect researchers directly with the public to solicit support for specific open-access genome projects.

PacBio’s technology is sold throughout the world through direct sales and distributors. Beyond the life sciences, the company works with the food and energy industries to contribute to advances that will, Dr. Korlach insists, “create long-lasting benefits for the world and its future.”

 

Pacific Biosciences

Location: 1380 Willow Road, Menlo Park, CA 94025
Phone: (650) 521-8000
Website: www.pacb.com
Principal: Michael Hunkapiller, Ph.D., President and CEO
Number of Employees: 394
Focus: PacBio develops long-read sequencing technology and incorporates it into systems that can help scientists explore otherwise inaccessible genomic regions, facilitating solutions to genetically complex problems.