February 15, 2018 (Vol. 38, No. 4)
Randi Hernandez Managing Editor GEN
An Orthogonal Approach to Genome Sequencing
(Part I of a two-part series)
Intense competition in the genomic-sequencing space continues, but not at the level that once fueled the field. Could progress in sequencing tools and research overseas help democratize sequencing in the United States?
The $1,000 genome (or even the $100 genome) could be closer to becoming a reality than ever before. Thanks in large part to the prowess of Illumina, the cost of DNA sequencing fell by four orders of magnitude between 2007 and 2012.1
But, in the past few years, the rate at which the cost of sequencing has been dropping has plateaued. Competition has also slowed, experts say.1,2
For the past decade, since the acquisition of the British firm Solexa in 2006, Illumina has been the dominant player in the market. Illumina claims to be the platform that has delivered the $1,000 genome and produced more than 90% of the world’s genome sequence data. Over the years, Illumina has fought off competition from Life Technologies and Ion Torrent (Thermo Fisher Scientific), 454 Life Sciences (Roche), and other companies with sequencing systems, including those developed by Complete Genomics and Helicos BioSciences. Some of these technologies are off the market or have been gobbled up by competitors. But two platforms that feature longer individual reads—sequencers from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT)—offer researchers some important advantages, particularly for work on complex genomes.
“It’s hard to say who Illumina’s ‘fiercest’ competitor is because Illumina is so dominant,” observed Shawn Baker, a genomics advisor and consultant at SanDiegOmics. “They have lots of competition, but each competitor only competes in certain niches; nobody covers the entire sequencing space like Illumina.” Baker said that if he had to choose, Illumina’s biggest competitor is probably Thermo Fisher, although “PacBio, ONT, and BGI are also competitors in certain markets.”
Investigators in the United Kingdom achieved the first assembly of a 30× (30-fold coverage) human genome using primarily ONT long-read data (according to an article published in Nature Biotechnology).3 A tweet on January 12, 2018 from ONT’s chief technology officer, Clive Brown, announced the achievement of a milestone using its PromethION instrument: runs equivalent to 30× coverage (human) for less than $1,000.4
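For readers unfamiliar with the term, fold coverage is the total number of sequenced bases divided by the size of the genome. The short Python sketch below works through that arithmetic. The genome size of roughly 3.1 billion bases and both read lengths are illustrative assumptions, not figures taken from the studies cited here.

```python
import math

# Assumption: approximate haploid human genome size, in base pairs.
GENOME_SIZE_BP = 3_100_000_000


def bases_needed(coverage, genome_size=GENOME_SIZE_BP):
    """Total sequenced bases required to reach a given mean fold coverage."""
    return coverage * genome_size


def reads_needed(coverage, read_length, genome_size=GENOME_SIZE_BP):
    """Number of reads of a fixed length needed for that coverage, rounded up."""
    return math.ceil(coverage * genome_size / read_length)


if __name__ == "__main__":
    # 30x coverage of a human genome, in gigabases.
    print(bases_needed(30) / 1e9, "Gb of sequence")
    # Hypothetical 150-base short reads versus hypothetical 10,000-base long reads.
    print(reads_needed(30, 150), "short reads")
    print(reads_needed(30, 10_000), "long reads")
```

Real runs produce reads of varying length and uneven depth across the genome, so these numbers are best read as a lower bound on the data a 30× genome requires.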
As Brown told GEN exclusively: “The market leader, Illumina, will no longer have the only shippable instrument system capable of sub-$1,000 human genome sequencing. This will be the beginning of a potential threat to their market dominance.”
Are We There Yet?
Many commentators (including GEN contributor Kevin Davies, Ph.D., in his book, The $1,000 Genome) have predicted that the $1,000 genome would eventually become a reality. Illumina Executive Chairman Jay Flatley predicted Illumina would cross that threshold a few years ago. But it’s 2018, and whether the field is truly there depends on how you do the math.
Baker thinks that Illumina’s NovaSeq S4 flow cell—one of the company’s newer models—will enable users to generate whole human genomes for approximately $800. Illumina’s Director of Product Marketing Joel Fellis, Ph.D., told GEN at the American Society of Human Genetics (ASHG) conference in 2017 that Illumina sees a path to the $100 genome, but did not offer details about how (or when) the company would achieve that major milestone.
Dr. Fellis admitted that for labs not operating the NovaSeq at full capacity, the cost of reagents alone could easily be $800.5 For those labs, the $1,000 genome is less realistic than it is for research centers operating at scale.
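One way to see why scale matters is to spread the price of an instrument over the genomes it sequences. The sketch below is a deliberately simplified model, not a costing method used by Illumina or anyone quoted here. It takes the $800 reagent figure and an instrument price of about $1 million as round inputs, treats the lifetime genome counts as purely hypothetical, and ignores labor, service contracts, and data analysis.

```python
def cost_per_genome(reagent_cost, instrument_price, genomes_over_lifetime):
    """Reagent cost per genome plus the instrument price amortized per genome."""
    return reagent_cost + instrument_price / genomes_over_lifetime


if __name__ == "__main__":
    REAGENT_COST = 800            # dollars per genome (round figure)
    INSTRUMENT_PRICE = 1_000_000  # dollars (round figure)

    # Hypothetical lifetime outputs for a small lab and a large center.
    for genomes in (1_000, 50_000):
        total = cost_per_genome(REAGENT_COST, INSTRUMENT_PRICE, genomes)
        print(f"{genomes} genomes: ${total:,.0f} per genome")
```

Under these assumptions, a lab that sequences only 1,000 genomes on the instrument lands well above the $1,000 mark, while a center that sequences 50,000 stays under it.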
Illumina’s top-of-the-line platform, the NovaSeq 6000 System, costs almost $1 million. Despite that hefty price, the company says its newest products that support the NovaSeq technology pave the way for “large-population-scale initiatives at the lowest price per sample.”6 Dr. Fellis noted that one-third of Illumina’s customers are new and estimated that 90% of the sequencing data that exist have been generated using Illumina’s machines.
Costs aside, there is plenty of evidence supporting the clinical utility of sequencing. Many recent studies have demonstrated that incorporating sequencing equipment at the research level will, in fact, have a major impact on clinical decision-making. And research-driven genomics will eventually transform into healthcare-driven calculations.
One study, which was conducted by Birney et al.,7 asserted that genomic analysis would have the most impact in certain disease states, including infectious disease, rare disease, cancer, and common or chronic disease. In fact, according to this study, genomics will be used in 70% of cancer diagnoses by 2027—unless the cost of sequencing plummets further, in which case, genomics will likely play a pivotal role in oncology prior to 2027.
The Long and Short of It
A lab’s sequencing strategy depends on a handful of factors. For example, does the research team have a good idea about where a mutation is likely to occur or have a specific gene variant in mind to examine? If so, short-read sequencing—individual reads of 50–250 bases—is likely ideal. Most human exome- and whole-genome sequencing (WGS) is performed using Illumina’s short-read platform.
If researchers are interested in assembling the whole genome of an organism de novo—sequencing DNA that is highly repetitive or obtaining haplotype information—long reads are much preferred.
Cost may also be a consideration when investigators are thinking about their sequencing needs. According to Andy Felton, Ph.D., head of product development at Ion Torrent (a Thermo Fisher Scientific product), reagent costs rise when you go from looking at exomes to WGS. He told GEN at ASHG 2017 that restricting a project’s scope to targeted sequences cuts down on complexity and cost.
Whether looking at the whole genome or just parts of it, experts argue that once long reads come down in price, short reads won’t even be used—at least not as commonly as they are today.
“There are claims of which is better, short or long, but ultimately, conflicts of interest arise among providers that do their own analyses on sequencing performance,” says Vural Özdemir, M.D., Ph.D., editor-in-chief of the journal OMICS: A Journal of Integrative Biology.
Most research scientists, in principle, might be interested in long reads. “But if you have to take a stance,” maintains Dr. Özdemir, “I would opt for the short reads at the moment, both for cost and clinical utility.”
As mentioned, clinical labs need throughput and accuracy, and have historically relied on Illumina (and a bit of traditional Sanger sequencing for confirmation). But WGS is not always necessary to gain useful information about a disease. Looking at only parts of the genome through short reads can be informative, as well.
Short Reads: Are They Still Worth the Time?
When sequencing focuses on known genes in a disease area, short reads are a preferable method of analysis. “Genome-wide associated studies have not proven fruitful for hearing-loss research. For this poorly understood genetic condition, you need a discovery tool that is both rapid and cost-effective for analyzing multiple genes in depth in many samples,” said Pan Zhang, Ph.D., M.D., director of the Sequencing and Microarray Center at the Coriell Institute for Medical Research, in a Thermo press release.8
Thermo is positioning its tool as one that is specifically tailored to less data-intensive applications and to targeted sequencing in a clinical setting, as in the aforementioned hearing-loss study. Thermo is taking this stance mostly because, according to Baker, “the platform was never able to scale up enough to compete with Illumina’s larger instruments.” Baker adds that the Ion Torrent platform has a higher error rate and a known difficulty with homopolymers.
David Smith, M.D., director of the Mayo Clinic’s Technology Assessment Group at the Center for Individualized Medicine, says that while widely used, short reads have two major disadvantages: sequence accuracy is not great, and the technique produces a fraction of output that has to be assembled after multiple reads.
Competition and New Market Entrants
For a picture of Illumina’s dominance in the sequencing landscape, simply plug the company name into PubMed: In 2017, there were more than 2,500 mentions of “Illumina”, 352 studies mentioning “PacBio” or “Pacific Biosciences”, approximately 78 studies mentioning “Oxford Nanopore Technology”, and some 90 studies comparing at least two of the companies among the big three (and it’s mostly an Illumina vs. PacBio conversation). Thermo’s “Ion Torrent” garnered approximately 178 mentions that year. Qiagen’s “GeneReader”, one of the newest market entrants, was only mentioned in two studies during 2017.
The new GeneReader tool is interesting, says Baker—not because of the technology itself, but the way Qiagen is positioning its tool in the market. “Rather than competing against Illumina on any technical attributes or even price, Qiagen has taken the strategy of offering a ‘sample to answer’ solution by combining the sequencer with other products in the sequencing workflow, including sample prep and data analysis.” He adds, “This, along with a low-throughput sequencer best suited for single samples, makes this an intriguing option for the clinical market.”
China’s BGI is another company that intends to disrupt the field of sequencing. In December 2017, BGI announced9 that it will partner with Sanguine BioSciences, a company that creates patient databases, to link more than 1,000 records on patients with rheumatoid arthritis who will have their genomes sequenced using the BGISeq NGS platform (the platform is based on the technology developed by Complete Genomics, which BGI acquired a few years ago).
Unlike other sequencers, which rely on emulsion PCR for amplification—a “horrible way to amplify DNA” that introduces errors, says Dr. Smith—the BGI sequencer uses “nanoballs” to amplify DNA on billions of tiny balls affixed to the surface of slides. The use of nanoballs is associated with a slightly lower error rate than emulsion PCR.
Dr. Smith believes BGI’s technologies could alter the tool landscape. “BGI offers a WGS service that now costs $600 for everything, and they are planning on reducing that to $100 in the next two years. When the total cost of WGS is just $100, that is a big deal, as WGS then would become the one cheap and affordable test for everything.”
BGI plans to launch its instrument first in China, where there are fewer potential patent conflicts, and then in the United States. For now, access in the United States may be solely in the clinic, and Dr. Smith said BGI is considering giving a number of the sequencers to the Mayo Clinic to use. Moves like this, he indicated, will threaten Illumina and push the company to drop its prices.
In January 2018, Thermo Fisher Scientific introduced the Ion GeneStudio S5 Series, a chip-based sequencer for targeted NGS experiments.10 Another new-ish entrant, Stratos Genomics, will introduce its Sequencing by Expansion (SBX™) technology, which is based on nanopores. The company has raised $20 million in venture capital to be used for “final system development leading to commercial introduction of the company’s proprietary sequencing platform.”11
Another company poised to make waves in the sequencing world is Roswell Biotechnologies in San Diego, CA. Roswell claims it can sequence 30× human genomes for just $100, with 10-kb read lengths and 99.9% accuracy. The company is developing chip-based molecular electronics for DNA sequencing; the tool reportedly measures changes in electric current as the DNA polymerase glues nucleotides together.12
So, why is Illumina dominating the sequencing market, largely unchallenged?
“Illumina has a monopoly in sequencing,” commented Dr. Smith, who, despite serving as a subject matter expert in BGI press releases,9 acknowledged that Illumina is still “the best platform on the planet right now.” But, added Dr. Smith, the company’s customer service is not up to par: the company treats its customers poorly and “charges a fortune for reagents.”
“The length of time between R&D and clinical application is shrinking,” observed Dr. Smith, although he reiterated that the main barrier to adoption of widespread sequencing is still a legal matter involving patent ownership. But, he maintained, the competing companies “can still make money even when they are suing one another.”
Building Consensus via a Patchwork of Orthogonal Tools
Sources have pointed out that relying so heavily on a particular sequencing approach or tool could introduce bias,13 so orthogonal approaches are probably preferred. In fact, many studies seem to use a hybrid approach to sequencing and the interpretation of results, suggesting perhaps that the companies are not competing per se, but are complementary to one another.14–16
Even Illumina recognizes the importance of making its technologies compatible with other companies’ offerings. “We recognize we can’t do everything,” Kevin Meldrum, senior director of product marketing at Illumina, told GEN. He says the market players were in a “friend or foe” battle approximately two to three years ago. “We struggled with that,” confided Meldrum, until the time came when the company realized it was really “about building an ecosystem.”
Rather than see other companies as competitors, Meldrum said, each company should “embrace [other] companies more proactively.” He added that he envisions Illumina functioning much like Microsoft, a market leader that provides support for developers in the space. For example, Illumina has proactively partnered with several companies (including PerkinElmer and Eppendorf) to bring automation to sample prep. “It’s not our job to pick the winner,” quipped Meldrum, emphasizing that it is the company’s goal to make Illumina technologies compatible with multiple different sample prep vendors.
In January 2018, BGI announced it will purchase 10 more PacBio Sequel systems (following the order of just one tool in 2016).17 Also at the beginning of 2018, Illumina announced a partnership with Thermo Fisher Scientific to make AmpliSeq chemistry compatible with Illumina’s sequencers—and as a term of the agreement, Illumina is even allowed to sell the product to vendors directly (although only to the “research use only” market).18
According to Baker, “AmpliSeq, an assay for targeted panels, was Ion’s main advantage over Illumina. This makes it look like Thermo has given up the NGS race, recognizing that they can probably make more money selling reagents for Illumina boxes.”
Meldrum adds that for clinical applications—such as Illumina’s partnership with Amgen to develop companion diagnostic assays—clients generally want the whole workflow to consist of Illumina products. But in open research situations, principal investigators have expressed a desire to have more flexible, agile setups.
But Dr. Özdemir worries that even if researchers select the appropriate sequencing approach, and a complementary suite of technologies, the issue of what investigators are able to conclude from that data—that is, what actionable measures clinicians can take when armed with the genetic code—remains. “I find many of the WGS stakeholders are mixing up—conflating—analytical validity with clinical validity or utility. Even if we can sequence with great accuracy, that does not mean invariably that we can predict the disease or treatment outcomes with robust performance and clinical validity.” He added that influences of the environment on the genome are often omitted from WGS discussions.
Improvements in genomic literacy are certainly a necessary part of the story and should be developed in tandem with improvements to sequencing technologies. Concluded Dr. Özdemir, “In 5–10 years, WGS is going to be routine—and will be dominated by platforms that are viable, versatile, accurate, and inexpensive—but we need more head-to-head comparisons of new technologies for clinical validity and utility.”
Come Together, Right Now—Over Seq
In the lab, investigators are using a patchwork of sequencing tools to get desired results.
For example, consider the “transcriptional landscape” of Saccharomyces cerevisiae—assembly of the reads came from three separate tool makers. The genome was pieced together using long reads from ONT and PacBio that were corrected with short reads using Illumina tools.1
In one study aimed at sequencing the yeast genome, researchers used three different platforms (PacBio, MinION, and Illumina’s MiSeq) to concoct a picture of the genetic code.2 They compared a long read to gain consensus on their short-read stitching, and determined the assemblies with the highest accuracy were those produced by ONT or PacBio tools and corrected using MiSeq reads.
A similar effort, wherein the long read from PacBio was patched with information obtained through an Illumina short read, was recently used to piece together the full wheat genome (Triticum aestivum).3
Researchers were able to identify approximately seven-fold more structural variation (than was found with high-coverage whole-genome sequencing [WGS] alone) by using a combination of orthogonal tools: high-coverage Illumina sequencing (short-read WGS), BioNano Genomics’ optical mapping, 3.5 kb and 7.5 kb jumping libraries, and long-read sequencing using PacBio’s SMRT long-read sequencing platform.4 Thus, those study authors concluded that to achieve excellent sensitivity in the identification of variants, more than one detection algorithm and more than one orthogonal technology would likely be required.
1. P. Jenjaroenpun et al., “Complete Genomic and Transcriptional Landscape Analysis using Third-Generation Sequencing: A Case Study of Saccharomyces Cerevisiae CEN.PK113-7D,” Nucleic Acids Res. (January 13, 2018).
2. F. Giordano et al., “De Novo Yeast Genome Assemblies from MinION, PacBio, and MiSeq Platforms,” Sci. Rep. 7(3935), (2017), doi:10.1038/s41598-017-03996-z.
3. A.V. Zimin et al., “The First Near-Complete Assembly of the Hexaploid Bread Wheat Genome, Triticum Aestivum,” GigaScience 6(11), 1–7 (November 1, 2017).
4. PacBio, Press Release, “Scientists Deconstruct Cancer Complexity through Genome and Transcriptome Analysis,” accessed January 22, 2017.
2. National Human Genome Research Institute, “DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP),” accessed January 19, 2018.
3. M. Jain et al., “Nanopore Sequencing and Assembly of a Human Genome with Ultra-Long Reads,” Nat. Biotechnol. (published online January 29, 2018), doi: 10.1038/nbt.4060.
4. Tweet from Clive Brown (@Clive_G_Brown), accessed on January 19, 2018.
5. National Human Genome Research Institute, “The Cost of Sequencing the Human Genome,” accessed January 19, 2018.
6. Press Release, “Illumina Releases NovaSeq S4 Flow Cell and NovaSeq Xp Workflow,” accessed January 19, 2018.
7. E. Birney et al., “Genomics in Healthcare: GA4GH Looks to 2022,” bioRxiv (bioRxiv preprint first posted online October 15, 2017).
8. Thermo Fisher Scientific, Press Release, “Thermo Fisher Scientific Customers to Showcase Innovations in Precision Genomics Research for Inherited Disease and Reproductive Health at ASHG,” accessed January 22, 2018.
9. BGI, Press Release, “BGI’s MGI Tech Launches Two New NGS Platforms,” (October 31, 2017), accessed January 22, 2018.
10. Thermo Fisher Scientific, “Thermo Fisher Scientific Introduces Ion GeneStudio S5 Series, A Line of Highly Versatile Next Generation Sequencers,” accessed January 22, 2018.
11. Stratos Genomics, Press Release, “Stratos Genomics Raises Funds to Ready for Commercialization,” accessed January 22, 2018.
12. J. Karow, GenomeWeb.com, “Roswell Biotechnologies Harnesses Molecular Electronics for Chip-Based DNA Sequencing,” accessed January 8, 2018.
13. S. Goodwin, J.D. McPherson, and W.R. McCombie, “Coming of Age: Ten Years of Next-Generation Sequencing Technologies,” Nat. Rev. Genet. 17(6), 333–351 (May 17, 2016), doi: 10.1038/nrg.2016.49.
14. M.J.P. Chaisson et al., “Multi-Platform Discovery of Haplotype-Resolved Structural Variation in Human Genomes,” bioRxiv, (bioRxiv preprint first posted online September 23, 2017).
15. Y. Mostovoy et al., “A Hybrid Approach for De Novo Human Genome Sequence Assembly and Phasing,” Nat. Methods 13, 587–590 (2016), doi:10.1038/nmeth.3865.
16. J. Quick et al., “Multiplex PCR method for MinION and Illumina Sequencing of Zika and Other Virus Genomes Directly from Clinical Samples,” Nat. Protocols 12, 1261–1276 (2017), doi:10.1038/nprot.2017.066.
17. Pacific Biosciences, Press Release, “BGI Increases Long-Read Sequencing Capacity with Purchase of 10 PacBio Sequel Systems,” accessed January 25, 2018.
18. Illumina, Press Release, “Thermo Fisher Scientific and Illumina Sign Agreement to Provide Research Market Broader Access to Ion AmpliSeq Technology,” accessed January 22, 2018.