April 1, 2014 (Vol. 34, No. 7)
Nicholas Miliaras, Ph.D. ATCC
To establish a link between a specific disease and a genetic abnormality, researchers must first obtain tissue samples from affected patients. Nucleic acids are then isolated and sequenced.
Then, the data obtained can be used to identify susceptibility loci in families where sequence variants such as single nucleotide polymorphisms (SNPs) or copy number variants (CNVs) are co-inherited with the disease. The data must then be validated by comparison with sequences from unrelated individuals who have the same disease, as well as reference genomes from different populations, before a relationship can be established.
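The co-inheritance step described above can be illustrated with a small sketch. This is a toy example, not a published pipeline: the family members, variant IDs, and affected status are invented, and the filter assumes a fully penetrant variant that no unaffected relative carries, which real pedigrees often violate.

```python
# Hypothetical toy data: variant IDs carried by each family member.
# Names, variant IDs, and affected status are invented for illustration.
family_genotypes = {
    "mother":   {"varA", "varB", "varC"},
    "daughter": {"varA", "varC"},
    "son":      {"varB", "varC"},
    "father":   {"varC", "varD"},
}
affected = {"mother", "daughter"}


def cosegregating_variants(genotypes, affected_members):
    """Return variants carried by every affected member and by no unaffected member."""
    affected_sets = [v for m, v in genotypes.items() if m in affected_members]
    unaffected_sets = [v for m, v in genotypes.items() if m not in affected_members]
    if not affected_sets:
        return set()
    shared_by_affected = set.intersection(*affected_sets)
    seen_in_unaffected = set().union(*unaffected_sets)
    return shared_by_affected - seen_in_unaffected


candidates = cosegregating_variants(family_genotypes, affected)
print(sorted(candidates))  # ['varA']
```

Variants that survive a filter like this are only candidates; as the article notes, they still need to be checked against unrelated patients and reference genomes.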
Although sequence variants are essential for understanding the genetic basis of diseases, they represent marker sequences, not actual mutations. Only about 1.5% of the human genome contains sequences that code for proteins. Thus, identifying actual mutations that affect protein function resulting in a pathogenic phenotype requires the sequencing and analysis of many genomes from unrelated individuals.
Both whole-genome sequencing (WGS) and exome sequencing (sequencing the expressed regions of the genome) make it possible to identify mutations in a relatively short time through next-generation sequencing (NGS) technologies. In addition, data from genome-wide expression, in vitro, and in vivo studies provide a framework for assessing the relevance of mutations and developing panels for targeted genetic diagnostics.
The rapid evolution of NGS technology has made it possible to sequence genomes from individual patients in the clinic and to use this information both to identify the genetic causes of disease and to determine the best course of treatment, based on how patients with a given genotype respond to drugs or surgery.
NGS versus Classic Sequencing
WGS, exome sequencing, and gene-panel-based resequencing are relatively new. DNA sequencing itself, however, has been possible since the 1970s, and with it came the potential for clinical diagnostic use. For example, automated sequencing enabled the first standardized diagnostic tests, such as targeted sequencing for BRCA1 mutations in breast cancer patients and their family members. Classic (Sanger) sequencing-based tests remain the diagnostic standard for a number of diseases and gene-testing panels today.
Jan Jongbloed, Ph.D., a laboratory specialist at the University of Groningen in the Netherlands, has compared the efficacy of gene-panel-based resequencing methods with established Sanger sequencing methods for the identification of inherited cardiomyopathy mutations. He found that these methods provide sequencing results of comparable quality. He also noticed, however, that NGS methods offer certain advantages.
According to Dr. Jongbloed, the primary advantage of NGS in the clinic is that more genes can be analyzed: “NGS methods provide greater depth than Sanger sequencing and allow us to analyze more genes in the clinic. Sanger depends on amplifying a region first using specific primers, an approach that has its disadvantages. NGS methods do not require this type of amplification.”
“Also, with Sanger, we were only able to analyze three or four genes at the same time compared with 55 genes using NGS methods,” adds Dr. Jongbloed. “This also allowed us to identify new mutations where we were unable to do so before, such as in the titin gene.”
WGS versus Exome Sequencing
Dr. Jongbloed’s clinic is one of eight in the Netherlands that provides comprehensive sequencing services including WGS, exome analysis, and gene-panel testing. About 75 patients are evaluated there each month; however, WGS is still relatively rare.
“We do about one entire genome per month,” Dr. Jongbloed says. “While we can sequence a lot of genes at the same time, some regions are more difficult to sequence than others, such as those with high GC content, and repetitive elements in certain sequences make them difficult to map.” Another issue he and other laboratory specialists face is determining whether certain mutations are indeed pathogenic. “For this, the only solution is to find other affected family members with the same mutation, which can be difficult for a rare disease.”
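To make the GC-content point concrete, here is a minimal sketch that flags GC-rich stretches of a sequence. The window size, the 0.7 threshold, and the example sequence are arbitrary assumptions chosen for illustration; they are not values used by Dr. Jongbloed's laboratory.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def high_gc_windows(seq, window=10, threshold=0.7):
    """Return (start, gc_fraction) for each non-overlapping window at or above threshold."""
    hits = []
    for start in range(0, len(seq) - window + 1, window):
        gc = gc_fraction(seq[start:start + window])
        if gc >= threshold:
            hits.append((start, gc))
    return hits


# Invented 30-base sequence: an AT-only window, a GC-only window, a mixed window.
example = "ATATATATAT" + "GCGCGGCCGC" + "ATGCATGCAT"
print(high_gc_windows(example))  # [(10, 1.0)]
```

A real analysis would use much larger windows and would also have to account for the repetitive elements mentioned above, which this sketch does not address.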
Dr. Jongbloed believes that WGS, exome sequencing, gene-panel-based resequencing, and Sanger all currently have their place in the clinic. “WGS is actually faster because we don’t need to enrich for the genes as we do for exome analysis.” Also, WGS should be the method of choice for neonatal screening, “where we don’t know what’s going on.”
Regarding exome sequencing, he states, “It isn’t necessarily faster, but it is cheaper. For now, we should focus on exomes because it is easier to understand the consequences.” Sanger sequencing will always have its place in the clinic, he thinks: “It is well established and [Sanger] panels already exist for many diseases…we will always need to use it to validate NGS results and to look in other family members.”
Dr. Jongbloed would like to see lower costs for NGS technologies and more collaboration in the field through a centralized facility. He believes that other platforms, such as Ion Torrent (Thermo Fisher Scientific) sequencing, will help reduce the cost of NGS itself and ultimately replace Sanger.
Challenges in Clinical Genomics
Elizabeth Worthey, Ph.D., a director in the Genomic Medicine Clinic at the Medical College of Wisconsin and Children’s Hospital of Wisconsin, shares Dr. Jongbloed’s opinion that WGS is preferable to exome sequencing in general. “If you focus on the exome, nongenic regions aren’t covered. Also, the first exon and some parts of other exons of a gene are often not covered very well.” She also sees challenges in sequencing regions with high GC content and in designing probes for regions that are homologous to other regions of the genome.
“If finances weren’t an issue, everyone would do WGS,” Dr. Worthey says. “In some cases, clinics may resort to using WGS as a reflex test for exome sequencing. If the answer isn’t there, you go to WGS.” Regardless, Dr. Worthey remains optimistic that costs will come down as technology improves.
The Genomic Medicine Clinic sees about 20 to 25 new patients each month and performs WGS for about one-third to one-half of them. The current cost is about $5,000 per patient for clinical exome sequencing and analysis, and it is about $17,000 per patient for WGS. “The costs will definitely come down for WGS. For example, the recently released Illumina HiSeq X Ten systems will provide individual centers with two or three times the capacity of what the largest centers in the world can currently process combined.”
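The trade-off between going straight to WGS and using it only as a reflex test can be worked through with the two prices quoted above. The sketch below is a simplification under stated assumptions: every patient whose exome gives no answer proceeds to WGS at full price, and the exome diagnostic yield is a hypothetical input, not a figure from the clinic.

```python
EXOME_COST = 5_000   # per patient, from the article
WGS_COST = 17_000    # per patient, from the article


def reflex_cost(exome_yield):
    """Expected per-patient cost of exome first, then WGS if the exome gives no answer.

    exome_yield is the assumed fraction of patients diagnosed by exome alone.
    """
    return EXOME_COST + (1 - exome_yield) * WGS_COST


# Yield at which exome-then-WGS costs the same as WGS for everyone.
break_even_yield = EXOME_COST / WGS_COST

for assumed_yield in (0.25, 0.5):
    print(assumed_yield, reflex_cost(assumed_yield))
print(round(break_even_yield, 3))
```

Under these assumptions, the reflex strategy is cheaper than universal WGS only when exome sequencing alone resolves more than roughly 29% of cases; the break-even point shifts as the two prices change.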
Dr. Worthey sees the greatest challenge for clinical genomics in the interpretation phase. “People say that while the sequencing costs $5,000, the analysis costs $1 million. But that needn’t be true—not if the lab has a suitable clinical analysis tool in place.”
The area that needs improvement the most is clinical interpretation, explains Dr. Worthey: “For example, how do you differentiate between all these errors, polymorphisms, causal mutations, etc.? One way is to determine whether a variant has been identified as deleterious previously, or whether it has been seen in many different individuals with different clinical presentations.”
Dr. Worthey points out that there are lots of repositories where this type of data is maintained. “If somebody else has found the same mutation in a patient elsewhere, then that’s what you are looking for, but you won’t know if you don’t have access to their data.”
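The two checks Dr. Worthey describes can be sketched as a simple triage rule. Everything here is hypothetical: the variant IDs, the lookup tables standing in for repository queries, and the carrier-count threshold are invented for illustration, and real clinical interpretation weighs far more evidence.

```python
# Hypothetical stand-ins for repository lookups (not a real database or API).
known_deleterious = {"varX"}
# Number of individuals with differing clinical presentations who carry each variant.
carrier_counts = {"varY": 250, "varZ": 1}


def triage(variant, common_threshold=100):
    """Label a variant using the two checks: prior deleterious report, then how common it is."""
    if variant in known_deleterious:
        return "previously reported as deleterious"
    if carrier_counts.get(variant, 0) >= common_threshold:
        return "likely polymorphism"
    return "uncertain"


for v in ("varX", "varY", "varZ", "varNew"):
    print(v, "->", triage(v))
```

The "uncertain" bucket is where shared data matters most: a variant that looks novel in one laboratory's records may already be documented in someone else's.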
Ultimately, Dr. Worthey surmises, the problem is data sharing: “If you have been working on breast cancer for many years, are you going to want to put your data in someone else’s database? Ideally, we would develop something where we could share data instantaneously, or there would be a central repository where we could access the data.” But such a repository, adds Dr. Worthey, “would have to be updated frequently for the greatest impact.”
WGS results cannot be interpreted effectively, Dr. Worthey suggests, unless they are compared with all clinical data available from affected patients as well as those who may have the same disease in an early or unusual presentation: “Clinical presentation data can be one page or many hundreds of pages, and there will have to be efforts to better catalogue and curate the data for others to interpret it successfully.”
A major challenge for any diagnostic laboratory is ensuring that the data is correctly matched with the original sample and patient. For clinical genomics, this is particularly important, since samples change hands several times.
The tissue could be isolated in a hospital and sent to a separate core facility for nucleic acid isolation and sequencing, and the sequence could be analyzed and interpreted offsite by a bioinformatics team. Thus, a unique genetic label that can be assigned at the time of isolation and tracked at each stage of this process is highly desirable.
Sarah Ennis, Ph.D., head of genomic informatics at the University of Southampton, U.K., has identified 117 unique SNPs for the approximately 180,000 exons in the human genome. The SNPs that Dr. Ennis and her colleagues have identified are sufficiently varied that there is little chance that two individuals in a population of 100,000 could have the same SNP fingerprint. Still, Dr. Ennis notes that there is a somewhat higher frequency of such duplicates in the Han Chinese population (1 in 85,000).
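As an illustration of how such a SNP fingerprint could be used for sample tracking, the sketch below compares genotypes recorded at two stages of the process. The SNP identifiers, genotypes, and zero-mismatch tolerance are assumptions made for this example; it does not reproduce Dr. Ennis's panel or method.

```python
def fingerprint_mismatches(a, b):
    """Return (mismatches, compared) over the SNPs genotyped in both fingerprints."""
    shared = a.keys() & b.keys()
    mismatches = sum(1 for snp in shared if a[snp] != b[snp])
    return mismatches, len(shared)


def same_sample(a, b, max_mismatches=0):
    """True if the fingerprints overlap and differ at no more than max_mismatches SNPs."""
    mismatches, compared = fingerprint_mismatches(a, b)
    return compared > 0 and mismatches <= max_mismatches


# Invented genotypes for three hypothetical SNPs.
at_isolation = {"snp1": "AG", "snp2": "CC", "snp3": "TT"}
at_analysis = {"snp1": "AG", "snp2": "CC", "snp3": "TT"}
other_patient = {"snp1": "AA", "snp2": "CC", "snp3": "TT"}

print(same_sample(at_isolation, at_analysis))    # True
print(same_sample(at_isolation, other_patient))  # False
```

In practice a small mismatch tolerance might be needed to absorb genotyping errors, which is why `max_mismatches` is left as a parameter.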
While annotating the SNP data is challenging, perhaps the greatest obstacle in controlling data quality is communication. “We need multidisciplinary teams working together and discussing things, rather than working in silos,” remarks Dr. Ennis. “Also, we need to develop more and better wet lab functional models for genomic data. It’s very hard to just look at a histopathology report and pull out an answer.”
Interpreting Genomic Data
Although NGS has been available to scientists at large research institutions for some time, it did not enter the clinic until recently. Steve Lincoln, senior vice president at Invitae, a San Francisco-based company that develops genetic diagnostics, believes that the clinic represents not only an opportunity to increase the usefulness of genetic testing, but also the chance to reduce its costs.
“The biggest difference between clinical and research sequencing is how the data are interpreted. Many groups know how to do sequencing, and the platform—exome, panels, or whole genome—is not necessarily the biggest differentiator,” says Lincoln. “What matters is that the results are made understandable and actionable to doctors, and Invitae has developed tools to facilitate this.”
Laboratories must handle data interpretation, asserts Lincoln, because doctors need to focus on counseling each patient and making treatment decisions. If laboratories are to satisfy doctors, they will have to upgrade their capabilities: “There are increased costs in clinical versus research sequencing in order to deliver this diagnostic quality data that will inform decisions such as chemotherapy or surgery.”
“There has been a steady stream of improvement in the 25 years I have worked in this field,” Lincoln observes. “This will only continue and lead to an increase in quality over the next few years. During the same period, there will also be an increase in awareness of how to use genetic information in more ways that will benefit patients.”
Lincoln and Invitae are also big proponents of data sharing through a centralized repository. “Not only do we support it, we are helping to build it. Our business will succeed because we can deliver high-quality, actionable results based on widely accepted medical and scientific knowledge.”
Check Back in a Few Years
Lincoln says a better question to ask is what clinical genomics will look like in five years or even three. Current debates might focus on the benefits of WGS versus exome sequencing.
It is certain, however, that both NGS technologies and the management of data will only improve in the near future. The nature of this data will encourage scientists and physicians to work together across disciplines to provide easily interpretable results in the clinic.