October 15, 2016 (Vol. 36, No. 18)
CNV Detection and Analysis Tools Are Being Used To Fill Genetic Diagnostic Gaps
There is a growing recognition that genetic copy number variation (CNV) plays a significant role in understanding the full mutation spectrum for clinical diagnosis. Advances in several high-throughput technologies have revealed the impact and potential diagnostic benefit of CNV detection.
The wide-ranging effect of copy number variants on human health and the technologies used for copy number detection were subjects of interest at the recent European Society of Human Genetics conference in Barcelona, Spain.
The Full Mutation Spectrum
CNVs result from deletions and duplications that can range from the single-gene and exon level to much larger changes at the chromosomal level. Many of these genetic variants are now known to cause disease phenotypes. Swaroop Aradhya, Ph.D., head of genetic diagnostics and medical affairs at Invitae, recounted some of the critical, technological transitions that have made CNVs diagnostically significant.
“Genetic assays were originally developed for individual genes known to cause a specific pathology. A classic example is the intragenic deletion within the BRCA1 gene, known to be associated with breast cancer,” Dr. Aradhya explained. “Once the shift to next-generation sequencing (NGS) occurred, and panels started being developed, copy number analysis needed to keep up in a similar manner.” The use of NGS in conjunction with exon-focused comparative genomic hybridization (CGH) greatly expanded CNV analysis.
“For example, if we started with an NGS panel of, say, a hundred genes, we could create an array with the same number of genes and detect the copy number changes of those genes.” Since then, many algorithms have been developed that can calculate copy number directly from NGS data. “We are now able to determine sequence variants as well as copy number variants using the same assay,” Dr. Aradhya said. Sequence and CNV analysis are now performed routinely to elucidate the full mutation spectrum for a given pathology.
“We now offer this testing for every gene on our menu. We feel that it is good clinical care to make sure that we can capture every type of genetic variant, so we can return this critical information to the clinician and their patients,” Dr. Aradhya added.
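The idea of calculating copy number directly from NGS data can be illustrated with a minimal read-depth sketch. It is not Invitae's method or any published algorithm; the function name, the exon labels, the depth values, and the log2 cutoffs are all hypothetical and chosen only for illustration. The sketch compares a sample's per-exon read depth with a reference profile, rescales by the median ratio so that differences in overall sequencing depth cancel out, and flags exons whose log2 ratio falls near the values expected for a heterozygous deletion (about -1) or duplication (about +0.58).

```python
import math
from statistics import median


def call_copy_number(sample_depth, reference_depth,
                     loss_cutoff=-0.6, gain_cutoff=0.4):
    """Label each target as loss, gain, or neutral from read depth.

    Illustrative only: the cutoffs are hypothetical, and the reference
    depths are assumed to be non-zero for every target.
    """
    # Raw sample-to-reference depth ratio for each target (e.g., exon).
    raw = {t: sample_depth[t] / reference_depth[t] for t in reference_depth}

    # Median ratio corrects for differences in total sequencing depth,
    # assuming most targets are present at the normal two copies.
    scale = median(raw.values())

    calls = {}
    for target, ratio in raw.items():
        normalized = ratio / scale
        log2_ratio = math.log2(normalized) if normalized > 0 else float("-inf")
        if log2_ratio <= loss_cutoff:
            state = "loss"
        elif log2_ratio >= gain_cutoff:
            state = "gain"
        else:
            state = "neutral"
        calls[target] = (round(log2_ratio, 2), state)
    return calls


# Hypothetical depths: exon3 has half the expected coverage (one copy lost),
# exon5 has roughly 1.5 times the expected coverage (one copy gained).
reference = {"exon1": 200, "exon2": 200, "exon3": 200, "exon4": 200, "exon5": 200}
sample = {"exon1": 100, "exon2": 102, "exon3": 50, "exon4": 98, "exon5": 151}

for exon, (log2_ratio, state) in call_copy_number(sample, reference).items():
    print(exon, log2_ratio, state)
```

Production tools add steps this sketch leaves out, such as correcting for GC content and capture efficiency and pooling many reference samples, so the code should be read as a picture of the principle rather than a usable caller.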
CNV Detection Technologies and Sample Quality
David Bonthron, Ph.D., professor of molecular paediatrics at the Leeds Institute of Biomedical and Clinical Sciences, addressed CNV detection technologies from the practical diagnostic viewpoint of sample quality. The most widely used methods for copy number detection in medical genetics rely on either microarray-based competitive hybridization between two DNA samples (patient DNA and control DNA) or measurement of absolute hybridization signal levels using a large number of probes across the genome. Both are subject to problems of sample quality.
“Quality is really determined by two things: the availability of a sufficient quantity of DNA and the state of fragmentation and degradation of that DNA,” Dr. Bonthron explained. The issue is particularly acute in prenatal and postmortem settings, where CGH results tend to be less reliable because sample material is limited or significantly degraded. In comparison, NGS preparations are less sensitive to DNA degradation because samples are commonly sonicated down to small fragment sizes before sequencing.
“Although NGS may still not be the frontline method for performing copy number analysis, producing quality sequencing libraries from low-quality samples is often much easier than attaining high-quality array hybridization results from the same samples. However, there are trade-offs. Some regions of the genome are essentially ‘invisible’ to this particular diagnostic approach.
“For example,” Dr. Bonthron said, “we see gaps in the human leukocyte antigen (HLA) region of chromosome 6 because much of this region is represented by more than one copy in the reference genome. Therefore, some significant genes may escape detection through this method.”
Dr. Bonthron believes CNV sequencing fits well in clinical laboratory settings that already use NGS for diagnostic purposes. “It makes sense to streamline workflows where copy number analysis basically becomes another piece of sequencing within a larger sequencing workflow.” He anticipates that large-scale detection of CNV genome rearrangements will ultimately be performed by long-read sequencing technology. “Once we have highly accurate, long-read sequencing across the whole genome, we can then use computer bioinformatics to essentially determine the boundary points and reconstruct copy number variants de novo.”
CNV Detection and Prenatal Diagnosis
The prenatal clinical setting presents unique diagnostic challenges due to the limited nature of the samples and inherent time constraints. David Chitayat, M.D., head of the prenatal diagnosis and medical genetics program, department of obstetrics and gynecology, at the Mount Sinai Hospital in Toronto, explained, “When patients come to us with a fetal ultrasound abnormality, we have to act quite quickly to determine the phenotype. To provide an accurate prognosis, we must identify the specific genetic cause behind the condition.”
Prenatal genetic testing often requires the use of multiple technologies in concert to elucidate a greater mutation spectrum, including CNV detection. “Based on the fact that there are CNVs, comparative genome hybridization technology is a breakthrough that cannot be ignored. CGH provides much more information compared to regular karyotyping,” Dr. Chitayat added. He also cautioned that hybridization panels have limitations because the etiology of fetal abnormalities varies widely. “We’re introducing more whole exome sequencing (WES) and whole genome sequencing (WGS) to arrive at a better conclusion about the etiology and determine the best way forward for prenatal or preimplantation genetic diagnosis.”
There is an ongoing technological continuum where time and cost constraints dictate the use of various methods. Several companies provide expedited sequencing services, but at a relatively high cost. These methods may become more common once shorter turnaround times and lower investment cost per genome are achieved.
“In terms of future directions, whole genome sequencing will be a natural, technological continuation that may replace microarrays, whole exome sequencing, and karyotyping. It may become a one-stop solution, able to identify copy number variants, single-gene mutations, as well as balanced rearrangements,” Dr. Chitayat concluded.
CNVs, Autism, and Sociability
Stephan Sanders, Ph.D., assistant professor at the University of California, San Francisco, School of Medicine, has been elucidating the role CNVs play in autism, describing CNVs as “one of the most unmapped, ‘here be dragons’ areas of the genome.”
The significance of CNVs may be underappreciated. Citing previous work, Dr. Sanders stated, “With a relatively small sample size, we found a tenfold increased risk of de novo copy number variants.” To build a more complete picture of the mutations underlying autism spectrum disorder (ASD), the Simons Simplex Collection was employed. This special sample cohort is composed of autistic individuals who are the only affected members of families with no prior history of the syndrome. WES and chromosomal microarray (CMA) were applied to the collection to determine coding mutations and to detect CNVs, respectively.
“Data generated from this group revealed regions in the genome with a marked increase in de novo copy number variants, such as a region at location 16p11.2, now named for causing autism,” he explained.
Dr. Sanders described another region of interest also found to have high numbers of de novo CNVs. A deletion in the 7q11.23 region of chromosome 7 causes Williams syndrome, which is characterized by an increase in sociability. Those afflicted tend to have “cocktail party” personalities.
“Conversely, we have found that duplications in this region are associated with autism, which is known for a decrease in sociability,” added Dr. Sanders. This finding suggests that something as innate to our personalities as our level of sociability may essentially be coded by the amount of DNA in a particular region. “CNV has been useful as a signpost in showing potential routes toward gene discovery. It has identified some very important syndromes and causes of autism. Going forward, the most exciting part may very well be what it tells us about the regulating, noncoding genome.”
Long-Read Sequencing and Structural Variation
Evan Eichler, Ph.D., professor of genome sciences at the University of Washington School of Medicine, uses long-read, single-molecule sequencing (SMS) to understand structural variation throughout the genome. Most labs currently use short-read sequencing technologies as an indirect method to detect genetic variation.
“SMS technology is fundamentally different in that it allows us to detect structural variation and to assemble structural variants directly,” Dr. Eichler explained. Sequence reads of over 15,000 base pairs in length are routinely generated, with the longest reads reaching upward of 90,000 base pairs. Reads are subsequently mapped back to a reference genome, or assembled de novo, enabling an entire sequence reconstruction of the actual structural variant.
“Structural variation tends to be prevalent in repetitive regions, which are inherently difficult for short-read technology to sequence. SMS technology produces sequence reads of sufficient length to traverse these repetitive regions completely, uncovering thousands of novel structural variants and millions of base pairs of previously undiscovered genetic variation.”
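The point about read length can be made concrete with a small sketch. It is not Dr. Eichler's pipeline; the function name, the coordinates, and the 100-base flank requirement are hypothetical, and intervals are assumed to be 0-based and half-open. A read can resolve a repeat only if its alignment covers the entire repeat and extends into unique sequence on both sides, which a 150-base short read cannot do for a repeat several kilobases long.

```python
def spanning_reads(alignments, repeat_start, repeat_end, min_flank=100):
    """Return names of reads that cover the whole repeat plus flank.

    alignments: iterable of (read_name, ref_start, ref_end) tuples,
    using 0-based, half-open reference coordinates (an assumption of
    this sketch). min_flank is a hypothetical anchoring requirement.
    """
    spanning = []
    for name, start, end in alignments:
        covers_left = start <= repeat_start - min_flank
        covers_right = end >= repeat_end + min_flank
        if covers_left and covers_right:
            spanning.append(name)
    return spanning


# Hypothetical 6,000-base repeat and three aligned reads.
alignments = [
    ("short_1", 12_000, 12_150),  # short read lying entirely inside the repeat
    ("long_1", 2_000, 20_000),    # long read anchored in unique sequence on both sides
    ("long_2", 9_950, 25_000),    # long read with too little left flank
]

print(spanning_reads(alignments, repeat_start=10_000, repeat_end=16_000))
# prints ['long_1']
```

Only the read anchored on both sides places the repeat unambiguously; the other two leave its length or its position uncertain.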
According to Dr. Eichler, “SMS technology is also clinically advantageous in that it’s capable of sequencing through GC-rich regions of DNA. Short-read technologies tend to be biased against GC composition. Biomedically important loci, such as the fragile X locus, which contains hundreds of CGG repeats, can now be sequenced and assembled. There is a growing realization within the diagnostics community that many important clinical loci can now be ascertained due to the advent of long-read sequencing technology.”
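To show why a CGG repeat tract is an extreme case for GC bias, the following minimal sketch computes the fraction of G and C bases in a sequence. The function name and the example sequences are hypothetical, and the 200-copy tract simply stands in for the “hundreds of CGG repeats” mentioned above.

```python
def gc_fraction(sequence):
    """Return the fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc_count = sum(base in "GC" for base in sequence)
    return gc_count / len(sequence)


# Hypothetical sequences for comparison.
cgg_tract = "CGG" * 200   # stands in for an expanded CGG repeat
mixed = "ATGCATGCAT"      # arbitrary sequence with 4 of 10 bases G or C

print(gc_fraction(cgg_tract))  # prints 1.0
print(gc_fraction(mixed))      # prints 0.4
```

A tract made entirely of CGG units contains no A or T bases at all, which puts it at the far end of the GC range.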