Sequencing seems to get cheaper by the day, so much so that it’s increasingly rare for scientists to set out to do genetic and genomic research using arrays. However, given the availability of new and improved products and experimental approaches, some researchers are preferentially choosing chips.
The technology that was slated to make microarrays obsolete has actually helped make them better. Case in point: new exome arrays, like those available commercially from Affymetrix and Illumina.
The sensitivity and specificity reported for such chips would not have been possible without extensive sequencing work, which greatly informed their design. Considered an intermediate between standard genotyping arrays and exome sequencing, exome arrays present a cost-effective approach for studying large sample sizes in search of rare coding variants, down to the singleton level.
“Exome arrays are a cost-effective way to study protein-coding variation and its contributions to disease,” says Goncalo Abecasis, Ph.D., professor of biostatistics at the University of Michigan School of Public Health. “Most protein-coding variants are too rare to examine with standard GWAS [genome-wide association study] arrays—even after imputation—and, while exome sequencing should provide a more complete look at these rare coding variants, it is still significantly more expensive.”
Bridging a Gap
Dr. Abecasis is one of several scientists from a handful of institutions working collaboratively to design and evaluate exome arrays. They’re basing this work on published variants derived from past sequencing studies involving more than 12,000 individuals.
“I see exome arrays as bridging a gap between genotyping-based association studies of common variants and sequencing-based studies of rare variants,” Dr. Abecasis says. “Studies that aim to survey protein-coding variation in very large numbers of individuals are very well-suited for exome arrays,” he adds.
Karen Mohlke, Ph.D., associate professor of genetics at the University of North Carolina at Chapel Hill School of Medicine, is using these chips for such a study. Along with her colleagues, Dr. Mohlke is examining exome array data from thousands of individuals in search of low-frequency coding variants associated with metabolic traits.
In a February Nature Genetics paper, Dr. Mohlke and her colleagues identified low-frequency coding variants associated with fasting proinsulin concentrations at the SGSM2 and MADD GWAS loci, as well as three new genes (TBC1D30, KANK1, and PAM) associated with fasting proinsulin or insulinogenic index, in a cohort of 8,229 nondiabetic Finnish men. A corollary conclusion accompanied the researchers’ results: “This study demonstrates that exome array genotyping is a valuable approach to identify low-frequency variants that contribute to complex traits,” Dr. Mohlke and her colleagues wrote.
“We did expect a large sample size would be needed to see any significant evidence of association with low-frequency variants,” Dr. Mohlke tells GEN. “The low-frequency variants had not been looked at previously in genome-wide association studies, and some of those could not be imputed well, even from 1000 Genomes-based reference panels, so there was certainly an opportunity to see new associations.”
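A rough back-of-the-envelope sketch shows why cohort size matters so much for low-frequency variants. The only figure taken from the study is the cohort size of 8,229; the minor allele frequencies below are illustrative assumptions, not values from the paper, and the carrier estimate assumes Hardy-Weinberg proportions. The function names are hypothetical.

```python
# Illustrative arithmetic only: how many copies of a variant would one
# expect to observe in a cohort of a given size?

def expected_minor_alleles(n_individuals: int, maf: float) -> float:
    """Expected minor-allele copies among the 2N chromosomes sampled."""
    return 2 * n_individuals * maf


def expected_carriers(n_individuals: int, maf: float) -> float:
    """Expected individuals carrying at least one minor allele,
    assuming Hardy-Weinberg proportions (an assumption of this sketch)."""
    return n_individuals * (1 - (1 - maf) ** 2)


COHORT_SIZE = 8229  # nondiabetic Finnish men, as reported in the article

# Assumed frequencies chosen for illustration, from common to very rare.
ASSUMED_MAFS = (0.05, 0.01, 0.005, 0.001)

if __name__ == "__main__":
    for maf in ASSUMED_MAFS:
        alleles = expected_minor_alleles(COHORT_SIZE, maf)
        carriers = expected_carriers(COHORT_SIZE, maf)
        print(f"MAF {maf:.3f}: ~{alleles:.0f} allele copies, ~{carriers:.0f} carriers")
```

By this arithmetic, a variant seen exactly once in the cohort (a singleton) corresponds to a frequency of 1/(2N), which is why association testing at the rare end depends on very large sample sizes.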
Analyzing genotyping array data is fairly straightforward, which gives array users an edge over those working with raw sequence data in certain studies.
Further, the researchers chose to use exome arrays in part because they cost less. The chips, Dr. Mohlke says, proved “cost-effective, compared to sequencing, at the time we were doing this study.”
And today? “I think still it is cost-effective for analyzing low-frequency variants, at least the ones that are known,” she says. “When variants are known and on the array one wants to test, then it is much more cost-effective than sequencing.”
But for unknown variants or those associated with an ancestry group not well-represented on exome arrays, Dr. Mohlke suggests taking another approach. While powerful, exome arrays offer “an incomplete assessment of coding variants,” she says. “In the long term, sequencing still has the opportunity to identify many more variants.”
Michigan’s Dr. Abecasis echoes these sentiments. “There are certainly lots of settings where these arrays don't make sense,” he says. “As one obvious example, these arrays are no match for exome sequencing when it comes to studying the contribution of de novo mutations and other very rare variants to Mendelian disorders.”
Because particularly rare and population-specific variants often do not make the design cut, exome arrays “only represent 70 to 80% of the variation in any one exome,” Dr. Abecasis adds.
As a result, Dr. Mohlke at UNC suggests weighing cost vs. array content from the get-go. “If I were studying a population group for which the lower frequency and rare variants were not included on this array I would either add them to the existing array or design a new one, if that were technically [feasible] and cost-effective to do,” she says.