Even with validated SNPs and a robust range of technologies, the storage and analysis of the massive datasets generated at the high-throughput level remain points of contention.
“What to do with the volumes of data that come out of these chips and interpreting the data are going to be two of the most critical things to solve,” says Todd Dickinson, director of product marketing at Illumina (www.illumina.com).
“I wouldn’t say it is controversial, but it is a point of discussion,” adds Dickinson. “There are conferences that literally do nothing but study how people look at these data and how to perform statistical analysis in a proper way.”
Several companies already provide initial SNP-genotyping analysis as part of their platforms, but so far no definitive analysis standard has been set. SNP analysis still needs research, but as Jiang notes, “it is an area of opportunity as well.”
Analysis at the whole genome level may become increasingly important as researchers use high-throughput methods at the exploratory phase of their research programs. “A lot of labs start with whole genome SNP analysis, maybe 100K, maybe 500K, maybe even 1 million SNPs. The density just keeps going up,” Jiang adds. “Whether you are drawing the right conclusion is very challenging. That is what we call the bottleneck of the entire flow now.”
Previous high-throughput experiments with mRNA microarrays have laid some of the groundwork for SNP genotyping, but researchers are still demanding greater ease of data storage and visualization.
However, some researchers argue that the very nature of SNPs eases data analysis. Instead of a continuous spectrum of values, as with mRNA quantification, SNPs give researchers discrete calls. Kevin Munnelly, senior director and business unit manager for genomic products at BioTrove (www.biotrove.com), notes that customers who use the company’s OpenArray™ NT Imager Genotyping System do not face the data deluge that exists with other technologies and applications.
“You get three possibilities, three populations of data—not so difficult to deal with, even with the volume,” says Munnelly. “With SNP genotyping, if it is done correctly, all you really need to do is say it is either allele one, allele two, or both.” In one example, an experiment required 64 SNPs tested in a population of 25,000 patients. Munnelly reports that even at that scale, some 1.6 million genotypes, the data were easy to handle.
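Munnelly's point that SNP calls are categorical can be made concrete with a quick storage estimate. The sketch below is illustrative only; the integer encoding and the use of NumPy are my assumptions, not part of any vendor's software. It stores the 64-SNP, 25,000-patient study mentioned above as one byte per call:

```python
# Illustrative sketch: biallelic SNP genotypes take one of three values
# (homozygous allele 1, heterozygous, homozygous allele 2), so they can be
# stored as small integer codes instead of continuous measurements.
import numpy as np

n_patients, n_snps = 25_000, 64

# Randomly generated calls stand in for real data:
# 0 = allele 1/1, 1 = heterozygous, 2 = allele 2/2.
rng = np.random.default_rng(0)
calls = rng.integers(0, 3, size=(n_patients, n_snps), dtype=np.uint8)

print(calls.size)    # 1600000 genotypes in the study
print(calls.nbytes)  # 1600000 bytes (~1.6 MB) at one byte per call

# The same matrix held as float64, as continuous mRNA intensities often are,
# would be eight times larger.
print(calls.astype(np.float64).nbytes)  # 12800000 bytes
```

Real genotype files carry quality scores and metadata on top of this, but the core call matrix stays compact, which is why a three-state dataset of this size is comparatively easy to handle.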
As project datasets grow ever larger, one of the most prevalent underlying issues researchers face is cost. “People want accuracy, but they don’t want to pay lots of money,” notes Zemlo. A SNP genotype used to cost $1.00, but experts now say that the cost ranges from $0.03–$0.35 per genotype.
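The impact of that price drop is easy to quantify. The comparison below uses the per-genotype figures quoted here, with a hypothetical project size borrowed from the 64-SNP, 25,000-patient example:

```python
# Back-of-the-envelope cost comparison using the per-genotype prices quoted
# in the article; the project size (64 SNPs x 25,000 patients) is illustrative.
n_genotypes = 64 * 25_000  # 1.6 million genotypes

historical = 1.00 * n_genotypes                      # at the former $1.00/genotype
low, high = 0.03 * n_genotypes, 0.35 * n_genotypes   # at today's $0.03-$0.35

print(f"${historical:,.0f}")           # $1,600,000
print(f"${low:,.0f} to ${high:,.0f}")  # $48,000 to $560,000
```

Even at the top of the current range, the study costs roughly a third of its former price, and at the bottom it is cheaper by more than an order of magnitude, which is part of why such large panels have become routine.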
Several methods to lower cost exist, including reagent reduction, automation, higher throughput, and education. “Most of our marketing efforts are in education to help people understand the real advantages of different genotyping systems,” says Munnelly. “Even though you may think of the reagent cost as being the most important thing, other factors, such as labor costs, are also important.”
Cost does come with a flip side, however. “Accuracy is becoming a more important factor in some cases than the economics of the assay,” notes Richard Eglen, Ph.D., vp and general manager of discovery and research reagents at PerkinElmer Life and Analytical Sciences (www.perkinelmer.com).
In some cases, accuracy is the first concern of researchers, whereas cost is secondary. One argument is that increased accuracy lowers cost in the downstream pipeline.