February 15, 2008 (Vol. 28, No. 4)
Vicki Glaser Writer GEN
Accuracy, Efficiency, and Flexibility Must Be Considered from Sample Prep to Data Analysis
As the popularity of large-scale SNP genotyping for searching whole genomes or subsets of genetic loci for single-base variations increases, researchers often rely on core facilities to process research and clinical samples. Advances in genotyping instrumentation and reagents, automated sample-processing tools, as well as data-management and analysis strategies have increased the complexity, reliability, robustness, and range of operations of modern systems.
Today’s genotyping platforms enable greater flexibility, reproducibility, and analytical capability. There are, however, many challenges and pitfalls. Mistakes made in purchasing and implementing genotyping systems can be costly in terms of dollars, time, and labor. They can also lead to poor-quality data output as well as wasted samples and reagents.
Directors of major core facilities and genotyping centers in academia nationwide tend to agree on several key principles related to the establishment of a core genotyping lab: Do your homework, understand how different platforms work, visit an existing core facility and talk to the staff, ask for advice, and know your clientele in terms of samples they will provide, services they expect, and how the data will be managed and analyzed.
Experienced staff point out the need to consider infrastructure and process requirements of the genotyping platform. This is crucial both upstream, during sample prep and DNA extraction, and downstream, for data output, storage, and analysis. Other important considerations include bar-coding, sample tracking, and quality control. Also, keep in mind the need for a robust laboratory information management system (LIMS) for DNA/sample archiving as well as managing workflow, projects, and client information.
Building with Flexibility
It is critical for a core facility to employ highly multiplexed technologies to help clients obtain the information they need and interpret data output, according to Don Baldwin, Ph.D., director of the Molecular Diagnosis and Genotyping Facility at the Hospital of the University of Pennsylvania. The lab is now merging with UPenn’s microarray facility to support multiplexed, high-throughput SNP genotyping.
“Our clients are appreciating more and more the need to cover as many loci as possible that might be associated with a phenotype of interest,” says Dr. Baldwin. Researchers also do not want to limit the focus to a particular SNP simply because it appears in the literature, he adds.
The GenoSeq Core at UCLA provides sequencing and genotyping services, including SNP and microsatellite genotyping, to the UCLA community and outside academic and commercial clients. For large-scale projects, UCLA maintains separate microarray-based cores. The GenoSeq Core does not have high-throughput, array-based instruments. It instead focuses on projects ranging from single SNPs, such as for diagnostics and orphan disease work, to studies including a few known SNPs, discovery work, and moderate-throughput genotyping for mapping candidate gene regions.
Ultimately, the question is not which is the best instrument but which is the best instrument for a given lab or facility. “A core facility does not necessarily need the latest bleeding-edge instrument,” remarks Jeanette Papp, Ph.D., director of the GenoSeq Core. “It needs technology that is robust.” An instrument that has more than one function can offer a facility more flexibility in workflow. With greater flexibility, the site can more readily adapt to the evolving nature of research, technology, and applications.
Connie Zhao, Ph.D., director of the Genomics Resource Center at Rockefeller University, points to flexibility in terms of the number of SNPs an instrument can process. The device should also allow for customized SNP microarray synthesis to support targeted gene approaches.
“Although many generalities may help you compare platforms, for any given SNP or panel of SNPs, there is no guarantee that one will work better than another,” adds Dr. Baldwin. “I learned to appreciate that, unlike using microarrays for gene expression analysis in which the data is ‘flavored’ by the platform and you select one platform to use, with genotyping you can mix and match platforms to get the best and most complete dataset. Whatever gets the job done.”
Key factors to consider when picking instruments and building a core facility include ease of protocol (time, labor, and the need for physical separation of steps); analytical software tools that provide quality and speed of analysis; solid infrastructure in computing power, robotics, ancillary laboratory equipment, and personnel; and space needs. Another important consideration is to plan for future growth and to build with flexibility in mind to facilitate change and expansion.
Technical Training, Support, and Infrastructure
Factors that can affect overall operating efficiency extend beyond the actual instrumentation. Training, support, and infrastructure are the three most important things to worry about when building a core facility, according to Jonathan Woo, acting manager of the Genomics Core Facility at the University of California, San Francisco’s Institute for Human Genetics. Hands-on training by the manufacturer may include valuable tips and tricks that are not written in any manual.
“Training and support are huge, especially for maintaining consistency of protocols and data generation and in the integrity of data quality,” Woo notes. Dr. Zhao also agrees with the importance of these two elements. “Core facilities need to have a standard and to produce data reliably.” SNP call rates at the Rockefeller core are generally at least 99%, she adds.
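To make the call-rate figure concrete, here is a minimal sketch of how such a rate is typically computed: the fraction of attempted genotypes for which the software returned a call. The function name, the "NC" no-call marker, and the sample data are assumptions for illustration; they are not taken from any particular platform’s software.

```python
def call_rate(calls, no_call="NC"):
    """Return the fraction of attempted genotypes that produced a call.

    `calls` is a sequence of genotype strings; `no_call` is a hypothetical
    marker for assays where the software could not assign a genotype.
    """
    calls = list(calls)
    if not calls:
        raise ValueError("at least one genotype attempt is required")
    called = sum(1 for c in calls if c != no_call)
    return called / len(calls)


def meets_threshold(calls, threshold=0.99, no_call="NC"):
    """Check a batch against a minimum call rate (99% by default)."""
    return call_rate(calls, no_call) >= threshold


# Hypothetical batch: 1,000 attempted genotypes, 8 of them no-calls.
batch = ["AB"] * 992 + ["NC"] * 8
print(f"call rate: {call_rate(batch):.3f}")
print("passes 99% threshold:", meets_threshold(batch))
```

A facility could run a check like this per sample and per SNP, since a low call rate on either axis points to a different problem (poor DNA versus a poorly performing assay).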
As genotyping becomes a more commonly used clinical tool, technical support becomes even more critical, as clinical labs require 24/7 operation. They cannot afford down-time or troubleshooting problems.
“In academic core labs, the staff has to wear a lot of hats,” notes Shrikant Mané, Ph.D., director of the Microarray Resource at the Keck Foundation Biotechnology Resource Laboratory. “We spend a significant amount of time talking to people about their projects.” The staff also has to know how to operate the instruments, run the software, manage the data, develop a good working relationship with the user, and do the billing.
Regarding infrastructure needs, Woo emphasizes the importance of having computers and servers capable of managing the large amounts of data generated from large-scale genotyping projects. Data can reach up to several gigabytes per run and as much as a terabyte in a week’s time. If there is no place to off-load and store the data, there will be no capacity to accommodate new data.
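As a rough illustration of this storage arithmetic, the sketch below estimates how many weeks of output a server can absorb before it fills up. The function name and the capacity figures are hypothetical; only the order of magnitude (about a terabyte per week) comes from the estimate above.

```python
def weeks_until_full(capacity_tb, used_tb, weekly_output_tb):
    """Return how many weeks of new data fit in the remaining space.

    All arguments are in terabytes. Returns 0.0 if the server is
    already at or over capacity.
    """
    if weekly_output_tb <= 0:
        raise ValueError("weekly_output_tb must be positive")
    free_tb = max(capacity_tb - used_tb, 0.0)
    return free_tb / weekly_output_tb


# Hypothetical server: 20 TB total, 8 TB already used, 1 TB of new data per week.
print(weeks_until_full(capacity_tb=20.0, used_tb=8.0, weekly_output_tb=1.0))  # 12.0
```

Even a simple estimate like this makes it clear when archived data must be moved off the primary server to keep room for new runs.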
Dr. Mané also emphasizes the importance of having a reliable, integrated system to track projects, samples, and chips/arrays along with client and billing information. This system should also be able to track inventory and purchases.
Realistic cost assessments are essential. Dr. Baldwin cautions those operating or funding a large-scale genotyping facility to avoid focusing only on an instrument’s cost/SNP. The real expense of running an assay will depend on the number of samples, the cost of labor, resources expended on sample prep, and hands-on time required to perform the assay.
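The point can be illustrated with a small calculation that folds sample prep and labor into the cost per genotype. This is only a sketch under stated assumptions: the function, the two platforms, and every dollar figure are hypothetical and chosen purely to show how the ranking can flip.

```python
def real_cost_per_genotype(n_samples, n_snps, reagent_cost_per_sample,
                           prep_cost_per_sample, hands_on_hours, hourly_labor_rate):
    """Return total project cost divided by the number of genotypes attempted.

    Total cost = per-sample reagents and prep, plus hands-on labor.
    """
    if n_samples <= 0 or n_snps <= 0:
        raise ValueError("n_samples and n_snps must be positive")
    consumables = n_samples * (reagent_cost_per_sample + prep_cost_per_sample)
    labor = hands_on_hours * hourly_labor_rate
    return (consumables + labor) / (n_samples * n_snps)


# Hypothetical project: 96 samples x 1,000 SNPs, labor at $40/hour.
# Platform A: pricier reagents, simple protocol.
cost_a = real_cost_per_genotype(96, 1000, 50.0, 5.0, 8, 40.0)
# Platform B: cheaper reagents (lower advertised cost/SNP), more prep and hands-on time.
cost_b = real_cost_per_genotype(96, 1000, 40.0, 10.0, 30, 40.0)

print(f"Platform A: ${cost_a:.4f} per genotype")
print(f"Platform B: ${cost_b:.4f} per genotype")
```

With these made-up numbers, Platform B’s reagents are cheaper per SNP, yet its extra prep and hands-on time make it the more expensive option overall.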
There is a lot to consider when it comes to the commercially available SNP genotyping platforms, notes Dr. Baldwin. Even though one platform may have a lower advertised cost/SNP, the assay may require a more convoluted process with more opportunities for something to go wrong. In the long run, no data or faulty data is worse than more expensive yet accurate data.
“Whatever genotyping platform you invest in, its performance will only be as good as the weakest part of the entire pipeline,” Dr. Baldwin asserts.
For SNP genotyping applications, Dr. Mané suggests following the manufacturer’s recommendations as closely as possible. The few dollars saved by purchasing reagents or equipment other than those the manufacturer proposes or by altering the protocol may end up costing more in lost time, resources, and data. The main goal is to standardize the process as much as possible to minimize laboratory mishaps and to increase efficiency.
The Keck Laboratory at Yale recently purchased a high-throughput DNA sequencing system in anticipation of sequencing largely replacing SNP genotyping within a few years. “The digital output of DNA sequencing is more quantitative and reliable,” says Dr. Mané.
“Many genotyping projects tend to be two-tiered in that they require custom SNP mapping of the region of interest, which has been identified by use of genome level arrays,” explains Dr. Mané. “In these cases, it is extremely necessary for core labs to establish platforms with the ability to analyze project-specific, custom SNP panels that have been designed by individual researchers.”
From Sample Prep to Data Analysis
“You need to think about the entire process from beginning to end: where the DNA will be coming from, what ancillary equipment you might need to preprocess the DNA, the instrument itself, and the data interpretation software,” Dr. Papp points out. Many options exist for sample prep and DNA extraction. A core facility should offer to perform automated DNA extraction because “relying on clients to supply DNA can create problems,” according to Dr. Baldwin. “You will get samples of widely different quality and concentrations and often in different buffers.”
Encourage clients to strive for the highest quality samples possible, advises Dr. Zhao. The genotyping process itself is standardized, and the old saying “garbage in, garbage out” also holds true for microarrays, she adds. Dr. Zhao also emphasizes the importance of using good experimental design and carefully matching control and test samples when performing association studies based on demographic characteristics such as age, gender, and population.
If a core must contend with samples having a broad range of DNA quality and consistency, it would be beneficial to select an instrument that is more robust and forgiving of DNA quality and differences in consistency. For example, a platform with a broader detection range and a robust data calling algorithm can be used.
One must also take into consideration the robustness of the data interpretation software: how mature are the algorithms, how error-prone is the genotype calling software, does it analyze the data or simply spit out huge data files?
Some core facilities offer bioinformatics and data analysis services, and others provide analytical support to clients. Yet another perspective holds that the researchers, who best understand their samples and systems and who will ultimately make the decisions regarding future research directions, should perform the data analysis.
“The core’s mission does not end with data production,” emphasizes Dr. Baldwin. It is critical for a core facility using highly multiplexed technologies to help the client interpret the data output, he notes. His lab works with UPenn’s clinical epidemiology and biostatistics group. “If the core itself cannot offer support, then it needs to have a partner to hand the data off to.”
Factors to consider in evaluating genotyping software products include user friendliness, proper and simultaneous management of multiple projects, and the ability to do a variety of analyses and look at different types of variation.
Advice from the Pros
Dr. Baldwin recommends drawing on the experience and expertise of existing core facilities. The Association of Biomolecular Resource Facilities is an organization that provides an online directory of core facilities in the U.S. and holds an annual meeting. One approach would be to hire a core facility to run a panel of test samples, ask to watch the process, and from that begin to learn what it takes to run a system and a facility.
“You don’t need every instrument on the market; there is a lot of overlap in throughput,” points out Dr. Papp. “Think about the types of projects you will be doing to select the right technology.”
The magnitude of a project is the most important characteristic: how many samples and how many SNPs. “Do a lot of due diligence,” she recommends, and differentiate between systems that are better for projects with a large number of SNPs and fewer samples, a small number of SNPs and many samples, or large numbers of both.
“There can be a lot of hidden costs and instrumentation needs that the sales rep does not tell you about,” Dr. Papp remarks. “Before making a purchase, talk to people who have the instrument in their lab.” They can also comment on the quality of a manufacturer’s training and technical support offerings.
Other key factors to consider include run time, cost-effectiveness within the workflow and infrastructure of your facility, real throughput, idle time, and whether staff would need to work overnight or on weekends to keep the system online and complete projects.
Also, do not overlook validation and quality control. These aspects become even more critical as SNP genotyping becomes a more widely used clinical diagnostic tool and linked diagnostic and therapeutic products emerge. Single SNP tests are already approved, and multiplexed tests are moving through the development and regulatory pipeline. “We are all curious to see how genotyping will become a clinically approved, FDA-regulated activity,” comments Dr. Baldwin.