Successful Study Design
As stated earlier, the importance of proper study design cannot be overstated, and it is useful to discuss some of the keys to successful design. Before designing a study, it is often advantageous to consult a group of specialists such as clinicians, proteomics researchers, mass spectrometrists, and biostatisticians.
In particular, biostatisticians should assist in planning data-analysis strategies by calculating the number of samples required for statistical significance and by helping to avoid two common data-analysis pitfalls: overfitting a model to data that may not be representative of the broader population, and false discovery of biomarkers due to random chance.
Successful biomarker-discovery studies start with a clear, narrowly defined clinical question. Broad questions introduce more variables and make the results more complex to validate. The clinical question should specify a measurable result of clinical utility and aim to yield results that improve current diagnostic, prognostic, or therapeutic methods. Two types of studies are generally used: retrospective and prospective.
Retrospective studies use banked samples for which the clinical outcome is already known and rely on information collected from questionnaires, case records, or sample banks. Prospective studies monitor the progress, symptoms, and disease development of a selected set of patients.
The expertise of a clinician should be used to determine the appropriate sample set and clinical performance requirements for accepting and adopting the findings of any resulting clinical assay.
The next step in study design is sample selection. While human patients ultimately represent the most accurate model for clinical studies, nonhuman models display less biological variability and allow for experimentation. Sample size should be determined by estimating the number of samples required to attain statistical significance. Appropriate controls should be included in the sample sets. It is seldom effective to compare a group of diseased individuals only with a group of healthy ones. Controls matched for characteristics such as age, as well as samples from patients with other diseases that have similar clinical profiles, can improve the relevance of the data.
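As a rough illustration of the sample-size estimate described above, the sketch below uses the standard normal-approximation formula for a two-sided comparison of two group means. The function name, the default significance level and power, and the Bonferroni adjustment for the number of peaks tested are assumptions made for this example and do not come from the guidelines discussed here. A biostatistician would tailor the calculation to the actual study.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80, n_tests=1):
    """Approximate samples needed per group (hypothetical helper).

    effect_size: expected difference in mean peak intensity between groups,
                 divided by the standard deviation (standardized effect).
    alpha:       two-sided significance level before any adjustment.
    power:       desired probability of detecting a true difference.
    n_tests:     number of peaks tested; alpha is divided by this value
                 (Bonferroni adjustment, an assumption of this sketch).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")

    adjusted_alpha = alpha / n_tests
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - adjusted_alpha / 2)
    z_power = standard_normal.inv_cdf(power)

    # Normal-approximation formula for two equal-sized groups.
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    print(samples_per_group(1.0))               # one peak, large effect
    print(samples_per_group(0.5))               # one peak, moderate effect
    print(samples_per_group(1.0, n_tests=100))  # 100 peaks screened
```

Halving the expected effect size roughly quadruples the required number of samples, and screening many peaks at once raises it further. Both are reasons to settle the clinical question and the expected effect before sample collection begins.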
Proper sample collection, handling, and storage are then required to produce robust biomarker discovery results. Multiple collection sites should be used to minimize systematic bias, and prolonged storage at 4°C or repeated freeze-thaw cycles should be avoided. Standardized protocols that maintain consistency in the timing of sample collection, equipment and reagents, and methods and timing of all processing steps are essential.
Careful planning of the overall experimental design is the last component of study design and will cover all aspects of the biomarker study, from selection of the appropriate proteomics platform to assay design, fractionation techniques (Figure 2), data collection (instrument settings), and analysis. An effective design ensures that the clinical question is answered, all sources of analytical bias are minimized, and the predictive value of the resulting biomarkers is tested. The workflow of each phase of the study must be defined as well as the timing of the phases, which may require several iterations and optimization.
Experimental Design for SELDI
Successful experimental biomarker study design requires selection of the appropriate proteomics platform. One platform for biomarker discovery is surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS), which combines the separation power of chromatography and high-sensitivity mass spectrometry.
SELDI technology meets the biomarker discovery challenge by providing the sensitivity needed to reproducibly detect low-abundance proteins in the presence of many high-abundance ones, along with the throughput needed to analyze the large number of samples required to validate candidate biomarkers and ensure their clinical utility.
The ProteinChip® SELDI system from Bio-Rad Laboratories (www.bio-rad.com) utilizes arrays that selectively bind and retain whole classes of proteins from complex samples. The mass profile of the bound proteins is then determined directly from the arrays using TOF MS, creating protein profiles of molecular mass versus peak intensity. SELDI is one of the few platforms that can analyze hundreds of proteins in thousands of samples in a timeframe commensurate with the demands of a clinical proteomics biomarker study.
As with any technique applied to biomarker studies, successful application of the ProteinChip SELDI system for discovery requires careful experimental design. The design should include optimization of procedures used for sample preparation, selection and processing of arrays, data acquisition with a focus on optimizing laser energy, and data analysis. Different array types and wash conditions generate different profiles from the same sample, and combining these conditions yields a much broader picture of the proteome (Figure 3).
Sample preparation is a potential source of analytical bias that is often overlooked and underemphasized. Consistent and appropriate liquid-handling techniques as well as defined protocols for the initial processing of samples are essential for obtaining reproducible results. Sample types such as serum and plasma are highly complex, and though addition of any sample-handling step increases chances for variability, fractionation of these sample types prior to SELDI analysis increases the number of protein peaks detected and improves detection of low-abundance proteins.
Array processing is also a key to success with SELDI. Careful thought should go into the sample layout on each array, the optimization of sample dilution and buffer composition, and the standardization of the methods for applying matrix to the arrays. Array preparation for a single condition (one fraction, array chemistry, and matrix combination) and data collection on this condition should be completed before continuing to the next condition.
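One common way to keep sample layout from introducing bias is to randomize the positions of case and control samples across arrays and spots, so that clinical group is not confounded with array or spot position. The sketch below is a minimal illustration of that idea, not a procedure taken from the manufacturer. The function name, the sample labels, the fixed random seed, and the default of eight spots per array are all assumptions for this example.

```python
import random


def randomized_layout(sample_ids, spots_per_array=8, seed=0):
    """Assign each sample to a random (array, spot) position.

    spots_per_array is an assumed value; set it to match the arrays in use.
    A fixed seed makes the layout reproducible so it can be documented.
    """
    if spots_per_array < 1:
        raise ValueError("spots_per_array must be at least 1")

    order = list(sample_ids)
    random.Random(seed).shuffle(order)

    layout = {}
    for index, sample_id in enumerate(order):
        array_number = index // spots_per_array + 1  # 1-based array index
        spot_number = index % spots_per_array + 1    # 1-based spot index
        layout[sample_id] = (array_number, spot_number)
    return layout


if __name__ == "__main__":
    # Hypothetical sample labels: eight cases and eight controls.
    samples = [f"case_{i:02d}" for i in range(1, 9)]
    samples += [f"control_{i:02d}" for i in range(1, 9)]

    layout = randomized_layout(samples, spots_per_array=8, seed=42)
    for sample_id, (array_number, spot_number) in sorted(
        layout.items(), key=lambda item: item[1]
    ):
        print(f"array {array_number}, spot {spot_number}: {sample_id}")
```

A plain shuffle does not guarantee that every array carries equal numbers of cases and controls. A blocked or stratified design may be preferable when that balance matters.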
Proper data collection and analysis are essential to successful biomarker discovery with SELDI. Qualification and calibration of a SELDI-TOF MS system should be done regularly to ensure optimum performance; the manufacturer provides kits for this purpose. Data-acquisition parameters should be tested and optimized on a pool of experimental samples before collecting data from study samples.
The collected data can first be processed using the system’s default processing parameters, then reprocessed later if necessary. Univariate or multivariate statistical techniques are generally used to screen for peaks that differ significantly between clinically relevant groups, and care should be taken to avoid false discovery and overfitting of multivariate models.
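To illustrate how false discovery can be controlled when many peaks are tested at once, the sketch below applies the Benjamini-Hochberg procedure to a list of per-peak p-values. This is one widely used approach, chosen here as an example and not a method prescribed by the SELDI software. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    total = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(total), key=lambda i: p_values[i])

    adjusted = [0.0] * total
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(total, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * total / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values from univariate tests on four peaks.
    peak_p_values = [0.01, 0.04, 0.03, 0.005]
    threshold = 0.05  # assumed acceptable false discovery rate

    for p_value, q_value in zip(peak_p_values, benjamini_hochberg(peak_p_values)):
        status = "keep" if q_value <= threshold else "drop"
        print(f"p={p_value:.3f}  adjusted={q_value:.3f}  {status}")
```

Peaks that pass such a screen are still only candidates. Their predictive value needs to be tested on samples that were not used for discovery, which also guards against overfitted multivariate models.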
The application of proteomics to discover clinically meaningful biomarkers has proven to be challenging and has so far met with only limited success. However, the combination of a coherent, rigorous, and comprehensive process from study design to clinical assay implementation with the ProteinChip SELDI technology will help meet the challenge of discovering biomarker panels that could be used to accurately detect and predict human disease states, customize disease treatment, and assist in all phases of drug development.
Enrique A. Dalmasso, Ph.D., is senior staff scientist at Bio-Rad Laboratories. Web: www.bio-rad.com.
A more detailed discussion of the guidelines for effective biomarker study design can be found in Biomarker Discovery Using SELDI Technology: A guide to successful study and experimental design (Bio-Rad bulletin 5642).