January 15, 2006 (Vol. 26, No. 2)
Inter-relationships Have the Potential to Accelerate and Refine the Process
The discovery of novel protein biomarkers promises to transform the diagnosis, treatment, and prevention of disease. Upstream of the clinical arena, protein biomarkers also have the potential to accelerate and refine the drug discovery process.
The promise of proteomics in clinical medicine rests on the hope that scientists will be able to identify specific markers, or more likely, panels of biomarkers, useful for detecting disease at an early stage, measuring its progress, characterizing the biochemical changes contemporaneous with disease, and monitoring the efficacy of treatment.
Beyond the initial focus on discovery of relevant and robust candidate biomarkers for clinical applications is the perhaps even greater challenge of biomarker validation and translation (and the associated technology necessary to detect and quantify them) from the research laboratory to the clinical laboratory, or more simply, from the bench to the bedside.
Although biomarker discovery and drug discovery are often viewed as divergent areas of investigation, through proteomics, they can actually converge. Not only can biomarkers identified through proteomics investigations be employed directly to target responders, monitor clinical response (including adverse events), and serve as indices of drug effectiveness, but also the insights provided by a comprehensive proteomics study can focus attention on previously unrecognized pathways that may be amenable to drug intervention. These insights are potentially of enormous value because proteins represent the overwhelming majority of today's drug targets.
In the future we can expect to see novel drug development arising out of proteomic studies that have been designed to unearth and characterize the functional role of disease-related proteins. Proteomics is therefore likely to play an increasingly important role in target identification and validation and, ultimately, in expanding the list of cellular targets.
Consequently, the challenges inherent to biomarker discovery and validation, and to exploiting the value of biomarkers in day-to-day applications (whether these involve diagnosing a disease, treating a patient, or identifying a potent new drug target and defining its role in a disease pathway), are similar.
These challenges include harnessing the diversity and complexity of the human proteome and understanding the processes by which many and varied proteins may derive from a single gene sequence through alternative splicing, truncation, and an array of over 300 post-translational modifications.
These challenges also relate directly to the technologies, instruments, and methods available for protein separation and identification. A variety of technology platforms are available for protein analysis, including 2-D gel electrophoresis, (multidimensional) liquid chromatography (LC), and mass spectrometry (MS). The selection of a platform for a particular application becomes even more daunting when the options expand to include various combinations of these technologies and the growing number of mass spectrometric ionization and ion separation options.
Past the discovery phase, the translation of biomarkers from the bench to the bedside will also require the development of conventional immunochemical methods for performing validation studies, followed by the adoption of multiplexed assays that can be applied in a high-throughput, routine clinical setting. (While multiplexed assays are currently uncommon, they will undeniably become an increasingly important element of the practice of clinical chemistry.)
For the purposes of the discovery phase that is common to biomarker and drug target identification, well-defined methods and robust technologies for both protein separation and identification are available in the form of 2-D gel electrophoresis (combined with mass spectrometry) and LC-MS/MS.
These analytical platforms each have their own set of advantages and disadvantages, but in practice they complement each other nicely. When combined, they give fairly representative coverage of the diverse classes of proteins across a broad dynamic range. Even so, cutting deeper into the proteome to detect low-abundance proteins is a perpetual challenge.
New products and approaches are always emerging, and most researchers must work at the cutting edge. The main challenge lies in balancing the need to develop and adopt innovative technologies with the imperative to stay focused on the biology and associated research goals.
2-D Gel Technology
Although not generally considered avant-garde technology, 2-D gel electrophoresis remains the best of all available protein separation approaches, and recent commercial refinements to the strategy have improved its potential. Discovery proteomics based on 2-D gel electrophoresis begins by separating the complex protein mixture present in a biological sample (usually a biological fluid or tissue specimen) in two dimensions, the second of which is run on an SDS-polyacrylamide gel (SDS-PAGE).
There are two discrete separation steps in this process, both driven by the application of an electric current. Separation in the first (x) dimension, isoelectric focusing, is based on protein charge (isoelectric point, pI); separation in the second (y) dimension, gel electrophoresis, is based on molecular weight.
A single 2-D gel can resolve thousands of proteins which, when stained, form a pattern of spots. A comparison of the intensity of individual spots yields information on the relative amounts of each protein in the sample. 2-D gels can be used to define both qualitative and quantitative changes in expression levels between biological samples, such as comparing healthy versus disease tissue samples or samples derived from the same patient at different stages of disease progression.
Once a mixture of proteins is separated in 2-D space, robotic devices can then be used to excise the protein spots of interest from the gel. While still resident in the gel plug, the proteins are then digested with a protease, usually trypsin, to yield a set of peptides unique to each protein. This mixture of peptides can then be analyzed directly by MS. Typically, MALDI-ToF MS is employed, but other options can be adopted.
Single-stage MS analysis generates a mass spectrum (a plot of mass-to-charge (m/z) ratios versus intensity), and this is effectively a mass fingerprint of the principal protein component in the sample. A comparison of the experimentally determined mass spectrum to mass spectra generated in silico from primary sequence databases reveals the identity of the protein. Conventional 2-D gel-MS techniques provide fairly broad coverage of proteins with isoelectric points between 3 and 11 and molecular weights in the range of ca. 10 to 100 kDa.
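The fingerprint-matching idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring scheme of any particular search engine: it assumes trypsin cleaves after K or R except when the next residue is P, allows no missed cleavages, ignores modifications, and ranks candidates simply by the number of matched peptide masses. The two-entry database, its sequences, and the 0.2 Da tolerance are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # singly protonated ion, as typically seen in MALDI


def tryptic_digest(sequence):
    """Split after K or R unless followed by P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def mh_plus(peptide):
    """Theoretical monoisotopic m/z of the singly charged peptide ion."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol=0.2):
    """Count observed peaks within tol Da of any theoretical peptide mass."""
    theoretical = [mh_plus(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def rank_candidates(observed, database, tol=0.2):
    """Rank database entries by number of matched peaks, best first."""
    scores = {name: count_matches(observed, seq, tol)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical database and a simulated peak list for illustration only.
database = {"protein_A": "GAKMDKAGKPLR", "protein_B": "WWKYYR"}
observed_peaks = [mh_plus("GAK"), mh_plus("MDK")]
ranking = rank_candidates(observed_peaks, database)
```

Real search tools add missed cleavages, modifications, and probabilistic scoring, but the core comparison of measured against theoretical peptide masses is the same.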
Frequently touted limitations of the 2-D gel-MS paradigm include the under-representation of proteins at the extremes of pI (very acidic or basic proteins), and poor representation of hydrophobic proteins, in particular those at extremes of the molecular weight range (low- and high-mass proteins). However, arguably the key limitation of the 2-D gel approach has been the generally poor reproducibility of gel runs, making it difficult to compare results across gels.
Therefore, with traditional 2-D gel technology, when comparing a patient sample to a control, or when comparing two patient samples, each sample would have to be run on its own gel and the gel patterns compared. Because of the intrinsic variability in the gel-preparation process, the same protein may not migrate to the same location on two different gels due to differences in gel composition or run conditions, and this presents problems when attempting to overlay gel spot patterns and determine which spot on one gel corresponds to the same protein on another.
In our laboratory we routinely use a second-generation 2-D gel paradigm called difference gel electrophoresis (DIGE). In its simplest form, the DIGE approach, now commercialized by GE Healthcare (www.gehealthcare.com), involves labeling two distinct protein mixtures with two different cyanine dyes, each of which fluoresces at a distinct wavelength. The labeled protein samples are then separated on a single 2-D gel. The size- and charge-matched dyes allow for co-migration of identical proteins.
DIGE reduces experimental variation and enables direct comparisons of protein expression between samples. A third cyanine dye can also be used to label a pooled sample, which acts as an internal standard to control for inter-gel variation and improve quantitative precision across gels. This pooled standard is run together with the various samples on every gel.
Fluorescent laser scanning using two or three excitation and emission wavelengths detects the signals emitted by the dyes and generates distinct images that can then be superimposed and aligned through pixel matching. This approach minimizes experimental variability and leads to both qualitative and quantitative advantages.
In particular, the power of DIGE lies in its ability to reduce system variability and accurately represent differences in protein abundance, yielding statistically meaningful results that unmask true biological variability.
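The role of the pooled internal standard can be illustrated with a short Python sketch. It assumes that each spot volume is divided by the volume of the same spot in the pooled-standard channel of the same gel, and that the resulting log2 ratios are then averaged within each group. The function names, group labels, and spot volumes are hypothetical, and commercial DIGE software applies additional normalization and statistics not shown here.

```python
import math
from statistics import mean


def standardized_log_abundance(spot_volume, standard_volume):
    """log2 of a spot volume relative to the pooled standard on the same gel."""
    return math.log2(spot_volume / standard_volume)


def group_log2_fold_change(measurements, case="disease", reference="control"):
    """Mean standardized log2 abundance of `case` minus that of `reference`.

    measurements: iterable of (group, spot_volume, standard_volume) tuples,
    one per sample, where standard_volume comes from the same gel.
    """
    by_group = {}
    for group, volume, standard in measurements:
        by_group.setdefault(group, []).append(
            standardized_log_abundance(volume, standard))
    return mean(by_group[case]) - mean(by_group[reference])


# Hypothetical volumes for one matched spot across two gels. Gel 1 runs
# brighter overall than gel 2, but dividing by the standard removes that.
measurements = [
    ("control", 100.0, 100.0),  # gel 1
    ("disease", 400.0, 200.0),  # gel 2
    ("control", 50.0, 50.0),    # gel 3
    ("disease", 100.0, 50.0),   # gel 4
]
fold_change = group_log2_fold_change(measurements)
```

Because every sample is expressed relative to the same pooled material, gel-to-gel differences in loading or staining intensity largely cancel out before groups are compared.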
Sample Fractionation Strategies
Regardless of the analytical approach that is adopted, analysis of samples, such as serum and plasma, is complicated because of large quantities of several high-abundance proteins such as albumin. These proteins drown out other components and confound our ability to detect low-abundance proteins that may be of particular biological significance.
Despite the proven value of 2-D gel electrophoresis, the fact remains that without recourse to some form of prefractionation, gels of complex samples such as plasma saturate in some regions and the low abundance components are lost to the approach. Under the best of conditions only about 1,000 to 2,000 discrete protein spots are visible.
This represents only a small subset of the whole proteome and gives a biased view of the wide spectrum of proteins present in the sample, a view skewed toward the high-abundance proteins. Devising methods and technologies that cut deeper into the proteome and expand the range of proteins available to interrogate represents a significant and critically important challenge.
A plethora of commercial work-arounds for this problem are now emerging, each aimed at unmasking low-abundance proteins. The most widely adopted of these are based on immunoaffinity chromatography designed to bind abundant protein moieties to a column, allowing for the elution and collection of a less complex protein mixture. We have carefully evaluated this approach and elected to employ other strategies that fractionate components but do not selectively remove the abundant compounds (and what they may be bound to).
For example, by dividing the pI range into several narrower intervals it is possible to expand the x-axis by several-fold without having to adopt a larger gel format. This improves resolution and routinely delivers a more comprehensive representation of the proteome.
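The arithmetic behind this gain is simple and can be sketched in Python. The strip length and pH intervals below are hypothetical values chosen only for illustration; the sketch assumes a linear pH gradient and strips of equal physical length.

```python
def cm_per_ph_unit(strip_length_cm, ph_low, ph_high):
    """Physical gel distance available per pH unit on a linear gradient."""
    return strip_length_cm / (ph_high - ph_low)


def expansion_factor(broad_range, narrow_range):
    """How many times more distance per pH unit the narrow strip provides,
    assuming both strips have the same length."""
    broad_span = broad_range[1] - broad_range[0]
    narrow_span = narrow_range[1] - narrow_range[0]
    return broad_span / narrow_span


# Hypothetical example: a 24 cm strip covering pH 3 to 11 versus the same
# length of strip covering only pH 4 to 7.
broad_resolution = cm_per_ph_unit(24.0, 3.0, 11.0)
narrow_resolution = cm_per_ph_unit(24.0, 4.0, 7.0)
gain = expansion_factor((3.0, 11.0), (4.0, 7.0))
```

Covering the full pI range then requires several such narrow-range gels per sample, which is the trade-off for the added resolution.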
Simply expanding the x dimension by employing narrow range pI strips helps, but is also ultimately limited in its potential. We are therefore adopting a third technology for fractionating protein mixtures based on multidimensional liquid chromatography prior to 2-D gel analysis. We employ an LC system incorporating multiple sequential columns, the Ettan MDLC (GE Healthcare), to achieve reproducible separation of proteins based on more than one physical property. This approach offers the opportunity to tailor the separation strategy to the characteristics of each sample without being limited to pI and molecular weight as the sole separation parameters.
By adopting this approach we can improve our coverage of the proteome and gain insights into changes in expression at levels previously unobtainable. Further, because the entire strategy is based on the separation of intact proteins, important information about protein isoforms is available.
It is increasingly evident that different instrument platforms and approaches will have advantages for studying different types of samples and various subsets of the proteome. Nevertheless, 2-D gel electrophoresis remains one of the most powerful tools in the proteomics arsenal, especially when recent enhancements such as DIGE are adopted. Through these strategies, gel-to-gel variability can be minimized and quantitative precision is obtainable on intact proteins (and their isoforms).
Emerging approaches to sample prefractionation, including immuno-depletion methods and multidimensional LC, offer additional benefits because they allow researchers to cut deeper into the proteome to recover information that was previously unobtainable.
The importance of these emerging strategies in human biomarker and drug discovery studies will become increasingly evident as data accumulate on the role of specific protein isoforms in human health and disease. 2-D gels allow researchers to generate information on each distinct gene product present in a sample and identify unique proteins formed as a result of truncation, alternative splicing, and a range of post-translational modifications, including phosphorylation, glycosylation, and sulfation, which may transform a single gene product into multiple chemically and possibly functionally distinct entities.
While complementary strategies based on proteolytic cleavage early in the analytical process will remain indispensable tools, they will have to be employed hand-in-hand with 2-D gels or similar strategies to ensure that the biological complexity that distinguishes the proteome from the genome is brought to light.
For the foreseeable future, the development of protein biomarkers for use in drug discovery and in guiding diagnostic and treatment decision-making in clinical medicine will continue to benefit from advances in protein separation and detection methods and in the innovative use of combinations of diverse instrument platforms.