August 1, 2006 (Vol. 26, No. 14)
Mass-Directed Screening to Identify Low-Level Differential Expression in Complex Samples
Biomarker discovery has become one of the major segments of proteomics research in academic, government, biotechnology, and pharmaceutical laboratories. The interest in biomarkers is being driven by the anticipation that they can be used for the detection, diagnosis, treatment, monitoring, and prognosis of many different diseases (e.g., oncologic, cardiologic, and neurologic diseases). Biomarkers are also being used in the drug discovery process, especially in studies of the potential toxicity of new drug candidates.
Most proteomics methods used to discover biomarkers are based on differential protein expression. This experimental approach compares changes in protein concentrations between a control and a sample. The expectation is that changes in protein expression levels are indicative of up- or down-regulation of a protein and thus will provide significant cell pathway, signaling, and biomarker information.
Mass Spectrometry-based Methods
Historically, two mass spectrometry (MS)-based methods have been used to study protein differential expression in complex biological samples: 2-D gel electrophoresis (2DGE) combined with matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS), and 2-D liquid chromatography (2DLC)/MS/MS.
Each of these MS-based methods has advantages and disadvantages. The 2DGE approach does not separate proteins of very high or very low molecular weight, or proteins that are very hydrophobic or very hydrophilic. It is also not uncommon for an apparently well-separated 2DGE spot to contain several proteins. This makes the identification of differentially expressed proteins by MALDI-MS more difficult, because the resulting mass spectra represent a mixture of proteins rather than a single protein.
The 2DLC/MS/MS method may require 24 hours of ion exchange and reversed-phase chromatography to separate a complex sample. Another disadvantage, the inability to detect low-abundance biomarkers, is a consequence of MS/MS scan cycle times. It may take several seconds to perform a complete scan cycle using some MS/MS systems. During this time the LC effluent continues to flow into the MS, and low-concentration peptide ions present in the initial mass spectrum may not be selected for MS/MS acquisition. The end result is that lower-level peptides (and their corresponding proteins) may not be identified in complex samples.
A Mass-directed Biomarker Discovery Workflow
A new MS-based biomarker discovery method uses a directed screening approach and incorporates two MS systems, a time-of-flight (TOF) MS system and an ion trap (Trap) MS system. The TOF MS is used to screen samples for peptide ions that are indicators of protein up- or down-regulation, when compared to a control. In the next step, a Trap MS system scans only the up- and down-regulated ions discovered by the TOF to identify the up- or down-regulated proteins in the sample. This mass-directed scanning strategy optimizes the scanning efficiency of the Trap MS system and leads to more efficient and accurate protein identifications.
After initial sample preparation, fractionation, and digestion, LC/TOF MS analyses are performed on the control and sample replicates. Typical LC separations using the mass-directed approach take between one and two hours. A profile of all ions detected in the sample and control is produced by the TOF MS. The profiles from the control and sample are compared using software capable of profile analysis, such as Agilent’s (www.agilent.com) MassHunter Profiling software, which performs several iterative analyses on the data to extract common features (e.g., retention time, mass-to-charge ratios) from each set of data. The software then produces a list of all features (sets of related ions) that are up- or down-regulated in the sample versus control.
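The comparison step described above can be sketched in a few lines. This is a minimal illustration only, not the algorithm used by the MassHunter Profiling software: the `Feature` class, the function names, the matching tolerances, and the example values are all assumptions made for the sketch. It pairs features from a control and a sample by mass and retention time, then reports those whose abundance ratio crosses a fold-change threshold.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    rt_min: float      # retention time in minutes
    mass: float        # measured mass (or m/z) of the feature
    abundance: float   # summed ion abundance for the feature


def same_feature(a: Feature, b: Feature, ppm_tol: float, rt_tol_min: float) -> bool:
    """True if two features agree in mass (ppm) and retention time (minutes)."""
    ppm = abs(a.mass - b.mass) / b.mass * 1e6
    return ppm <= ppm_tol and abs(a.rt_min - b.rt_min) <= rt_tol_min


def differential_features(
    control: List[Feature],
    sample: List[Feature],
    min_fold: float = 2.0,      # hypothetical threshold
    ppm_tol: float = 5.0,       # hypothetical mass tolerance
    rt_tol_min: float = 0.25,   # hypothetical retention time tolerance
) -> List[Tuple[Feature, float, str]]:
    """Return (sample feature, fold change, 'up' or 'down') for regulated features."""
    results = []
    for s in sample:
        match = next(
            (c for c in control if same_feature(s, c, ppm_tol, rt_tol_min)), None
        )
        if match is None or match.abundance <= 0:
            continue  # features unique to one run are ignored in this sketch
        fold = s.abundance / match.abundance
        if fold >= min_fold:
            results.append((s, fold, "up"))
        elif fold <= 1.0 / min_fold:
            results.append((s, fold, "down"))
    return results


# Illustrative values only; these are not data from the study.
control = [Feature(9.20, 504.2506, 1000.0),
           Feature(30.00, 800.4000, 5000.0),
           Feature(45.00, 650.3000, 2000.0)]
sample = [Feature(9.25, 504.2510, 4000.0),
          Feature(30.10, 800.4010, 5200.0),
          Feature(45.05, 650.3005, 500.0)]

regulated = differential_features(control, sample)
for feature, fold, direction in regulated:
    print(f"{feature.mass:.4f} at {feature.rt_min:.2f} min: {fold:.2f}x ({direction})")
```

A production tool would also need to handle replicates, normalization, and features present in only one run, none of which is attempted here.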
The LC retention time window and the exact mass of the appropriate target ions are then transferred to the Trap MS system. The Trap MS system is directed to acquire MS/MS spectra on the up- and down-regulated peptides that were detected by the TOF. This targeted MS/MS approach ensures acquisition of MS/MS spectra of the highest possible quality from the targeted peptides, facilitating identification of differentially expressed proteins.
There are several instrument parameters that have to be controlled for this method to work successfully. The TOF should have high mass accuracy (2 ppm) and high resolution (>10,000) to ensure that the detected ions in the control and sample represent the same peptide species. It is also critical to ensure that the LC conditions for the profiling and identification experiments are reproducible and reliable. This can be accomplished by using a chip-based LC/MS interface that ensures reproducible flow conditions and improves sensitivity and reliability.
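To make the mass accuracy figure concrete, the sketch below converts a ppm tolerance into an absolute m/z window. The function names are hypothetical, and the m/z value is the serotransferrin ion discussed later in this article; the 2 ppm tolerance is the figure quoted above.

```python
def ppm_error(measured_mz: float, reference_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def mass_window(mz: float, ppm_tol: float) -> tuple:
    """Lower and upper m/z bounds for a given ppm tolerance."""
    delta = mz * ppm_tol * 1e-6
    return mz - delta, mz + delta


low, high = mass_window(504.2506, 2.0)
print(f"2 ppm window: {low:.4f} to {high:.4f}")
print(f"half-width: {(high - low) / 2 * 1000:.3f} mDa")
```

At m/z 504.2506, 2 ppm corresponds to a half-width of roughly 1 mDa, which gives a sense of how tightly two measurements must agree before they are treated as the same peptide species.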
Examples of Mass-directed Discovery
An example of a mass-directed biomarker discovery is illustrated by the analysis of an E. coli lysate spiked with known amounts of bovine serum albumin and serotransferrin (Table). This model simulates the up- and down-regulation found in complex biological samples. E. coli lysate, bovine serum albumin (BSA), and serotransferrin were digested with trypsin and analyzed using an Agilent 1200 Series HPLC-Chip/MS system interfaced to an Agilent 6210 TOF mass spectrometer. The HPLC-Chip configuration included a 40-nL enrichment column and a 150 mm x 75-µm analytical column packed with ZORBAX 300SB-C18, 5-µm material. A 100-minute-long gradient method was used. The solvents employed were: (A) 0.1% formic acid in water and (B) 0.1% formic acid in 90% acetonitrile/water. After initial loading at 3% (B), the gradient stepped to 8% (B) at 0.5 minutes, then 45% (B) at 85 minutes, 80% (B) from 90 to 92 minutes and back to 3% (B) at 92.01 minutes. The analytical flow rate was 300 nL/min, and the sample was loaded at 4 nL/min.
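The gradient program above can be written as a timetable and queried for the solvent composition at any time point. The sketch assumes linear ramps between the listed time points and a hold at 3% (B) until the end of the 100-minute method; the article does not state the ramp shape, so treat both as assumptions. The names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in minutes, % solvent B), taken from the method description.
# Linear interpolation between points is an assumption of this sketch.
GRADIENT = [
    (0.0, 3.0),
    (0.5, 8.0),
    (85.0, 45.0),
    (90.0, 80.0),
    (92.0, 80.0),
    (92.01, 3.0),
    (100.0, 3.0),
]


def percent_b(t_min: float, table=GRADIENT) -> float:
    """Return % B at time t_min by linear interpolation of the timetable."""
    if t_min <= table[0][0]:
        return table[0][1]
    if t_min >= table[-1][0]:
        return table[-1][1]
    times = [t for t, _ in table]
    i = bisect_right(times, t_min)  # index of the first timetable entry after t_min
    (t0, b0), (t1, b1) = table[i - 1], table[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0.5, 42.75, 91.0, 95.0):
    print(f"{t:6.2f} min: {percent_b(t):5.1f}% B")
```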
Accurate mass data from the TOF mass spectrometer were extracted and evaluated using the Agilent MassHunter Profiling software. Targeted LC/MS/MS data were obtained using an HPLC-Chip/MS system interfaced to an Agilent 6330 Ion Trap LC/MS. Peptides were identified using the Agilent Spectrum Mill for MassHunter Workstation software to search the SwissProt protein database.
Figure 1a shows a total ion chromatogram for one of the spiked E. coli samples and demonstrates the complexity found in biologically based samples. Figure 1b depicts an extracted ion chromatogram (EIC) for m/z 504.2506 ± 1.9 ppm, a singly charged ion from the serotransferrin digest.
The multiple peaks for this exact mass further demonstrate the sample complexity. An MS spectrum obtained at 9.2 minutes shows that the 504.2506 ion has a relatively low abundance (Figure 1c).
Most likely this ion would not have been chosen for MS/MS analysis using data-dependent logic in a typical MS/MS experiment, since many other ions in the spectrum have significantly higher abundances. This is the typical situation for many biomarkers and is the fundamental reason why many undirected MS/MS experiments fail to discover low-level protein biomarkers.
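The selection problem described above can be illustrated with a simple "top N" rule, a common form of data-dependent logic in which only the N most abundant ions in each survey spectrum are sent for MS/MS. This is a generic sketch, not the logic of any particular instrument. Apart from m/z 504.2506, every m/z value and abundance below is invented for illustration, and the function name is hypothetical.

```python
def top_n_precursors(spectrum: dict, n: int = 3) -> list:
    """Pick the n most abundant ions from a survey spectrum (m/z -> abundance)."""
    return sorted(spectrum, key=spectrum.get, reverse=True)[:n]


# Hypothetical survey spectrum: one low-abundance target among stronger ions.
SPECTRUM = {
    504.2506: 1.2e3,   # target ion from the article; abundance is made up
    622.31: 9.0e4,     # invented background ions
    701.88: 7.5e4,
    815.42: 6.1e4,
    933.47: 4.0e4,
}

selected = top_n_precursors(SPECTRUM, n=3)
print("selected for MS/MS:", selected)
print("target selected:", 504.2506 in selected)
```

With these numbers the target ion ranks last in abundance, so it is never fragmented unless N is raised to cover every ion in the spectrum, which the scan cycle time usually does not allow.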
One of the major challenges in mass-directed biomarker discovery is the analysis of the TOF MS data. Thousands of ion profiles are collected from the control and sample during the LC/MS acquisition. To handle a data set of this size, algorithms were designed to automatically determine the features (sets of ions from the same peptide) at various retention times.
Next, the extracted features are compared between samples to determine which peptides are up- or down-regulated relative to the control. Ultimately, a plot of up- and down-regulated features is produced (Figure 2). The MassHunter Profiling software identified 22 features (sets of ions) that were up- or down-regulated by 2x and 4x when compared to the control. These data demonstrate the ability of a mass-directed strategy to discover differential expression levels between controls and samples.
The next step in the mass-directed method for biomarker discovery is to use the retention time and mass information from the initial TOF analysis for target identification using a Trap MS. The TOF mass information was imported into the Trap MS control software as an "include" mass list for MS/MS acquisition. At the appropriate retention time windows, the ion trap obtained MS/MS spectra for the ions included in the imported mass list.
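The hand-off from the TOF results to the ion trap can be pictured as building a list of target masses, each with a retention time window, and then checking which targets are active at a given point in the run. The sketch below is an assumption-laden illustration: it does not reproduce the file format or behavior of the Trap MS control software, and the class name, function names, window half-width, and the second target are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class IncludeEntry:
    mz: float            # target m/z from the TOF profiling step
    rt_start_min: float  # start of the acquisition window, minutes
    rt_end_min: float    # end of the acquisition window, minutes


def build_include_list(
    targets: Iterable[Tuple[float, float]],   # (m/z, retention time in minutes)
    rt_half_width_min: float = 0.5,           # hypothetical window half-width
) -> List[IncludeEntry]:
    """Turn (m/z, RT) pairs into include-list entries sorted by window start."""
    entries = [
        IncludeEntry(mz, max(0.0, rt - rt_half_width_min), rt + rt_half_width_min)
        for mz, rt in targets
    ]
    return sorted(entries, key=lambda e: e.rt_start_min)


def entries_active_at(include_list: List[IncludeEntry], rt_min: float) -> List[IncludeEntry]:
    """Entries whose retention time window contains rt_min."""
    return [e for e in include_list if e.rt_start_min <= rt_min <= e.rt_end_min]


# 504.2506 at 9.2 min is from the article; the second target is invented.
include_list = build_include_list([(800.4000, 30.0), (504.2506, 9.2)])
for entry in entries_active_at(include_list, 9.0):
    print(f"acquire MS/MS on m/z {entry.mz:.4f}")
```

Restricting each target to its own window keeps the number of simultaneously active targets small, which is the property the article relies on when it describes the approach as maximizing the acquisition duty cycle.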
This retention time and mass-directed approach ensures that the Trap MS will collect MS/MS information for the up- and down-regulated peptides regardless of their relative abundance. In addition, this approach maximizes the acquisition duty-cycle and therefore the quality of the MS/MS spectra of the targeted peptides.
The MS/MS data were then processed using Spectrum Mill for MassHunter Workstation software to search the SwissProt protein database. The MS/MS data from this targeted identification strategy resulted in the correct identification of the BSA and serotransferrin peptides in Samples A and B (Figure 3).
Conclusion
Biomarker discovery using MS techniques requires sensitivity, mass accuracy, and reproducibility (Figure 4). The results of this study demonstrate that a mass-directed approach using TOF and ion trap mass spectrometers is a powerful method for identifying low-level differential expression in complex samples. It also allows targeted identification of differentially expressed proteins by MS/MS with greater efficiency than methods based on traditional data-dependent scan techniques or 2DGE methods that are not amenable to high-throughput biomarker discovery.