In the post-genome era, the use of molecular biomarkers is becoming de rigueur for R&D programs. The biomarker testing market has a proven record of revenue generation ($612 million in 2007) and is estimated to have an annual growth rate of 23.5% based on currently available biomarker assays. Thus both need and return on investment are driving the development of new technology and faster, more cost-effective biomarker development workflows.
Proteomic technologies have been used successfully in past biomarker discovery projects, producing many candidate protein biomarkers. Further verification work, however, is typically limited by the small number of proteins for which assays are commercially available (about 500 human proteins). With traditional antibody-based approaches, it could take years and cost millions of dollars to develop assays for all of the candidate protein markers.
NextGen Sciences devised a workflow for development of protein biomarker assays, overcoming the bottleneck with the antibody-based approach (Figure 1).
Biomarker discovery platforms must be able to detect a large number of protein species in the sample type of interest and at the same time provide the protein information necessary for assay development. A platform that meets these criteria is commonly referred to as GeLC-MS (Figure 2): a combination of SDS-PAGE for protein separation and fractionation and liquid chromatography-tandem mass spectrometry (LC/MS/MS) for detection and quantitation of proteins via proteolytically derived peptides.
The process involves quantitative characterization of the proteome of a biological sample. Sample preparation methods vary depending on the nature of the sample. The proteins in each sample are then separated by SDS-PAGE. Each lane on the gel is then segmented (typically 24–40 segments) before enzymatic digestion. Peptides from each digest are analyzed by mass spectrometry (LC/MS/MS). The top graph of Figure 2C shows a typical chromatogram for a one-hour LC gradient, while the bottom graph shows an example of MS/MS (product ion) data for one peptide.
Approximately 50,000 spectra corresponding to 25,000 unique peptides are typically detected per lane. The data is then searched against appropriate protein databases; after false discovery rate analysis, a nonredundant list of proteins is compiled using commercially available tools. The end product is a list of the proteins identified, with quantitative data for each protein in the sample. Statistical analysis is applied to the quantitative data, providing fold changes across groups, coefficients of variation, p-values, principal component analysis, and hierarchical clustering.
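As a rough illustration of the per-protein statistics mentioned above, the following Python sketch computes a fold change between two groups, a coefficient of variation, and a p-value. It is not the workflow's actual software, which the article does not describe. The function names, the intensity values, and the choice of an exact permutation test for the p-value are all assumptions made for the example.

```python
from itertools import combinations
from math import log2
from statistics import mean, stdev


def fold_change(group_a, group_b):
    """Return the ratio of mean intensities (group_b over group_a) and its log2."""
    ratio = mean(group_b) / mean(group_a)
    return ratio, log2(ratio)


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means.

    Every way of splitting the pooled values into groups of the original
    sizes is enumerated, so this is only practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_b) - mean(group_a))

    hits = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs((total - sum_a) / n_b - sum_a / n_a)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
        splits += 1
    return hits / splits


if __name__ == "__main__":
    # Hypothetical intensities for one protein in two sample groups.
    control = [100.0, 110.0, 90.0, 105.0]
    disease = [210.0, 190.0, 200.0, 220.0]

    ratio, log_ratio = fold_change(control, disease)
    print(f"fold change: {ratio:.2f} (log2 {log_ratio:.2f})")
    print(f"control CV: {coefficient_of_variation(control):.1f}%")
    print(f"p-value: {permutation_p_value(control, disease):.4f}")
```

A permutation test is used here only because it needs no external statistics library; in practice a t-test or similar method, with multiple-testing correction across all proteins, may be more appropriate.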
The GeLC-MS output includes data about the peptides matched to each protein and its isoforms present in the sample. Peptide information includes amino acid sequence, product ion data, chromatographic performance, and ion intensity. The data from the GeLC-MS platform thus provides not only the quantitation needed to select putative biomarkers but also the empirical peptide information required for assay development.