March 1, 2009 (Vol. 29, No. 5)

N. Leigh Anderson

Improved Instrumentation and Unbiased Samples Renew Promise of Biomarker Pipeline

Given that proteins are the primary working parts of cells, it seems self-evident that proteomics should yield abundant clues to disease mechanisms, as well as numerous clinically useful biomarkers. Biomarkers in readily accessible bodily fluids such as plasma, in particular, offer the potential for rapid advances in patient care through early diagnosis, selection and monitoring of treatment, as well as acceleration of drug development.

It is therefore surprising that the rate at which new protein diagnostics have been approved by the FDA has steadily declined over the past 12 years. In fact, no protein biomarkers arising from proteomics appear to have entered broad clinical use so far. Why is this?

From the viewpoint of basic biology, biomarker development is probably as difficult as drug development. Both aim at discovery and verification of a disease-related biological invariant across an outbred human population, and they seem to have similar candidate attrition rates. Given the 100:1 revenue advantage of pharma over protein diagnostics, one is tempted to imagine that biomarker work should nevertheless be far less expensive and take less time than drug development. That is wishful thinking.

Antibodies, a key component in protein diagnostics, are critical for demonstrating the presence and subcellular location of proteins.

SELDI Fiasco

Not so long ago (~2002), finding new cancer biomarkers in serum was made to look easy by applying an astonishingly simple new proteomics platform to a few samples from diseased patients and samples from a few healthy controls. This approach, commonly referred to as SELDI, combined three novel technology components, each of which is now known to be problematic.

The result was a general failure caused by biases in the data (in this case, machine drift between runs of cases and controls; in other cases, sample processing and/or patient group selection). Consequently, when analyses were repeated at other sites, candidate disease patterns failed to replicate. Despite the efforts of dedicated biotech companies and two of the largest clinical reference laboratories, SELDI tests for cancer have still not gained FDA approval.

While the reasons for this debacle are now well understood and useful elements of the approach have been redeveloped in more rigorous form, clinical proteomics is only now recovering from the “SELDI bubble” caused by the initial excitement over this approach. Fortunately, substantial parallel advances have been made (albeit with far less hype) in understanding critical sample requirements, in improving the performance of advanced mass spectrometry (MS) instrumentation, and in understanding the appropriate structure for a real biomarker pipeline.

It is difficult to overstate the importance of samples and experimental design in the operation of a biomarker pipeline. Two major factors arise: quality of the samples and number of samples.

High-quality samples are collected in such a way that there is no difference in collection or processing between groups: i.e., no bias. The typical number of samples required to convince diagnostic professionals that a biomarker is likely to have clinical utility (the Zolg number) is about 1,500, and technology platforms that cannot analyze this number of samples are not really usable in the later stages of biomarker verification.

Enhanced Equipment

On the instrumentation front, two streams of technology have emerged as critical to the biomarker effort. On the one hand, the increasing resolution, sensitivity, and speed of high-end mass spectrometers now enable the detection of tens of thousands of tryptic peptides (and, by inference, the thousands of proteins from which they came) in complex biological samples.

On the other hand, a separate stream of quantitative MS technology, which measures preselected peptide ions based on two mass parameters (the parent peptide and a specific sequence fragment, in so-called multiple-reaction monitoring, or MRM, mode), provides a capability to accurately (~10% CV) quantitate 100 or more peptides at much higher throughput.
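
The ~10% CV figure refers to the coefficient of variation: the standard deviation of replicate measurements expressed as a percentage of their mean. A minimal Python sketch of that calculation follows. The function name and the replicate values are hypothetical and are not taken from any MRM data set.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the percent CV: sample standard deviation / mean * 100."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / m * 100.0


# Hypothetical replicate measurements of one peptide (arbitrary units),
# invented purely for illustration.
replicates = [0.95, 1.05, 1.10, 0.90, 1.00]
cv_percent = coefficient_of_variation(replicates)
print(f"CV = {cv_percent:.1f}%")  # prints "CV = 7.9%"
```

In this sketch, a peptide whose replicates scatter by about 8% around their mean would fall within the roughly 10% CV range cited above.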

The sensitivity of multiplex MRM technology can be extended down to the ng/mL level and below by abundant protein depletion combined with limited fractionation or specific capture of the target peptides on antipeptide antibodies (the SISCAPA technique), covering a majority of the known biomarker proteins detected in blood plasma.

It is now clear that a functional biomarker pipeline needs both of these approaches: shotgun methods to search large numbers of peptides and proteins for potential disease-related differences, albeit with a high false-discovery rate, and MRM methods to construct accurate high-throughput assays to be applied to relevant Zolg-scale sample sets to verify performance in real populations.

Evaluating, optimizing, and implementing these and other recent advances are critical to solving the general biomarker problem. To do so, however, requires enlarging the focus of our efforts from technology-centric academic proteomics to a multidisciplinary (though possibly virtual) biomarker pipeline.

Interrelated Efforts

Productive relationships must be forged between disparate technology platforms and between technological, medical/biological, and statistical specialties. The U.S. National Cancer Institute is attempting to create a nucleus for this new approach in its Clinical Proteomic Technology Assessment for Cancer (CPTAC) program within the broader Clinical Proteomic Technologies for Cancer initiative.

Beginning with a critical evaluation of existing technology and methods, the CPTAC teams have designed and carried out true multisite reproducibility studies of both approaches: shotgun unbiased discovery and targeted MRM assays. The results, recently presented and now submitted for publication, are revealing.

As had been expected based on earlier, less well-controlled studies (e.g., the HUPO plasma proteome exercise), the shotgun approaches produce a statistical sample of the peptides in the proteome under study and thus often show significant differences in the sets of peptides from run to run, both within and between laboratories (with greater similarity at the protein level).

The need for replicate runs to approach asymptotic completeness in proteome coverage is thus an inherent statistical feature of the method. The targeted MRM assays, on the other hand, derived from a widely used accurate quantitation approach for small molecules, can yield results that are accurate, reproducible (in this case across eight sites), and of wide dynamic range provided we restrict attention to a set of up to several hundred prespecified peptides.
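
The statistical argument for replicate runs can be illustrated with a deliberately simple model, offered here as an assumption rather than as a description of the CPTAC analysis: suppose every peptide is detected independently in each run with the same probability p. Expected cumulative coverage after n runs is then 1 - (1 - p)^n, which approaches completeness but never quite reaches it. Real peptides differ widely in detectability, so the numbers below are illustrative only, and the function names are hypothetical.

```python
def expected_coverage(p_detect, n_runs):
    """Expected fraction of peptides seen at least once in n_runs runs.

    Assumes each peptide is detected independently in every run with
    the same probability p_detect (a simplifying assumption).
    """
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    if n_runs < 0:
        raise ValueError("n_runs must be non-negative")
    return 1.0 - (1.0 - p_detect) ** n_runs


def runs_needed(p_detect, target):
    """Smallest number of runs whose expected coverage reaches target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while expected_coverage(p_detect, n) < target:
        n += 1
    return n


# With an assumed 50% per-run detection probability, coverage climbs
# quickly at first and then flattens as it nears completeness.
for n in range(1, 6):
    print(n, expected_coverage(0.5, n))

print(runs_needed(0.5, 0.95))  # prints 5
```

Under this toy model, each additional run recovers a shrinking share of the remaining peptides, which is one way to read the phrase "asymptotic completeness."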

These studies have confirmed the roles and fitness of these two approaches for discovery and verification of candidates in the biomarker pipeline, and provide confidence that both can be practiced effectively in multiple laboratories.

In parallel, it appears that MS-based measurements can deliver high-quality results in clinical laboratories as well. A specific example highlighting these issues is the clinical assay for plasma thyroglobulin, a thyroid-specific protein used to detect recurrence of thyroid cancer in patients whose diseased thyroids have been removed.

Andrew Hoofnagle, M.D., Ph.D., recently demonstrated an MS-based SISCAPA assay for peptides from thyroglobulin designed to circumvent several well-known and high-prevalence interferences plaguing the existing commercial immunoassays for this protein.

The prospects for major progress in protein biomarkers in readily accessible bodily fluids thus appear considerably brighter than even a year or two ago. The clinical and economic value of early detection of diseases like cancer, COPD, or Alzheimer’s is so great that once a real biomarker pipeline is proven, and the odds shifted from no chance to merely long, a strong case exists for more equal emphasis on the development of biomarkers and drugs.

N. Leigh Anderson, Ph.D., is founder and CEO of the Plasma Proteome Institute.