August 1, 2007 (Vol. 27, No. 14)
Scientists rely on mass spectrometry to identify DNA, RNA, proteins, and naturally occurring metabolites. In proteomics research, mass spec is used to characterize proteins, identify single proteins in a mixture, and quantify cellular or organismal proteins.
A recent market report, “Mass Spectrometry for Protein Biomarker Applications,” from Kalorama Information (www.kaloramainformation.com) points out that mass spec serves as the workhorse for discovering and validating biomarkers in clinical research, whether for drug development, diagnostics, or other applications. Kalorama notes that the market for mass spec in protein biomarker research currently totals $290 million and projects that it will grow to over $745 million by 2010. To obtain a better understanding of the role of mass spec in bioresearch, GEN recently held a roundtable discussion at our office on Current Trends in Mass Spectrometry for Proteomics and Metabolomics. Four mass spec experts took part in our Q&A.
We believe you will find their thoughts and insights, which are based on years of experience, interesting, valuable, and at times provocative.
Q. Why is mass spectrometry such a valuable technique in proteomics and metabolomics research?
Ashok R. Dongre, Ph.D.
One reason is sensitivity and the other is specificity. Those are the two important variables that we need to stress, but with these advantages come challenges as well. Sometimes, mass spec is not as sensitive as some of the other methods. For example, biochemical techniques that utilize antibodies tend to be more sensitive than typical MS-based protein profiling approaches. One has to be concerned about that. That said, a combination of biochemical approaches with mass spectrometry enables high sensitivity as well as exquisite specificity.
Thomas J. Daly, Ph.D.
The ability to deconvolute complex tissue, serum, and plasma samples is another powerful advantage that mass spec brings that few other technologies are able to demonstrate.
I think an important aspect of mass spec is that you don’t need to have a reagent ahead of time. You don’t need an antibody for the specific compound or compounds you are looking for.
Also, if you are looking at posttranslational modifications, you often don’t have reagents such as antibodies available. With mass spec, you can identify those posttranslational sites.
Jens Hoefkens, Ph.D.
When our group looks at mass spec, we focus on biomarker discovery. What we find attractive is, like Ashok said, not only the high specificity and sensitivity but also the ability to observe a broad spectrum of the proteome.
Mass spec is also a highly reproducible technology, which is important to our customers and users and their applications in research and development.
Q. Integrating omics data from various sources remains a major obstacle in biomarker discovery, but this integration is key to getting a handle on biological processes at a number of functional levels. Can you tell us how your company or research group is dealing with this issue?
As a midsize biotech company, we don’t have the resources to build a large in-house proteomics capability, so we start with external collaborative efforts. For our clinical programs, we are talking to and working with several other companies and delivering patient clinical samples in that light.
We then follow up with our own focused in-house efforts to try and validate some of the potential biomarkers that we would see from the clinical samples.
So, we use a series of different techniques, including mass spec. We also rely on Luminex-based bead approaches and other technologies such as microarrays to give us a more global understanding of the data that come out of our clinical trials.
I think that many mid-sized companies integrate data very poorly. We had enough trouble just handling our own proteomic data, let alone trying to integrate it. It is only recently that software has become available to manage all of this data.
Another issue is that there is little correlation between the proteomic and genomic data, and so I have to wonder how valuable it is to try to integrate these data sources.
Our group is heavily involved in toxicogenomics, where the goal is to bring a variety of data sources, including transcriptomics, proteomics, and metabolomics data, together with more traditional measurements like serum chemistry and blood chemistry to try to build better predictive models.
While we initially hoped to see a correlation among the different data streams, we realized that the correlation is not necessarily as good as one might expect. However, when it comes to predictive modeling, that is not necessarily a problem. It just means you have more data on which to build your models and initially that’s good. We have in fact been able to show that the predictive power does increase when basing the models on a larger stream of input data.
I agree with what Gary said. There aren’t effective commercial software/bioinformatics tools available to perform a good integration of protein profiling, RNA-expression profiling, and metabolite profiling datasets to enable comprehensive understanding of biological processes and function.
At Bristol-Myers Squibb, the bioinformatics group is putting a fair degree of effort into integrating different data streams. The data sets are varied and that creates a challenge in itself. With microarrays (RNA profiling), you know the identity of everything that is on that chip. When you are analyzing protein-profiling data, especially when it is global profiling, you are taking your complex protein mixture and breaking it up into small peptides and then looking for what is differentially expressed.
What you get out is a unique mass for that particular peptide ion and its retention time. You don’t necessarily know the peptide’s identity. That comes later.
A mass spec-based metabolite profiling approach is similar to protein profiling. Metabolite identification is typically performed subsequent to the profiling experiment. That said, based on a recent literature review and research presented at ASMS, I think that at the current time a global metabonomic profiling dataset is probably going to be a bit more tractable than a global proteomic profiling dataset.
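As an illustration of the profiling output described in this answer (a peptide-ion mass plus a retention time, with identity assigned later), here is a minimal Python sketch. The `Feature` type, the tolerance values, and the example numbers are all hypothetical and are not taken from the panelists' workflows; real pipelines also align retention times and handle charge states and isotopes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One unidentified LC/MS feature (hypothetical structure)."""
    mz: float         # mass-to-charge ratio of the peptide ion
    rt_min: float     # chromatographic retention time, in minutes
    intensity: float  # measured signal, arbitrary units


def same_feature(a, b, ppm_tol=10.0, rt_tol_min=0.5):
    """Treat two features as the same ion if m/z and retention time agree.

    The default tolerances are illustrative assumptions only.
    """
    ppm_diff = abs(a.mz - b.mz) / a.mz * 1e6
    return ppm_diff <= ppm_tol and abs(a.rt_min - b.rt_min) <= rt_tol_min


def fold_changes(control, treated, **tol):
    """Pair features across two runs; return (m/z, RT, treated/control ratio).

    Features without a partner in the other run are simply skipped here.
    """
    results = []
    for c in control:
        for t in treated:
            if same_feature(c, t, **tol):
                results.append((c.mz, c.rt_min, t.intensity / c.intensity))
                break
    return results


# Made-up example data: one feature is shared between runs, one is not.
control = [Feature(500.2500, 20.0, 1000.0), Feature(650.3000, 35.0, 400.0)]
treated = [Feature(500.2520, 20.2, 3000.0), Feature(720.4000, 35.1, 500.0)]
changes = fold_changes(control, treated)
print(changes)
```

Note that nothing in this sketch knows which peptide or protein a feature belongs to; as the answer says, identification is a separate, later step.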
Q. Many laboratories have been using mass spec as the means to discover new biomarkers, and the success of these efforts relies on effective collaboration between clinicians and mass spectrometrists. What has been your experience in fostering these collaborations?
Very good. The clinicians come to us early. They know that we, as mass spectrometrists, are going to save them huge amounts of money if we can be successful at identifying a biomarker for their clinical trials and identifying which patients the trials should enroll.
So the clinicians are interested, but they do come to us before we are ready with the appropriate tools to deal with their data. Also, from what I’ve seen, the FDA is looking for biomarker information. So, again, the clinicians are coming to us, saying, “We have this new drug on the market and the FDA would like some biomarker information about it. What can you provide for us?”
Again, the issue is do we have the right tools to do the job and, in addition, can we do it in a cost-effective and timely manner?
Timeliness has been a big issue for us. We have identified putative biomarkers but then we have not had the antibodies available to validate them. In the meantime, the next-phase of a client’s clinical trial has started, so what have we really accomplished?
We have to pose the question: “Is this something we can accomplish in a reasonable time?” and, once we get results, “Is anyone going to be able to do something with these data, again in a reasonable time?”
I have had good success in fostering collaborations with clinicians at BMS. We have performed some experimental pharmacogenomic clinical trials where we used proteomics as one of the techniques to identify novel marker candidates. Some of our work will soon be published.
I think proteomics is a relative newcomer when compared to RNA-expression profiling in a clinical setting. Furthermore, proteomics continues to rapidly evolve into a stable profiling platform. This sometimes leads to issues with timeliness and delivery of data/results and can create early disappointments. Clinical trials can run from a few weeks to several months. Typically, clinicians expect results within a few weeks of the close of a clinical trial to meet several key deliverables. So, we need to be careful about setting expectations when we discuss protein-profiling projects with clinicians, pick appropriate projects, and be expeditious with protein-profiling results and deliverables.
I agree with Gary that new tools are being developed. I like to use the analogy that we are trying to build a race car and working on improving it while racing it on the race track.
I think mass spec is going to be the future of how biomarkers (proteomic and metabonomic) will be discovered. But we need to be cautious as well as savvy about how we go about picking appropriate experiments and projects in areas of unmet need. We cannot just perform “quick and dirty” experiments with low-resolution mass spec data and claim success in having identified biomarkers. Rigorous replication and independent validation of results in subsequent studies is necessary to gain further confidence in candidate markers.
Sometimes clinical trials take years. In an experimental pharmacogenomic clinical trial in which I’ve been involved, it has taken over four years just to procure samples. Then it requires about six weeks to get LC/MS profiling data. The data acquisition is typically followed by several exploratory statistical data analyses, a lot of which is data-dredging, and this can take upward of a few months before you actually have a list of candidates that you want to go after and identify. Finally, biological context is typically required to establish further credibility of candidate biomarkers.
Then, we investigate availability of assays for these candidates. If one doesn’t exist, how long is it going to take to put an assay together? A targeted SRM/MRM-based mass spec assay is typically the quickest way. That said, at times these markers may be present at low abundance; as a result, it may not be straightforward to establish a quick targeted mass spec assay. A combination of biochemical approaches with targeted mass spec detection may be required. This assay development approach depends on the availability of biochemical reagents and may add a significant amount of time to assay development/delivery.
Discussing all these issues with clinical colleagues is necessary to build alignment and consensus, and thereby develop a successful collaboration.
Q. How viable is mass spec as a technology for point-of-care diagnostics?
Right now, it is not viable for point-of-care diagnostics. There is a future for it but probably not for another ten years or more. The technology is accelerating and it is moving in the right direction.
Right now, based on the amount of time and energy it takes companies to identify a biomarker, I suspect this information is going to be held close to the vest. So for all the resources that a pharmaceutical company or biotech company is going to put into biomarker research, why would they basically give everybody else a leg up?
So until there is a true collaborative need to put potential biomarkers for various disease states out in the public domain and validate them (which is another issue), it is going to be a long time before biomarkers make a greater impact.
Keep in mind that a single biomarker will not be sufficient. You are going to be looking for groups of five or six biomarkers that all move in the same direction or in a coordinated way to indicate a particular disease state. Will we ever see a mass spec instrument in a doctor’s office? I think that there is some probability of that at some point, although sample handling and processing will need to be simplified.
Totally agree with you. Although mass specs are getting cheaper and easier to use, they are still quite expensive. There is a big problem with proper sample handling and getting the sample prepared for the mass spec. Reproducibility remains difficult to manage with mass spec. I don’t see this technology transitioning soon into a doctor’s office.
While I’m a bit hesitant to predict a timeline, I do think that mass spec is a technology that will lend itself to diagnostic applications. The basic underpinnings of mass spec are relatively straightforward to implement, but in a doctor’s office, I don’t see it happening any time soon. Based on the comparable situation in the field of gene expression and microarrays, I would predict that we are probably still ten years away from widespread diagnostics applications.
There are companies that are trying to put mass spec in a physician’s office. Time will tell whether that is going to be successful or not.
For protein markers, ELISAs are easy to do, though they lack specificity. What mass spec needs to do is define a niche. Take, for example, measuring GLP/GIP levels in a diabetic patient, both the active and the DPP4-clipped forms. This is an example where one could potentially see mass spec tools gaining an edge over an ELISA-based technique and overcoming the difficulty of developing good antibody pairs to measure active versus total.
I see this as an example where one could potentially see a point-of-care diagnostic based on a mass spec platform. That said, I don’t know how long it would take to get us there, because it can be arduous to train people on issues of reliability and reproducibility as they pertain to a mass spec-based approach, especially when it’s compared to just handing someone a 96-well plate, a good set of antibody reagents, and saying, “Let’s get an ELISA together.” Furthermore, biomatrices that require minimal sample processing such as plasma, urine, and saliva will make the best samples for MS-based point-of-care diagnostics.
That said, for metabolomics markers, I anticipate more acceptance and relatively rapid uptake of MS-based point-of-care diagnostics. Typically, sample preparation for small molecule metabolites is straightforward and it is relatively easy to set-up single or multiple analyte MS assays.
Ashok’s mention of generating antibodies for an ELISA is a good point. If you are looking at differences in phosphorylation of a protein or a splice variant or a proteolytically clipped form that occurs in a particular disease state for example, it is going to be difficult to generate antibody pairs in these situations. Mass spec would be a sounder approach to identify these types of biomarkers.
I wanted to follow up on Tom’s comment about having biomarkers in the public domain. There is an NIH Alzheimer’s Disease Neuro-Imaging consortium in which a number of pharma companies are partners. The data that come out of this consortium will be publicly available.
Pharma is beginning to realize that we need to share, and we will eventually have enough data on and validation of novel marker candidates.
Q. A number of scientists have suggested that there has been a shift in mass spec-based proteomics from analyzing selected isolated proteins to now doing proteome-wide analyses. This has led to new challenges in experimental design, data analysis, visualization, and storage. How are your MS proteomics teams dealing with this new approach?
I wish I had one of my flowcharts with me with all the different boxes showing how everything connects and how the various results from mass spec and other technologies, including different visualization and data-analysis tools, are pulled together.
For the most part, with a small or a midsize pharma company, we have had to wait for tools to become available. But it is amazing how much you still can do with an Excel spreadsheet. Data analysis is certainly the area that is continuing to improve, particularly with the use of statistics.
When we moved into mass spec based proteomics, we started to look into that as an area to develop software. We realized that in using LC/MS, people can generate huge quantities of data relatively quickly. What we try to do is to provide MS users with the tools that allow them to make sense of that data because this is a fundamentally different method from the way proteomics used to be done.
We also looked at statistical analysis, and developed tools that allow people to make sense of the data and to focus on what is actually interesting. In addition, we know it is important to provide people with tools that allow them to take results forward and to relate their findings to previously generated results.
This issue of how to do a proteome-wide analysis is an interesting and important question. It reinforces what I previously said about global protein profiling being difficult and complex. It is not just the mass spectrometer and sample prep, but it is also the data analysis. And when I say “analysis,” it includes statistical analysis, experimental design, visualization, identification, and then putting identified candidate markers into biological as well as functional context. As mentioned earlier, I can’t stress enough the need for subsequent independent qualification and validation.
Q. In a recent GEN article, David Clemmer, Ph.D., and his colleagues at Indiana University said that there is little variability in high-abundance proteins and it appears it is the lower-abundance components of the proteome that play the most significant role in determining unique features of individuals. How do scientists in your group tackle the difficulty of isolating and purifying low-abundance proteins?
This is one of the critical hurdles in this field. The large dynamic range of proteins in serum, for example, really hinders getting at low-abundance biomarkers due to contamination of samples with higher-abundance species. In addition, unless the sensitivity of mass spec instrumentation continues to improve, the sample volumes required for analysis will be too large.
Right now, we first try to remove the highest-abundance proteins. Then we employ several orthogonal fractionation approaches to try to simplify protein pool complexity. Of course, all of the sample manipulation results in significant losses. But these fractionation methodologies are critical to concentrate proteins for analysis.
If we have a strong idea of what we are looking for, we can employ affinity chromatography or immunological methods to support our work. Going forward, however, high-resolution preparative technologies will also need to continue to develop to support mass spec based biomarker identification. In the end, a combination of improved instrument sensitivity, high-resolution purification methods, and creative insights will win the day.
Global as well as targeted protein-profiling techniques are dogged by this question. This issue is particularly acute as there isn’t a PCR for proteins. The concentration range of the proteome can vary from 10⁶–10¹² depending on biomatrix under investigation. Mass spectrometry typically is able to probe the 10⁴ range from high to low abundance. Depending on the biomatrix under investigation and objectives of the study, a multitude of separation techniques are utilized.
When performing global plasma protein profiling, we typically employ immuno-depletion columns from various commercial vendors. If necessary, immunodepleted samples can be further fractionated by ion-exchange chromatography followed by RPLC/MS. If the objective of the study focuses on peptide/small protein (<18kD) profiling, we employ ultrafiltration based enrichment strategies or solid-phase extraction and sizing-chromatography techniques. With targeted profiling approaches, one can utilize specific-antibody-based immunoprecipitation approaches to isolate low-abundance proteins. So in essence, the objectives of the study dictate the separation and isolation approaches used to purify low abundance proteins for mass spec analyses.
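To make the dynamic-range argument in this answer concrete, the short Python sketch below works through the arithmetic. It reads the quoted figures as a proteome concentration range of 10^6 to 10^12 and an instrument range of about 10^4. The function and constant names are illustrative, and the calculation is only an order-of-magnitude estimate of the gap that depletion, fractionation, and enrichment would need to cover.

```python
import math

# Figures as quoted in the answer above (read as powers of ten).
PROTEOME_RANGE_LOW = 1e6    # narrower biomatrix
PROTEOME_RANGE_HIGH = 1e12  # widest biomatrix
MS_RANGE = 1e4              # range a mass spectrometer typically probes


def orders_of_magnitude(ratio):
    """Express a concentration ratio as a number of orders of magnitude."""
    return math.log10(ratio)


def shortfall(proteome_range, instrument_range=MS_RANGE):
    """Orders of magnitude the instrument alone cannot span."""
    return orders_of_magnitude(proteome_range) - orders_of_magnitude(instrument_range)


best_case = shortfall(PROTEOME_RANGE_LOW)
worst_case = shortfall(PROTEOME_RANGE_HIGH)
print(f"Gap to close by sample preparation: about {best_case:.0f} "
      f"to {worst_case:.0f} orders of magnitude")
```

On these numbers the instrument alone falls short by about two to eight orders of magnitude, which is consistent with the emphasis in the answer on matching the separation strategy to the biomatrix and the study objectives.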
As a software provider, we are obviously not so much involved with sample handling and preparation. However, based on our experience, we can certainly second the observation that many significantly discriminating proteins can be found in the lower abundance components of the proteome. Some of our partners have been rather successful in using large-scale fractionations to increase sensitivity of mass spec analysis.
Interestingly, gel-based preseparation techniques might make another comeback as well. We are working with a number of customers that are successfully using 1-D and 2-D gel separation in combination with LC-MS to look at low abundance proteins.
For discovery of biomarkers from human plasma by mass spectrometry, we have used a variety of methods to enrich the sample for low-abundance proteins. These include affinity-column depletion of abundant proteins, precipitation of high-molecular-weight proteins, and multi-dimensional column chromatography.
Q. What technological advances might be needed to bring the use of mass spec in proteomics and metabolomics research to higher levels of efficiency and effectiveness?
Improvements in instrument sensitivity must continue in order to decrease sample volumes needed for analysis. As I mentioned before, I think concomitant development of high-resolution, high-throughput preparative methods needs to continue as well. Finally, smarter out-of-the-box software must be developed to allow rapid triage of data and to find correlations among multiple biomarkers.
Here is a wish list that comes to mind. Some points may be achievable; others may be a stretch.
a) Extending the dynamic range of the mass spec approaches from the current 10⁴ to about 10⁸ with minimal sample preparation and minimal penalty on sample throughput.
b) Better consumables (LC columns, plastic ware, SPE cartridges, etc.) and reagents for efficient sample preparation that maximizes sample recovery and minimizes loss during sample handling. These improvements will exploit the inherent sensitivity afforded by mass spectrometry.
c) Better separation efficiency of highly complex mixtures without multiple chromatographic separation steps and with minimal penalty on sample throughput.
d) One can always ask for MS with higher resolution and mass accuracy (without sacrificing sensitivity) in the mid to low ppb range. That said, current hybrid ion trap–FTMS instruments are very good choices. Hopefully we will see evolutionary enhancements to these new mass specs in terms of resolution, mass accuracy, and sensitivity.
While modern mass spec instrumentation provides large dynamic ranges and high sensitivity, we would still like to see improvements in m/z resolution. Together with some of our customers, we believe that higher resolution would enable novel identification applications in proteomics and metabolomics. Currently, the process of peptide and protein identification is based on MS/MS scans and search engines like MASCOT and Sequest. With improved mass accuracy, it will probably become possible to use algorithms that rely on the exact mass to do the identification.
In addition, we often see that users struggle with LC and GC columns and their reproducibility. So from that perspective, it would certainly be desirable to see some improvements in this area.
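The idea of identification by exact mass raised in this answer can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the panelist's software and not how MASCOT or Sequest work: the candidate sequences are invented, the tolerances are arbitrary, and the residue masses are standard monoisotopic values rounded to five decimal places.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # added once per peptide for the terminal H and OH


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def ppm_error(observed, theoretical):
    """Mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_within(observed_mass, sequences, tol_ppm):
    """Keep the candidate sequences whose mass matches within tol_ppm."""
    return [s for s in sequences
            if abs(ppm_error(observed_mass, peptide_mass(s))) <= tol_ppm]


# Hypothetical candidates: K vs. Q differ by about 0.036 Da; L vs. I are isomers.
candidates = ["SAMPLEK", "SAMPLEQ", "SAMPIEK"]
# Simulated measurement of SAMPLEK with a 2 ppm error.
observed = peptide_mass("SAMPLEK") * (1 + 2e-6)

loose = candidates_within(observed, candidates, tol_ppm=100.0)
tight = candidates_within(observed, candidates, tol_ppm=5.0)
print("100 ppm:", loose)
print("  5 ppm:", tight)
```

In this toy case the tighter tolerance rules out the lysine/glutamine swap, which illustrates why better mass accuracy would narrow the candidate list. The leucine/isoleucine pair still cannot be separated, because the two residues have the same elemental composition; exact mass alone does not resolve isomers.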
Although improvements in the sensitivity and dynamic range of mass spectrometers would undoubtedly help, I think the major problem now is to find better methods for reproducible sample preparation that allow for quantitative analysis of low-abundance proteins.
Thomas J. Daly, Ph.D., is vp, preclinical development and protein chemistry, at Regeneron Pharmaceuticals. Gary Davis is a proteomics consultant. Ashok R. Dongre, Ph.D., is associate director, clinical discovery technologies, at Bristol-Myers Squibb. Jens Hoefkens, Ph.D., is managing director at Genedata.