March 1, 2016 (Vol. 36, No. 5)
Richard A. A. Stein M.D., Ph.D.
The Quantitative Side of Proteomics Has Become Increasingly Critical For the Life Sciences
In virtually every area of the life sciences, researchers rely on protein characterization. Yet protein characterization misses entire dimensions of the protein universe. One such dimension covers the repertoire of proteins, the identity of all the constituents of the proteome.
Another dimension concerns size, the number of all proteins in the proteome, and the number of each proteome constituent. Both of these dimensions harbor interesting biology, which becomes all the more apparent when quantities within them are tracked over time or evaluated for shifts that reflect changing conditions.
The quantitative side of proteomics has become increasingly critical for the life sciences. Still, it continues to challenge researchers in biotechnology and medicine. To harness it, researchers must find a way to tame the inherent wildness of the proteome. For example, proteins exhibit dynamic behaviors, they are present in wide-ranging concentrations in biological fluids, and they exhibit a wealth of post-translational modifications.
“Ten years from now, we hope to perform comprehensive quantitation of any proteome in a single analysis, very much in the same way that RNA-Seq is being performed nowadays,” says Robert L. Moritz, Ph.D., a professor at the Institute for Systems Biology. Research in Dr. Moritz’s lab focuses on the development of new technologies and data analysis tools to quantitate proteins and help dissect their function, particularly in the context of disease.
A key challenge during mass spectrometry experiments is that some peptides might not be detected even though they are expected to be present. Such peptides might derive, for example, from a protein that is expected to occur on the basis of genomic and transcriptomic data.
“Depending on the sequence, we may or may not be able to see the entire sequence of a protein,” Dr. Moritz notes, “but we would see some of the peptides all the time.” Establishing whether the failure to detect a peptide reflects the instrument’s limit of detection or has other causes emerges as a priority during experimental work. “It is possible we don’t see any peptides of a particular protein because the protein is only expressed at the particular moment or location when we are not looking,” Dr. Moritz explains.
Many peptides are expected to be observed during mass spectrometry experiments, and a database of them has been compiled. This database, the PeptideAtlas, was established in 2005. It has grown considerably since and is now a multispecies compendium. It serves as both a data repository and a key resource for targeted proteomics experiments.
“The PeptideAtlas allows investigators to go back in history and determine how many different people have seen a specific peptide all the time and what constitutes the proteotypic signature of a specific protein,” declares Dr. Moritz. Additionally, the PeptideAtlas provides a resource to guide and validate mass spectrometry findings, design controls during experiments, and confirm experimental data.
Recently, Dr. Moritz’s group, which includes senior research scientist Eric Deutsch, Ph.D., developed AtlasProphet, an informatics processing platform that improves accuracy and coverage and interprets data in the context of a reference proteome. “Mass spectrometry has a lot of scope for improvement, especially in terms of speed and sensitivity,” asserts Dr. Moritz. “We are already seeing many emerging technologies that can capture all the proteins or isoforms and quantitate them.”
Tracking Proteomic Markers of Neurodegeneration
“We found three interventions that may lower the risk of Alzheimer’s disease,” says D. Allan Butterfield, Ph.D., a professor of biological chemistry at the University of Kentucky. “All this came about from proteomics studies.”
In a collaborative effort with University of California, Irvine researchers Carl Cotman, Ph.D., and Elizabeth Head, Ph.D., Dr. Butterfield and colleagues examined, in an animal model of human aging, the impact of three lifestyle interventions on the risk of cognitive impairment. The interventions consisted of changes to diet, physical activity, and intellectual stimulation.
The animal model was the beagle dog, which possesses an amyloid beta-peptide sequence identical to that of humans. It also has a highly developed ability to learn complex cognitive tasks. Dr. Butterfield and colleagues took these attributes to indicate that the beagle dog would be a useful animal model for assessing the course of Alzheimer’s disease in humans.
Dr. Butterfield and colleagues exposed 12-year-old beagle dogs to an antioxidant-rich diet, a physical enrichment protocol, and a cognitive enrichment protocol. These interventions lasted for almost three years.
“Cognitive tests showed that 15-year-old beagle dogs on these three interventions had an error rate similar to what is usually seen in a 4-year-old dog,” reports Dr. Butterfield. Cognitive improvement was accompanied by a decrease in the amyloid beta levels in the brains. Dr. Butterfield adds that the oxidative stress levels in the brains of these dogs were also similar to the levels seen in 4-year-old dogs.
Dogs that were fed only dog chow presented brain pathology reminiscent of aspects of Alzheimer’s disease, and proteomics revealed the presence of many oxidized proteins. The use of proteomics in this analysis helped characterize the protein interactome and define the protein groups that are involved in the response. It also confirmed previous findings, which pointed to the involvement of specific proteins and pathways in the cognitive improvement that results from dietary and lifestyle interventions.
“A current challenge in proteomics is that the data output is enormously difficult to deal with,” observes Dr. Butterfield. “But when bioinformatics catches up with mass spectrometry and protein separation advances, this will mark an advancement for proteomics.”
One challenge in the field of neurodegenerative diseases is capturing the earliest changes, before the onset of clinical disease, in the hope of slowing disease progression and improving quality of life and life expectancy. In an analysis that examined the mitochondrial proteome, Dr. Butterfield and colleagues found a significant increase in oxidative stress markers and in the levels of the ATP synthase beta subunit, along with other proteins, in peripheral lymphocytes from individuals with mild cognitive impairment and Alzheimer’s disease as compared to cognitively normal individuals.
Early detection is particularly relevant in Alzheimer’s disease, where the earliest molecular changes may occur some 20 years before clinical disease. “There is a long way ahead before we can be certain of our work’s significance,” concludes Dr. Butterfield. “But the finding that the same proteins that accumulate in the Alzheimer’s disease brain are changed in mitochondria from peripheral lymphocytes opens the possibility to develop biomarkers.”
Profiling Proteins Overexpressed in Cancer
“A major challenge in the field of proteomics is being able to look at large numbers of samples simultaneously,” says David M. Lubman, Ph.D., a professor of surgical immunology at the University of Michigan Medical School. With most widely available isotopic tagging methods, between four and eight samples can currently be tagged and processed at the same time. “There are efforts that are underway from many groups to improve this,” continues Dr. Lubman, “but those tools are not yet readily available.”
In a recent study, Dr. Lubman and colleagues used cancer stem cell markers to isolate two populations of breast cancer stem cells, ALDH-positive and CD44-positive/CD24-negative cell populations. Cancer stem cells are a particularly challenging cell type in understanding cancer biology and in developing therapeutic approaches because they have been associated with resistance to treatment and with aggressive behavior, including metastatic dissemination.
Using proteomic analysis to compare each of the two breast cancer stem cell populations with differentiated breast cancer cells, Dr. Lubman and colleagues identified 3,304 proteins. Next, using a label-free quantitative method, they identified molecular regulatory networks that characterize the breast cancer stem cell phenotype and promise to provide insight into their biology. “During this study,” reports Dr. Lubman, “we found a few key proteins that we are interested in pursuing further, particularly in terms of metabolomic studies.”
Label-free quantitative proteomic analyses can be performed using spectral counting, an approach that counts the tandem mass spectra matched to peptides of a particular protein and uses that count as a measure of the protein’s abundance within a complex mixture. “We would like to be able to look at small amounts of samples (<10,000 cells), particularly for stem cells,” confides Dr. Lubman. “We have very small numbers of cells to work with, but using tags for proteomics analyses on very small numbers of cells is still a real challenge.”
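The idea behind spectral counting can be illustrated with a short sketch. It is not the method used in Dr. Lubman’s study. It assumes a list of peptide-spectrum matches that have already been assigned to proteins, and it applies one common length normalization, the normalized spectral abundance factor (NSAF), which the article does not mention. The protein names, lengths, and spectra are hypothetical.

```python
from collections import Counter


def spectral_counts(psms):
    """Count the tandem mass spectra matched to each protein.

    psms: iterable of (spectrum_id, protein_id) pairs.
    """
    return Counter(protein for _spectrum_id, protein in psms)


def nsaf(counts, lengths):
    """Normalized spectral abundance factor.

    Each count is divided by the protein length (longer proteins yield
    more peptides), then the values are rescaled to sum to 1.
    """
    saf = {protein: counts[protein] / lengths[protein] for protein in counts}
    total = sum(saf.values())
    return {protein: value / total for protein, value in saf.items()}


# Hypothetical data: six spectra matched to three made-up proteins.
psms = [
    ("s1", "PROT_A"), ("s2", "PROT_A"), ("s3", "PROT_B"),
    ("s4", "PROT_A"), ("s5", "PROT_B"), ("s6", "PROT_C"),
]
lengths = {"PROT_A": 300, "PROT_B": 100, "PROT_C": 100}  # residues, assumed

counts = spectral_counts(psms)
abundance = nsaf(counts, lengths)

for protein in sorted(abundance):
    print(protein, counts[protein], round(abundance[protein], 3))
```

In this toy example, PROT_A has the most spectra, yet PROT_B ranks highest once length is taken into account. That is why raw counts are usually normalized before proteins are compared.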
Capturing Biological Variation
“An aspect that the general field of proteomics is still facing, but could be addressed with further technological developments, is that proteomics experiments are so time-consuming that often not enough replication is done,” insists Leonard J. Foster, Ph.D., a professor of biochemistry and molecular biology at the University of British Columbia. “Because existing approaches do not always accurately reflect biological variation, conclusions are not as well supported as they should be.”
An active area of investigation in Dr. Foster’s group is the interaction between the honeybee Apis mellifera and the ectoparasitic mite Varroa destructor. Two of the heritable defense mechanisms that honeybees have developed as collective systems of behavior to protect against parasitism, hygienic behavior and Varroa-sensitive hygiene, show marked variability among colonies, but their genetic and biochemical bases are relatively poorly understood.
To explore the involvement of different protein expression profiles in the honeybee’s intercolony variability in resistance, Dr. Foster and colleagues measured the relative abundance of about 1,200 honeybee proteins in antennae and larval integument, two tissues involved in the bees’ immunity and interactions with the parasite. They then correlated these abundances with resistance-related behavioral phenotypes to identify and characterize proteins that shape the host-pathogen interface. For the first time, this approach shed light on the correlation between protein expression patterns and honeybee behavioral traits.
In another recent development, investigators in Dr. Foster’s lab performed a quantitative proteomics analysis that used galactose-inducible and glucose-repressible expression of the PKA1 gene, which encodes the catalytic subunit of protein kinase A (PKA). The investigators used this analysis to identify regulated proteins in the secretome of a fungal pathogen, Cryptococcus neoformans, a cause of life-threatening meningitis, particularly in immunocompromised individuals.
“This is where quantitative approaches are invaluable,” asserts Dr. Foster. “Even semiquantitative approaches would not have had the level of accuracy that is needed to distinguish proteins that are secreted from the proteins that are present in the culture supernatant.” The quantitative approach taken by Dr. Foster’s team led to the identification of 61 proteins in the secretome, five of which are regulated by Pka1 and fulfill roles related to virulence, fungal survival in the host, and iron uptake.
Post-Translational Challenges
Post-translational modifications, a hallmark of proteins, pose significant challenges for mass spectrometry experiments for several reasons, including the mass shift they produce in the peptide molecular weight and technological limitations in quantitating the modified peptides and evaluating their stability.
“There was a significant push in the chromatin biology field to develop high-quality antibodies that recognize post-translationally modified epitopes,” says Michael A. Freitas, Ph.D., an associate professor of molecular virology, immunology, and medical genetics at Ohio State University. “Currently, there are quite a few high-quality antibodies available.”
However, even when high-quality antibodies are available, the shape of the very specific epitopes that they recognize might be affected by neighboring post-translational modifications. “A classic example is provided by the histone H3 K9S10 epitope, in which acetylation of lysine at position 9 and phosphorylation of the serine at position 10 can affect the ability of an antibody to recognize either of these two post-translational modifications,” explains Dr. Freitas.
An approach extensively used in Dr. Freitas’ lab for mass spectrometry analyses is stable isotope labeling by amino acids in cell culture (SILAC). “SILAC is very amenable to cell lines because a cell line grown in its normal media can be compared to a treatment sample,” notes Dr. Freitas.
In SILAC, an isotopically labeled amino acid is metabolically incorporated into proteins as cells grow. As a result, two cellular populations, grown in media containing amino acids labeled with different isotopes of the same element, have their proteomes labeled differently and can be distinguished by mass. “Because SILAC allows relative quantitation, we can determine the degree of post-translational modifications in a single mass spectrometry experiment,” asserts Dr. Freitas.
Although SILAC circumvents many of the inherent sample-preparation variables, it has disadvantages: it requires the growth of cell cultures, and it is limited to the use of one or two labels. “Therefore, SILAC is not amenable to high multiplexing or to samples that cannot grow,” warns Dr. Freitas.
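The relative quantitation that SILAC provides can be sketched in a few lines. This is an illustration under stated assumptions, not the workflow used in Dr. Freitas’ lab. It assumes each peptide has already been measured as a pair of light and heavy intensities, and it summarizes each protein by the median log2 heavy-to-light ratio, one common choice among several. The protein names and intensities are hypothetical.

```python
import math
from statistics import median


def silac_log2_ratios(peptide_pairs):
    """Median log2(heavy/light) ratio per protein.

    peptide_pairs: iterable of (protein_id, light_intensity, heavy_intensity).
    Pairs with a non-positive intensity are skipped because no ratio
    can be computed for them.
    """
    by_protein = {}
    for protein, light, heavy in peptide_pairs:
        if light > 0 and heavy > 0:
            by_protein.setdefault(protein, []).append(math.log2(heavy / light))
    return {protein: median(values) for protein, values in by_protein.items()}


# Hypothetical intensities for peptides from two made-up proteins.
pairs = [
    ("HIST_X", 1000.0, 2000.0),  # heavy twice as intense as light
    ("HIST_X", 500.0, 1000.0),
    ("HIST_X", 800.0, 3200.0),   # outlier peptide; the median dampens it
    ("PROT_Y", 400.0, 400.0),    # unchanged between conditions
]

ratios = silac_log2_ratios(pairs)
for protein, value in sorted(ratios.items()):
    print(protein, value)
```

A log2 ratio of 1 means the protein is about twice as abundant in the heavy-labeled population, and a ratio of 0 means no change. Taking the median across peptides limits the influence of a single poorly measured peptide.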
Another key limitation is that in SILAC, as in several other widely used approaches referred to as “bottom-up” proteomics, proteins are enzymatically digested, creating peptide fragments of the protein. “Examining an N-terminal peptide might not inform about whether or not a modification present in that region is associated with a modification in another domain,” points out Dr. Freitas. “This is the biggest limitation that we have to live with.”
As a result of consulting and collaborative endeavors, Dr. Freitas and colleagues implemented the use of top-down mass spectrometry, an approach in which an intact protein is placed into the mass spectrometer and fragmented using high-energy collisions. “Then we essentially take those fragments and reconstitute the protein,” states Dr. Freitas. This allows key knowledge about the combinatorial post-translational modifications from that protein to be retained.
Using top-down mass spectrometry, Dr. Freitas and colleagues recently described the involvement of several histone H1 variants in the cellular progression of breast cancer cells, identified and quantitated the dynamics of proteoforms during cell cycle progression, and unveiled post-translational modifications that could become potential biomarkers for malignant proliferation.