DNA sequencing opened research and translational vistas of breathtaking scope—but eventually we started to catch our breath. Over time, the views afforded by DNA sequencing started to feel limited and incapable of giving us a glimpse into many of the most compelling questions about human disease. To broaden our outlook, we have combined genomic surveys and transcriptomic surveys, and still the view can seem, well, blinkered. So, we are now beginning to take in an even broader horizon, a multiomics view that includes not just DNA sequences and RNA transcripts, but also proteins.
Keeping an eye on proteins—that is, performing proteomics analysis—is enormously challenging. The proteins in our bodies may rise or fall in abundance or modify their structures and functions depending on our age, our state of health, and other contingencies.
Most often, proteins are seen through the lens of mass spectrometry. With the advances in mass spectrometry that have occurred over the past 10–20 years, proteomics has become more sensitive, faster, and higher in resolution. Today, proteomics is adding to what we can see with genomics, transcriptomics, and metabolomics, giving us a fuller (and more dynamic) picture of biological processes and pathways as well as of drug targets and mechanisms.
Mapping drug mechanisms
The mechanisms of action of drug compounds in the body are often unclear. However, these mechanisms can be clarified if a proteomics approach is used to identify which proteins are affected by the drug compounds. Information about thousands of proteins can be captured through traditional mass spectrometry–based proteomics, but additional information could be captured through an even more powerful approach, suggests Benjamin Ruprecht, PhD, associate principal scientist, Merck & Co. He has in mind an approach that is capable of higher throughput and of finding associations between drugs and elements of the proteome and transcriptome.
Proteins inside a cell are constantly being turned over. Although transcriptomic techniques can show the presence of specific transcripts, they say little about the proteins that correspond to those transcripts. For example, these techniques are silent with respect to a protein’s abundance, location within the cell, or fate—whether it is secreted, for example. Measuring proteins directly offers a potentially much more accurate picture of the biology of the cell and its responses to specific compounds. As a proof of principle, Ruprecht and his colleagues used advanced mass spectrometry techniques to create a proteome map of drug effects in lung cancer cell lines.1
Ruprecht’s study confirmed that protein- and transcript-level changes in response to drug treatment are not identical. It also showed that aggregating drug-induced protein-level changes across cell lines provided important information for characterizing the mechanism of action of small-molecule drugs.
“Our dataset enabled us to annotate which proteins respond to the majority of drugs,” says Ruprecht. “Removing these ‘frequent responders’ strongly improves our ability to determine selective protein changes caused by an individual compound.” He adds that this type of experiment also reveals direct drug targets for one quarter of the small molecules studied while turning up novel mechanisms and potential drug resistance pathways.
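The idea of filtering out "frequent responders" can be illustrated with a small sketch. This is not the analysis used in the study. The toy response matrix, the log2 fold-change cutoff of 1.0, and the reading of "majority" as more than 50% of drugs are all assumptions made for illustration.

```python
def frequent_responders(log2_fc, fc_cutoff=1.0, majority=0.5):
    """Flag proteins that respond to more than `majority` of the drugs.

    log2_fc is a list of rows, one row per drug and one column per protein,
    holding hypothetical log2 fold changes.
    """
    n_drugs = len(log2_fc)
    n_proteins = len(log2_fc[0])
    flags = []
    for p in range(n_proteins):
        hits = sum(1 for d in range(n_drugs) if abs(log2_fc[d][p]) >= fc_cutoff)
        flags.append(hits / n_drugs > majority)
    return flags


def selective_changes(log2_fc, fc_cutoff=1.0, majority=0.5):
    """Per-drug hit table with the frequent responders masked out."""
    frequent = frequent_responders(log2_fc, fc_cutoff, majority)
    return [
        [abs(value) >= fc_cutoff and not frequent[p] for p, value in enumerate(row)]
        for row in log2_fc
    ]


# Toy data: 4 hypothetical drugs (rows) x 3 hypothetical proteins (columns).
toy = [
    [2.0, 0.1, 1.5],
    [1.8, 0.0, 0.2],
    [2.2, 1.4, 0.1],
    [1.9, 0.2, 0.0],
]
mask = frequent_responders(toy)
selective = selective_changes(toy)
```

In this toy example, the first protein changes under every drug and is masked. What remains are the changes specific to individual compounds: the third protein for the first drug and the second protein for the third drug.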
Although this proof-of-principle study was successful and yielded important insights, it indicated that mass spectrometry–based proteomics technology still needs improvement. Ruprecht notes that areas in need of improvement include instrument sensitivity, sample preparation and processing, and data analysis.
Increasing sensitivity for deeper proteomics
Multiomics studies rely on data sets that reflect the use of several different omics approaches. Over the last 4–5 years, many of these studies have been incorporating proteomics approaches, observes Diarmuid Kenny, PhD, a group leader for integrated biology at Charles River Laboratories. He notes that shotgun proteomics, also known as data-dependent acquisition (DDA), has long been the method of choice for identification and quantification of proteins from complex mixtures, but that recently data-independent acquisition (DIA) has emerged to harness the increased speed and sensitivity of newer mass spectrometers.
In DDA, the instrument selects a subset of peptide ions, typically the most abundant, for fragmentation and identification, whereas in DIA, all peptide ions within successive mass-to-charge windows are fragmented, resulting in fewer missing values and thus much deeper proteome coverage. When high-field asymmetric waveform ion mobility spectrometry (FAIMS) is used, background ions can be prefiltered from extremely complex samples, lowering detection limits and improving the overall quality of the data.
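The difference between the two acquisition strategies can be made concrete with a simplified sketch. The survey scan values, the top-2 selection, and the fixed 25 m/z isolation windows are assumptions made for illustration. Real instruments add refinements, such as dynamic exclusion and variable window widths, that are not modeled here.

```python
def dda_select(precursors, top_n=2):
    """Data-dependent: fragment only the top_n most intense precursors."""
    ranked = sorted(precursors, key=lambda p: p[1], reverse=True)
    return sorted(mz for mz, _ in ranked[:top_n])


def dia_windows(precursors, mz_start=400.0, mz_end=500.0, width=25.0):
    """Data-independent: fragment everything, one fixed m/z window at a time."""
    windows = {}
    low = mz_start
    while low < mz_end:
        high = min(low + width, mz_end)
        windows[(low, high)] = [mz for mz, _ in precursors if low <= mz < high]
        low = high
    return windows


# Hypothetical survey scan: (precursor m/z, intensity) pairs.
survey_scan = [(410.2, 9e5), (433.7, 2e4), (455.1, 6e5), (478.9, 1e4), (492.3, 3e3)]

dda_picked = dda_select(survey_scan)
dia_assigned = dia_windows(survey_scan)
dia_covered = sorted(mz for mzs in dia_assigned.values() for mz in mzs)
```

Here the data-dependent selection fragments only the two most intense precursors, whereas the window scheme covers all five. That difference is one way to see why DIA tends to leave fewer missing values.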
“The incorporation of FAIMS into the latest generation of high-resolution mass spectrometers has seen a great improvement in the overall coverage of the proteome,” Kenny says. “Additionally, FAIMS has the ability to separate post-translationally modified peptide isomers, offering some exciting opportunities to develop new methodologies specifically focused on identifying novel post-translational modifications.”
Innovative applications of mass spectrometry in proteomics analysis typically involve improvements in instrument sensitivity, emphasizes Andreas Hühmer, PhD, senior director, omics and mass spectrometry, Thermo Fisher Scientific. He asserts that Thermo Fisher’s Orbitrap Eclipse Tribrid mass spectrometer represents the most sensitive available technology for measuring subtle cellular changes such as protein–protein interactions or drug–protein interactions.
For example, with highly sensitive limited proteolysis–mass spectrometry (LiP-MS) instrumentation, investigators “can see whether a particular structural change in protein causes changes in metabolism,” Hühmer points out. “In that particular case, you can utilize the proteomics approach to explain metabolomics.”
He adds that a similar approach would be applicable to proteogenomics, where knowledge of genomics informs proteomic analysis of the same sample. “Proteomics is becoming more functional and more democratized,” Hühmer maintains. He expects that scientists who are not mass spectrometry experts will find it easier to apply the technology in their specialized fields. “An area that will be very important for democratization is what we call intelligent mass spectrometry,” he notes. “Here, ‘intelligent’ refers to how an instrument may be aware of the question the user is asking. Also, the instrument may know the best settings and parameters for optimizing the results.”
Increased instrument sensitivity in mass spectrometry eases the study of proteins that occur in low abundance or must be sought in very small samples. Such proteins typically escape traditional methods of detection. Whereas single-cell genomics and transcriptomics studies are well underway, the field of proteomics is just on the cusp of achieving single-cell sensitivity.
Approaching single-cell analysis
“In terms of new advances of mass spectrometry for proteomics and how proteomics can be combined with other omics technologies, I think the emergence of spatial genomics and application of single-cell assays are very important,” declares Guanghui Han, PhD, director, San Jose Mass Spectrometry Center, BGI Americas. He believes that combining these technologies will make it possible to reveal once-hidden patterns.
For example, cells presenting different proteome profiles may be distributed through tissues or organs in meaningful three-dimensional arrangements. Also, proteome profiles at the single-cell level may change over time in response to a stimulus or perturbation.
To date, most proteomics studies have generated protein profiles for samples that have been homogenized and that contain material from many cells. In these studies, cellular heterogeneity is lost or “averaged out.” To preserve information about cellular heterogeneity, BGI is offering an “approximate” single-cell proteomics service called nanoproteomics.
Using the Orbitrap Eclipse MS instrument from Thermo Fisher, BGI performs quantitative proteomics analysis on around 100 cells by combining in situ cleavage with data-independent acquisition label-free technology.
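A toy calculation shows how homogenized samples "average out" cellular heterogeneity. The abundance values below are invented for illustration and do not come from any study.

```python
from statistics import mean

# Hypothetical abundances (arbitrary units) of one protein in six cells:
# three low-expressing cells followed by three high-expressing cells.
per_cell = [2, 1, 3, 9, 10, 11]

# A homogenized (bulk) sample reports a single pooled value.
bulk_signal = mean(per_cell)

# Distance of each individual cell from the bulk value.
distance_from_bulk = [abs(x - bulk_signal) for x in per_cell]
```

The bulk value sits between the two subpopulations and matches no individual cell, so the fact that there are two distinct groups is invisible in the pooled measurement.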
In the pursuit of single-cell and near-single-cell proteomics, technology developers must focus on sample preparation. Unlike DNA or RNA, proteins cannot be amplified.
In traditional protein sample preparation, there are many steps during which the sample is processed and transferred from tube to tube. As the sample moves from step to step, the proteins in the sample become less and less abundant. Any proteins in the sample that were initially present in extremely low amounts, that is, in amounts lower than 300 picograms, become so scarce as to be undetectable by the profiling assay.
Han suggests that this problem can be overcome by the streamlined protocols in single-cell proteomics. These protocols limit the number of steps and minimize sample loss.
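The effect of reducing the number of handling steps can be estimated with a back-of-the-envelope model. The model assumes, purely for illustration, that every step retains the same fraction of the protein. The 80% recovery per step and the step counts of 10 and 3 are hypothetical values, not measurements.

```python
def remaining_after_steps(start_pg, n_steps, recovery_per_step):
    """Protein mass left after n_steps if each step keeps the same fraction."""
    return start_pg * recovery_per_step ** n_steps


start_pg = 300.0   # the low-abundance figure mentioned in the text
recovery = 0.8     # assumed fraction retained at each handling step

traditional = remaining_after_steps(start_pg, n_steps=10, recovery_per_step=recovery)
streamlined = remaining_after_steps(start_pg, n_steps=3, recovery_per_step=recovery)
```

Under these assumptions, about 32 picograms survive a 10-step workflow, compared with about 154 picograms after 3 steps. Because the losses compound, each step removed has an outsized effect on low-abundance proteins.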
Some technological advances are still needed for proteomics to catch up with genomics in terms of the scale and granularity of the data it can return. Another serious hurdle is the outdated perception that proteomics relies on limited, low-throughput technology.
Although high-throughput technology is being deployed in proteomics, the field has yet to experience a galvanizing moment like the one genomics experienced with the Human Genome Project. In 2020, soon after a high-stringency blueprint of the human proteome appeared in Nature Communications,2 an editorial in the Journal of Proteome Research noted that the finishing or polishing phase for the human proteome was expected to be more difficult than that for the human genome.3
“To complete this task, we need new technology for improved coverage, higher sensitivity for single-cell proteomics, and machine-learning-aided bioinformatics,” the editorial argued. “The proteomics community needs government and institutional awareness to provide support and resources for these essential research challenges.” According to Han, a campaign to accelerate the finishing phase has yet to be organized.
Multiomics analysis from a single assay
Dalton Bioanalytics is taking the multiomics approach one step further by combining analysis of proteins, metabolites, lipids, electrolytes, and other small molecules into a single assay. “We keep everything in the sample,” says Austin Quach, PhD, Dalton’s co-founder and CSO. “[Our approach is] kind of counterintuitive, because oil and water don’t normally coexist in the same solution.”
Quach adds that the company takes macromolecules, such as proteins, carbohydrates, and RNAs, and digests them into small pieces. Then other constituents of the sample, such as metal ions, electrolytes, and lipids, are complexed with solubilizing and ionizing reagents. These preparations make each component amenable to analysis in a single liquid chromatography–mass spectrometry assay that gives readouts on proteins, lipids, metabolites, and electrolytes. Sugars and oligosaccharides are also visible in the assay, whereas analyses for complex carbohydrates and RNAs are still in development.
Quach reasons that if you’re just doing one omics analysis alone, you’re potentially missing out on important channels of biological information. He suggests that working with a more complete data picture can yield unexpected insights.
“In a recent customer study, we found some strange occurrences,” Quach notes, adding that some effects on metabolites were observed that were associated with “dramatic” changes in the protein content. “We wouldn’t have seen these major changes in the proteomics side of the samples if we were just focusing on metabolites.”
Beyond shotgun analysis
The increased sensitivity of mass spectrometry analysis workflows goes hand in hand with larger and more complex data sets. Whereas in the past, an ambitious shotgun proteomics analysis might capture 2,000–3,000 proteins, a single injection can now quantify 10,000 proteins. That’s according to Oliver Rinner, PhD, co-founder and CEO of Biognosys. He says the complexity of those deep protein analyses requires advanced bioinformatics to process the data.
A large portion of Biognosys’s business is biomarker discovery for early-stage research, but the company is moving into later-stage applications, such as clinical trials. Rinner says that as analyses deepen, that is, as they take in more proteins, the chance of finding the right biomarker candidates increases. On the other hand, data analysis remains challenging for scientists who are unfamiliar with proteomics data; therefore, the company provides advanced data analysis and interpretation on top of measurement results.
Biognosys’s mass spectrometry workflow acquires data in a highly parallel manner and analyzes it using reference spectra via a proprietary algorithm. The output is a comprehensive, peptide-level measurement of all detectable proteins in the sample.
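Matching acquired data against reference spectra can be pictured with a generic similarity score. The sketch below is not Biognosys's proprietary algorithm. It uses plain cosine similarity, hypothetical peptide names, and invented fragment intensities, and it matches fragments by exact nominal m/z rather than within a mass tolerance, as real software would.

```python
from math import sqrt


def cosine_similarity(observed, reference):
    """Cosine similarity of two spectra given as {fragment_mz: intensity} dicts."""
    shared = set(observed) & set(reference)
    dot = sum(observed[mz] * reference[mz] for mz in shared)
    norm_obs = sqrt(sum(v * v for v in observed.values()))
    norm_ref = sqrt(sum(v * v for v in reference.values()))
    if norm_obs == 0 or norm_ref == 0:
        return 0.0
    return dot / (norm_obs * norm_ref)


def best_match(observed, library):
    """Name of the library entry whose reference spectrum scores highest."""
    return max(library, key=lambda name: cosine_similarity(observed, library[name]))


# Hypothetical reference library and one observed fragment spectrum.
library = {
    "PEPTIDE_A": {200: 3, 300: 4},
    "PEPTIDE_B": {250: 5, 350: 12},
}
observed = {200: 6, 300: 8}

match = best_match(observed, library)
score = cosine_similarity(observed, library[match])
```

The observed spectrum has the same fragment pattern as the first library entry at twice the intensity, so it scores a perfect match. Cosine similarity compares the shape of a spectrum and ignores its overall scale.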
Rinner declares that proteomics has vast potential to transform the life sciences due to the role of proteins as the functional unit of the body: “The genome gives us essentially a one-dimensional readout. It’s a sequence with mutations or variations. The proteome gives us a dynamic view of function on all levels: quantity, modification, and even structure. If we can read that out, on a large scale, with high data quality, we will get a very different view of biological function.”
1. Ruprecht B, Di Bernardo J, Wang Z, et al. A mass spectrometry-based proteome map of drug action in lung cancer cell lines. Nat. Chem. Biol. 2020; 16: 1111–1119. DOI: 10.1038/s41589-020-0572-3.
2. Adhikari S, Nice EC, Deutsch EW, et al. A high-stringency blueprint of the human proteome. Nat. Commun. 2020; 11(1): 5301. DOI: 10.1038/s41467-020-19045-9.
3. Overall CM. The HUPO High-Stringency Inventory of Humanity’s Shared Human Proteome Revealed. J. Proteome Res. 2020; 19(11): 4211–4214. DOI: 10.1021/acs.jproteome.0c00794.