Proteome Maps Open an Age of Biological Discovery

Protein correlation profiling, single-cell imaging, and other spatial proteomic techniques are revealing how proteins are distributed within cells and tissues

Spatial proteomics began hardly more than a decade ago, but it is already gathering momentum. Consider the progress that has been made in the subcellular mapping of proteins. In 2006, pioneering researchers from the Max Planck Institute of Biochemistry reported that they had generated a mammalian organelle map.1 The same year, researchers from the University of Cambridge indicated that they had mapped the Arabidopsis organelle proteome.2 Such efforts demonstrated what could be achieved with density gradient centrifugation and marker protein profiling, and they stimulated a proliferation of experimental techniques for determining how proteins are distributed within cells and tissues.

“Spatial proteomics has been around, in an infant way, for a little while, but I think we’re learning more about the importance of identifying where different proteins are within the cell and body, and how this affects their function,” says Alla Gagarinova, PhD, a postdoctoral fellow at the University of Saskatchewan. Gagarinova is among the researchers using new techniques to explore the proteome’s spatial dimension—both within cells and between them. The new techniques have many applications, from basic science to therapeutic development to pandemic preparedness.

High-resolution instrumentation

Spatial proteomics has developed considerably in the last decade, according to Laurent Gatto, PhD, an associate professor of bioinformatics at the Catholic University of Louvain. He explains, “Experimental techniques, by which I mean mass spectrometry, have moved on considerably.”

The instruments available today have higher resolution than the ones that were available 5 to 10 years ago. “And that’s going to continue evolving,” Gagarinova points out. “So, we’ll be able to look at [which proteins are] in a cell, where they are, what they do, and how that can change.” She adds that researchers are already shifting beyond the “lowest hanging fruit” of identifying the most abundant proteins to mapping proteins in the nucleus and organelles in cells.

Computational tools

Computational techniques have also improved, and a wider range of software is available to researchers. “I have the feeling [researchers] don’t fully understand the sheer range of software and approaches [available],” Gatto says. “This is a rapidly emerging field.”

At the Belgium Proteomics Association Conference (cancelled due to COVID-19), Gatto planned to give a presentation on using bioinformatics to map the location of proteins inside a sample. In this presentation, Gatto would have shared an observation that he states here as follows: “You might have a complex experimental design and have collected data over weeks or months, but once you start the analysis, you only need two pieces of information.”

By “two pieces of information,” Gatto means two bodies of information—the first pertaining to marker proteins (that is, proteins with experimentally verified locations), the second pertaining to proteins for which location information is available only from a database.

In a dataset with 5,000 proteins, a researcher might be able to preannotate 1,000 proteins, leaving 4,000 with their location unknown. A simple analysis might then use Principal Component Analysis to see if any clusters can be identified in the data.

“If we don’t see any clusters, that’s not a great sign,” Gatto remarks. “However, if these clusters match well-known protein markers, it’s a good sign that, okay, my data or my experiment worked, and that it’s possible to carry on with more complex analyses.”
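
The sanity check Gatto describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow of any particular lab or package: the fractionation profiles are synthetic, the organelle names and the helper functions `simulate_markers`, `pca_2d`, and `separation_ratio` are hypothetical, and the principal axes are approximated with plain power iteration rather than a dedicated statistics library.

```python
import math
import random

# Hypothetical fractionation profiles: the share of a protein found in each of
# four fractions for three made-up organelles. Real profiles would come from
# quantitative mass spectrometry.
TEMPLATES = {
    "organelle_A": [0.70, 0.20, 0.05, 0.05],
    "organelle_B": [0.05, 0.70, 0.20, 0.05],
    "organelle_C": [0.05, 0.05, 0.20, 0.70],
}


def simulate_markers(per_class=20, noise=0.02, seed=42):
    """Return (profiles, labels) for noisy synthetic marker proteins."""
    rng = random.Random(seed)
    profiles, labels = [], []
    for name, template in TEMPLATES.items():
        for _ in range(per_class):
            profiles.append([x + rng.gauss(0, noise) for x in template])
            labels.append(name)
    return profiles, labels


def pca_2d(profiles, iters=300, seed=0):
    """Project profiles onto their two leading principal axes.

    Axes are found by power iteration on the covariance matrix, with each new
    axis kept orthogonal to the previous one (Gram-Schmidt). If the two leading
    eigenvalues are close, the axes may be rotated within the same plane, which
    does not change the distances between projected points.
    """
    n, d = len(profiles), len(profiles[0])
    means = [sum(row[j] for row in profiles) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in profiles]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    axes = []
    for _ in range(2):
        v = [rng.uniform(-1, 1) for _ in range(d)]
        for _ in range(iters):
            v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            for a in axes:
                dot = sum(x * y for x, y in zip(v, a))
                v = [x - dot * y for x, y in zip(v, a)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        axes.append(v)
    return [[sum(r[j] * a[j] for j in range(d)) for a in axes]
            for r in centered]


def separation_ratio(points, labels):
    """Smallest gap between class centroids divided by the largest mean spread."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    centroids = {label: [sum(col) / len(pts) for col in zip(*pts)]
                 for label, pts in groups.items()}
    spread = max(sum(math.dist(p, centroids[label]) for p in pts) / len(pts)
                 for label, pts in groups.items())
    names = sorted(centroids)
    gap = min(math.dist(centroids[a], centroids[b])
              for i, a in enumerate(names) for b in names[i + 1:])
    return gap / spread


profiles, labels = simulate_markers()
coords = pca_2d(profiles)
ratio = separation_ratio(coords, labels)
print(f"cluster separation ratio: {ratio:.1f}")
```

A ratio well above 1 means the marker classes form tight, well-separated groups in the projection, which is the "good sign" in this toy setting. A ratio near or below 1 would suggest that the fractionation did not resolve the compartments.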

A researcher might proceed by using a classification algorithm to predict the location of the unknown proteins based on the characteristics of the markers, he continues. Alternatively, a researcher who didn’t have enough markers for every organelle in a cell might use semisupervised machine learning to guess the locations of unknown proteins.
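
As a rough illustration of the classification step, the sketch below trains on marker proteins and then assigns unlabeled proteins to the closest compartment. It uses a nearest-centroid rule chosen purely for brevity, not the algorithm any of the researchers quoted here actually use. The profiles, protein names, the `max_distance` cutoff, and the functions `fit_centroids` and `classify` are all hypothetical.

```python
import math

# Hypothetical marker profiles (share of protein per fraction) with known locations.
MARKER_PROFILES = [
    [0.72, 0.18, 0.06, 0.04],
    [0.68, 0.22, 0.04, 0.06],
    [0.06, 0.68, 0.21, 0.05],
    [0.04, 0.72, 0.19, 0.05],
]
MARKER_LABELS = ["organelle_A", "organelle_A", "organelle_B", "organelle_B"]

# Hypothetical proteins whose location is not yet known.
UNLABELED = {
    "protein_X": [0.66, 0.24, 0.05, 0.05],
    "protein_Y": [0.07, 0.66, 0.22, 0.05],
    "protein_Z": [0.05, 0.05, 0.20, 0.70],  # resembles no marker class
}


def fit_centroids(marker_profiles, marker_labels):
    """Average the marker profiles of each location into one centroid."""
    groups = {}
    for profile, label in zip(marker_profiles, marker_labels):
        groups.setdefault(label, []).append(profile)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}


def classify(profile, centroids, max_distance=0.3):
    """Return (label, distance) for the nearest centroid.

    Proteins farther than max_distance (an arbitrary, assumed cutoff) from
    every centroid are reported as "unknown" rather than forced into a class.
    """
    label, centroid = min(centroids.items(),
                          key=lambda item: math.dist(profile, item[1]))
    distance = math.dist(profile, centroid)
    return (label if distance <= max_distance else "unknown"), distance


centroids = fit_centroids(MARKER_PROFILES, MARKER_LABELS)
predictions = {name: classify(profile, centroids)[0]
               for name, profile in UNLABELED.items()}
for name, label in predictions.items():
    print(name, "->", label)
```

The "unknown" outcome for `protein_Z` hints at why a semisupervised approach is attractive: a protein from a compartment with no markers has nowhere sensible to go in a purely supervised scheme.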

Essentially, the researcher is asking the algorithm a question. In Gatto’s words, the question is: “There’s stuff missing, please can you see if there’s meaning here as well?” Many proteins are also found in multiple locations. Consequently, a researcher may want to ask an additional question, one that Gatto puts as follows: “Is my algorithm able to identify them?”

Global interrogation

Spatial proteomics refers to two different activities, explains Kathryn S. Lilley, PhD, a professor of biochemistry at the University of Cambridge. One activity is looking at different cells within a tissue slice and determining the groups of proteins associated with them. The second activity is identifying where proteins are located within a cell. Lilley says that her research focuses on the second activity.

“Our method has been around in concept for a long time,” Lilley notes. “We first published [about it] in 2004.” Her research was inspired, she recalls, by the difficulty of mapping proteins to specific organelles. Since 50% of proteins were present in more than one location, and many organelles had similar physical properties, even if her team could purify and enrich proteins in one niche, they wouldn’t know where else each protein could be found.

Lilley’s team turned to protein correlation profiling, a technique from the 1950s. Centrifugation spreads the subcellular niches across a series of fractions, and the amount of each protein in each fraction can then be measured. “You get rich and complex data,” she says. “It’s taken us a long time to come up with computational methods to deal with these data and use the method comparatively.”

In a paper that recently appeared in Nature Communications, Lilley and colleagues described how the abundance of proteins in human immune cells changed in the 12 hours after a proinflammatory response was induced by lipopolysaccharide treatment.3 The researchers used novel Bayesian statistical analyses to estimate the probability of a protein belonging in a location and moving to another location under this physiological stress.
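
To make the idea of a location probability concrete, here is a toy calculation using Bayes' rule. It is not the statistical model from the paper: it assumes each location is a simple spherical Gaussian around a mean profile, with equal priors and a made-up spread `sigma`. The location means, the example profiles, and the function `location_posterior` are hypothetical.

```python
import math

# Hypothetical mean fractionation profile for each location.
LOCATION_MEANS = {
    "organelle_A": [0.70, 0.20, 0.05, 0.05],
    "organelle_B": [0.05, 0.70, 0.20, 0.05],
    "organelle_C": [0.05, 0.05, 0.20, 0.70],
}


def location_posterior(profile, means=LOCATION_MEANS, sigma=0.1, priors=None):
    """Posterior probability of each location given one protein profile.

    Likelihood: spherical Gaussian with standard deviation sigma (assumed).
    Priors default to uniform. Scores are handled in log space for stability.
    """
    if priors is None:
        priors = {name: 1.0 / len(means) for name in means}
    log_scores = {
        name: math.log(priors[name])
        - sum((x - m) ** 2 for x, m in zip(profile, mu)) / (2 * sigma ** 2)
        for name, mu in means.items()
    }
    top = max(log_scores.values())
    weights = {name: math.exp(s - top) for name, s in log_scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# The same hypothetical protein measured before and after a stimulus.
before = location_posterior([0.65, 0.25, 0.05, 0.05])
after = location_posterior([0.10, 0.65, 0.20, 0.05])
# A profile halfway between two locations, as a multi-localized protein might show.
mixed = location_posterior([0.375, 0.45, 0.125, 0.05])

best_before = max(before, key=before.get)
best_after = max(after, key=after.get)
print("before:", best_before, "after:", best_after)
print("relocated:", best_before != best_after)
print("mixed profile:", {k: round(v, 3) for k, v in mixed.items()})
```

Comparing the most probable location across conditions flags candidate relocations, while a posterior split between two locations marks a protein that may not belong to a single compartment. A real analysis would also need to model uncertainty in the profiles themselves, which this sketch ignores.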

Figure 1. Probabilistic modeling of the spatial proteome: probability ellipses are modelled using known marker proteins, and used to reliably classify proteins to one or multiple locations. [Image and caption provided by Laurent Gatto, PhD]

“What was surprising, which we didn’t expect to see at such a stark level, was that many proteins weren’t changing in abundance—but they were changing in location,” Lilley points out. This finding currently lacks a definite explanation. “It is not possible,” Lilley and colleagues noted in the paper, “to distinguish trafficking of existing proteins from one location to another, from proteins being differentially degraded at one location and newly synthesized proteins locating to an alternate location.”

“The paper,” Lilley declares, “threw up more questions than it answered!”

Single-cell imaging

Nikolai Slavov, PhD, is also studying proteins at a subcellular level. An Allen Distinguished Investigator and Associate Professor of Bioengineering at Northeastern University College of Engineering, he says his research differs from Lilley’s research in a crucial way: “She is using 10 to 100 million cells and measuring the average across them, whereas we’re doing one cell at a time.”

Slavov’s main project is working on the Human Cell Atlas, an international collaboration to map the position, function, and characteristics of every cell type in the body. He is using single-cell imaging to measure the abundance of RNA and proteins in single cells, and then to localize them in three dimensions within body tissue.

Speaking about the benefits of single-cell imaging, he says: “Population-wide averages have existed for more than a decade and are substantially easier to do, but human cells are different functionally, and that’s why we want to have single-cell resolution.”

As a secondary project, Slavov is using single-cell imaging to measure the abundance of proteins inside macrophages when these immune cells are activated by antigens. “Looking at protein abundance can help us identify cell types,” he notes, “but you need the additional dimension [of single-cell analysis] to understand the molecular mechanisms behind human disease.”

A major application of his work is making tumor-associated macrophages more proinflammatory, so they can be used to fight cancer. He is also seeking to understand the role of macrophages in autoimmune and neurodegenerative disease.

Cataloguing cell types

Charlotte Stadler, PhD, focuses on the first type of spatial proteomics—understanding the difference between cells within the same tissue type. In her research at the KTH Royal Institute of Technology and at SciLifeLab’s Spatial Proteomics Facility, Stadler uses highly multiplexed imaging to look at 20–30 different proteins in each cell.

Figure 3. Image from a colorectal cancer core using sequential indirect immunofluorescence (image provided by Spatial Proteomics unit at SciLifeLab). Colors reflect the markers DAPI (white), Ki67 (green), CD68 (orange), CD3 (red), CD8 (pink), PARP1 (blue).

“We can look for different proteins in a single sample,” she says. Researchers can also, she continues, identify cellular phenotypes and get clues about the expression and cellular function of proteins.

Stadler and colleagues use fluorescence-based methods to study the proteins. According to Stadler, these methods overcome the problems of spectral overlap between multiple fluorophores by looking at a few proteins at a time and washing away the fluorophores between each imaging step.

These methods allow researchers to look at multiple different proteins in the same sample. By using imaging, researchers can visualize proteins at single-cell resolution and capture cell dynamics, including rare events that would be masked if bulk samples of whole cell populations were analyzed.

Figure 2. Optimizing and benchmarking MS analysis with bulk standards modeling SCoPE2 sets. (A) Conceptual diagram and workflow of SCoPE2. Cells are sorted into multiwell plates and lysed by mPOP [24]. The proteins in the lysates are digested with trypsin; the resulting peptides labeled with TMT, combined, and analyzed by LC-MS/MS. SCoPE2 sets contain reference channels that allow merging single cells from different SCoPE2 sets into a single dataset. The LC-MS/MS analysis is optimized by DO-MS [25], and peptide identification enhanced by DART-ID [26].

Stadler emphasizes that these methods are useful for answering any biology-related question where it makes sense to study intact tissue. She gives the example of studying the immune landscape in tumors, including which immune cells are present and how they’re orientated within the tissue. “Many studies,” she notes, “are investigating what this means for patients, and whether it can be used to predict who’d benefit from specific therapies.”

The future of spatial proteomics

Stadler predicts that researchers will use imaging in combination with other methods to combine spatial information with more in-depth data on proteins or other biomolecules. For example, researchers may use laser dissection microscopy to cut out specific cell types, which can be analyzed further using mass spectrometry.

“There are so many methods to study proteins and other biomolecules,” she says. “Finding ways to combine them will tell us much more than using a single method alone.” She anticipates that it will become increasingly common for researchers to combine spatial proteomics with genetic-sequencing-based methods to get a fuller picture of single cells.

For Gagarinova, meanwhile, spatial proteomics will draw upon the skills of computational biologists. “There’s definitely been a demand for competent computational biologists,” she observes. “And the amount of data is growing. So, we’ll be looking at new ways to come up with new insights.” She highlights the role of new supercomputing and data storage facilities in Canada and elsewhere.

Figure 4. The workflow for the spatial proteomics method LOPIT (localization of proteins using isotope tagging). 1) Cells are gently lysed, so their subcellular compartments/organelles remain intact, and then partially separated from one another using a choice of centrifugation methods. 2) The amount of each protein in each fraction is measured using quantitative mass spectrometry. 3) Machine learning tools are applied to the mass spectrometry data to classify proteins into one or more subcellular compartments, and the data are visualized using principal component analysis. [Image and caption provided by Kathryn Lilley, PhD]

Finally, the sky is the limit for applications of spatial proteomics beyond basic research. For example, the techniques of spatial proteomics can be used to study bacterial and viral pathogenesis. This possibility, Gagarinova points out, was emphasized in a report commissioned by the United Kingdom’s Prime Minister. According to this report, antibiotic-resistant bacteria will kill more people than cancer by 2050.4 This assertion, which was concerning when it was made in 2016, has become only more concerning since then. And bacteria are not the only pathogens to cause worry. “COVID-19,” Gagarinova remarks, “has brought the science [of viruses and other pathogens] to the forefront.”

She adds, “We need to keep in front of the next pandemic, as well as understanding the effect of pollution and environmental change on the human body. [With spatial proteomics], we can study [how cells change.]”



  1. Foster LJ, de Hoog CL, Zhang Y, et al. A Mammalian Organelle Map by Protein Correlation Profiling. Cell 2006; 125: 187–199.
  2. Dunkley TPJ, Hester S, Shadforth IP, et al. Mapping the Arabidopsis organelle proteome. Proc. Natl. Acad. Sci. USA 2006; 103(17): 6518–6523.
  3. Mulvey CM, Breckels LM, Crook OM, et al. Spatiotemporal proteomic profiling of the pro-inflammatory response to lipopolysaccharide in the THP-1 human leukaemia cell line. Nat. Commun. 2021; 12(1): 5773.
  4. O’Neill J. Tackling Drug-Resistant Infections Globally: An Overview of Our Work. The Review on Antimicrobial Resistance. 2016.