If a picture is worth a thousand words, cell imaging may be worth a billion. The Imaging Platform at the Broad Institute of MIT and Harvard, along with a dozen non-profit, biotechnology, and pharmaceutical partners, recently announced a collaboration to create a cell-imaging dataset, displaying more than one billion cells responding to more than 140,000 small molecules and genetic perturbations.
Understanding cellular structures and dynamic processes can be critical in the study of cell and developmental biology, neuroscience, and many other fields. With the newly announced dataset, researchers will be able to predict a compound’s activity and toxicity, match drugs to different diseases, identify a drug’s mechanism, and discover and develop new therapeutics.
The twelve partners, collectively called the Joint Undertaking in Morphological Profiling with Cell Painting (JUMP-CP), include:
- Janssen Pharmaceutica NV
- Max Planck Institute of Molecular Physiology
- Merck KGaA, Darmstadt, Germany
- Takeda Pharmaceutical
The Broad Institute’s flagship assay, Cell Painting, which was developed in 2013, will be used to survey cell morphology. It is a morphological profiling assay that multiplexes six fluorescent dyes, imaged in five channels, to reveal eight cellular components or organelles. Metrics extracted from cellular images will capture information to mark how the cell was affected by each chemical compound or genetic perturbation, allowing scientists to explore the underlying biology with the power of automated image analysis.
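To make the idea of image-derived metrics concrete, here is a minimal sketch of how per-cell measurements could be collapsed into one profile per perturbation and then compared. It is an illustration only, not the consortium's pipeline: the feature values, the perturbation names, and the helper functions `aggregate_profiles` and `cosine_similarity` are all hypothetical, and median aggregation with cosine similarity is just one common choice.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def aggregate_profiles(cells):
    """Collapse per-cell feature vectors into one median profile per perturbation.

    `cells` is a list of (perturbation_name, feature_vector) pairs.
    (Hypothetical input format, chosen for illustration.)
    """
    grouped = defaultdict(list)
    for name, features in cells:
        grouped[name].append(features)
    # Take the median of each feature column; the median is robust to a few outlier cells.
    return {
        name: [median(column) for column in zip(*vectors)]
        for name, vectors in grouped.items()
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two profiles; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy measurements: three made-up features per cell.
cells = [
    ("compound_A", [1.0, 2.0, 0.5]),
    ("compound_A", [1.2, 1.8, 0.7]),
    ("compound_A", [0.8, 2.2, 0.6]),
    ("compound_B", [2.0, 4.0, 1.2]),
    ("compound_B", [2.0, 4.0, 1.2]),
    ("compound_C", [-1.0, 0.5, 3.0]),
]

profiles = aggregate_profiles(cells)
sim_ab = cosine_similarity(profiles["compound_A"], profiles["compound_B"])
sim_ac = cosine_similarity(profiles["compound_A"], profiles["compound_C"])
print(f"A vs B: {sim_ab:.3f}")
print(f"A vs C: {sim_ac:.3f}")
```

In this toy data, compound_B shifts the features in the same direction as compound_A, so the two profiles score as highly similar, while compound_C scores much lower. Comparisons of this kind are the sort of signal that could hint at a shared mechanism.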
The initiative received funding through the Massachusetts Life Sciences Center (MLSC) Bits to Bytes capital program. The MLSC funded approximately $750,000 per project, for a total of $6.7 million. The Bits to Bytes capital program aims to drive life science innovation in Massachusetts in the area of big data by providing open access to data. Enabling multiple groups to leverage the same data should, in turn, accelerate discovery and fill an existing funding gap for biomedical research.
The Cell Imaging Consortium will be led by Anne Carpenter, PhD, who is senior director of the Imaging Platform at the Broad Institute of MIT and Harvard, as well as an institute scientist. After earning her PhD in cell biology from the University of Illinois, Urbana-Champaign, Carpenter completed her postdoctoral fellowship at the Whitehead Institute for Biomedical Research and MIT’s Computer Science and Artificial Intelligence Laboratory. Her lab develops algorithms and strategies for large-scale experiments involving images. A major focus is now translating messy bottlenecks in the drug discovery process into solvable data problems.
Carpenter recently told GEN about the inspiration, strategy, and hopes behind the launch of the consortium.
GEN: What was the inspiration or motivation for this consortium?
Carpenter: There has been growing curiosity in the pharmaceutical industry about whether images could be a powerful data source, so several key leaders who had launched internal experiments began discussing best practices amongst each other informally. When MLSC launched their Bits to Bytes call for proposals, we rapidly converted those discussions into an integrated project plan and brought additional industry partners into the effort.
GEN: This is a unique collaboration between academia and industry. How were these 12 partners chosen? What specific resources will these partners bring?
Carpenter: Many of the companies had already been discussing Cell Painting best practices together and thus were natural partners to invite to the initial grant-writing effort. Once the grant was awarded, we contacted as many colleagues as we could in the field, and mentioned the consortium in presentations at conferences, to make sure the project would be as open as possible.
Given the broad array of fields and expertise involved in the assay (from cell biology to microscopy to chemistry to data science), the partners offer diverse expertise and experience, in addition to the actual resources needed to produce such a large dataset. Each partner will be contributing support for a Broad data scientist on the project in addition to producing replicates of the data on-site. The technical variations among the sites will produce data with the diversity we need to work out normalization methods, such that future laboratories’ data can be matched to the public dataset.
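As a rough illustration of what cross-site normalization can look like, the sketch below rescales each site's measurements against that site's own negative-control wells, using the median and the median absolute deviation (MAD). This is an assumption made for illustration, not the method the consortium will adopt: the well records, the site names, and the `robust_normalize` helper are hypothetical.

```python
from statistics import median

# Scaling factor that makes the MAD comparable to a standard deviation
# for normally distributed data.
MAD_SCALE = 1.4826


def robust_normalize(wells):
    """Return one normalized value per well, in the input order.

    `wells` is a list of dicts with keys 'site', 'is_control', and 'value'
    (a hypothetical record format). Each value is centered and scaled using
    the negative controls from the same site, so every site must include
    at least one control well.
    """
    controls_by_site = {}
    for well in wells:
        if well["is_control"]:
            controls_by_site.setdefault(well["site"], []).append(well["value"])

    stats = {}
    for site, values in controls_by_site.items():
        center = median(values)
        mad = median(abs(v - center) for v in values)
        stats[site] = (center, mad * MAD_SCALE)

    normalized = []
    for well in wells:
        center, spread = stats[well["site"]]
        # Guard against a zero spread (all controls identical).
        normalized.append((well["value"] - center) / spread if spread else 0.0)
    return normalized


# Toy example: site_Y's instrument reads ten times brighter than site_X's.
wells = [
    {"site": "site_X", "is_control": True, "value": 10.0},
    {"site": "site_X", "is_control": True, "value": 12.0},
    {"site": "site_X", "is_control": True, "value": 14.0},
    {"site": "site_X", "is_control": False, "value": 18.0},
    {"site": "site_Y", "is_control": True, "value": 100.0},
    {"site": "site_Y", "is_control": True, "value": 120.0},
    {"site": "site_Y", "is_control": True, "value": 140.0},
    {"site": "site_Y", "is_control": False, "value": 180.0},
]

normalized = robust_normalize(wells)
print(f"site_X treated well: {normalized[3]:.3f}")
print(f"site_Y treated well: {normalized[7]:.3f}")
```

Although the raw readings from the two sites differ tenfold, the treated wells land on the same normalized value, which is the property that lets data from different laboratories be compared on a common scale. Real site-to-site variation is messier than a single scaling factor, which is why the diversity across partner sites matters for developing these methods.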
GEN: Have you modeled this partnership on any other academia-industry consortia that you or the Broad have been involved in?
Carpenter: The Broad is a very collaborative institute so it is not unusual for us to bring together diverse partners with shared interests in technology. Ultimately our mission is to push biomedicine forward, so we are really motivated to organize industrial collaborators around testing and implementing technologies that produce medicines faster. The DepMap consortium is one academic-industry collaboration that served as a model as we launched this project.
GEN: What are the metrics that will tell you whether this program has been a success?
Carpenter: Whether data is useful to the scientific community can be judged by how many scientific groups access and use the data for their research. Some subset of scientific advancements will be captured by the number and kinds of publications that make serious use of the data. I’m especially excited to see researchers combine this imaging data with other data types, such as genomic, transcriptomic, and assay data. We know of at least a dozen different kinds of experiments that can use image-based profiles—and, this being the largest public perturbation-based Cell Painting dataset, perhaps there will be entirely new applications researchers devise—that would be especially rewarding to see. Of course, a crucial subset of the use of this data will happen in the pharma industry as they use Cell Painting data in multiple steps of the drug discovery process. The victories there, we may only hear about anecdotally or with a delay of some years as medicines are discovered.
GEN: Once the image dataset is created, it will be available to consortium members for the first year, after which it will be freely available for use by the scientific community. Why is there that embargo period rather than it being available to everyone right away?
Carpenter: The one year of data exclusivity is one of the major benefits to the consortium members, who will have put in a lot of effort to design the project overall and to optimize and validate the experimental parameters (in addition to creating the data). The consortium team will use the year to carry out interesting analyses together, in pre-competitive fashion, and publish papers to share results and insights with the community.
The approach has already yielded results: four drugs from Recursion Pharmaceuticals are entering clinical trials. The scientific community has also expressed excitement about the initiative. For example, David Gaboriau, PhD, a FILM microscopy specialist at Imperial College, told GEN:
“This large-scale initiative with multiple partners from the pharmaceutical and biotechnology industry is exactly what is needed to enable new discoveries. Multi-feature analysis of microscopy images of cells which have been exposed to large numbers of compounds or perturbations is the future of drug discovery. Additionally, the fact that the dataset will be freely available to academic researchers after a year will generate new breakthroughs in multiple disease areas.”
The cell imaging consortium’s new approach to drug discovery holds promise for alleviating bottlenecks in the typical pharmaceutical pipeline. By giving the scientific community the ability to quickly identify the impact of therapeutics, and by combining industry and academic innovation, the consortium aims to accelerate drug discovery and development.