Biobanks are an important resource in the elucidation of the molecular mechanisms of diseases (such as interactions between genes, proteins, and epigenetic modifications), and they provide enormous opportunities for researchers worldwide to explore census-like multisource healthcare data. The accessibility, management, and analysis of many high-quality biospecimens are the foundation for important research aimed at improving precision medicine.
At the heart of biobank-driven research is the ability of industrial and academic innovators to easily access biospecimens and associated data with confidence from a network of trusted suppliers. Along these lines, Medicines Discovery Catapult (MDC), a not-for-profit organization funded by the U.K. government, recently announced the rapid expansion of its Biosamples supply network, creating one of the world’s largest virtual biobanks accessible to life sciences innovators. Now spanning over 330 clinical sites and 1.5 million banked biosamples, the virtual network increases the scale, range, and types of clinical samples available to U.K. innovators in drug discovery and diagnostics.
Peter Simpson, PhD, CSO, MDC, says, “Despite millions of U.K. patients donating samples for scientific research, access remains a barrier to R&D productivity. At MDC, we are expanding our tissue samples network further because we are committed to ensuring that life sciences organizations can readily access the samples they need to conduct experiments that are vital for the discovery of new medicines.”
This network has already been employed to support the United Kingdom’s response to coronavirus, providing diagnostic innovators, researchers, and industry grant applicants with ethical routes to biosamples not previously available to many in the United Kingdom. MDC plans to extend this approach across all disease areas to rapidly accelerate drug discovery and diagnostic projects.
Biobanking is reaching a point where researchers need more than just access to samples and data from biobanks. John Ellithorpe, PhD, president and chief product officer, DNAnexus, thinks that innovation depends on the management of large-scale data and on digital environments for analysis.
“To democratize the science, we need to provide environments to do work on these large-scale data,” says Ellithorpe. “DNAnexus is providing a research platform that allows you to manage data, and because the data is so large and complex, we bring the methods to the data and provide the analysis capabilities. We are a big believer in enabling data analysis and the translation of research into treatments and diagnoses.”
Ellithorpe acknowledges that providing digital environments to manage and analyze biobank data poses challenges, starting with sheer size. For example, over the next five years, the UK Biobank’s data will grow to 15 petabytes, equaling the amount of data created annually by the Large Hadron Collider.
“There’s also a lot of complexity to standardizing what data is so that it’s reliable from a research standpoint,” notes Ellithorpe. “There’s not one data set that will solve every problem, and as a result, there’s going to be a need for a lot more controlled and responsible sharing of data so that people can research and answer questions to improve healthcare.
“For example, let’s say you want to validate some interesting drug target that you may have in your own data set, but then you want to compare it to another data set. You need to make sure that you’re speaking the same language and understanding the same semantics.”
Data management and methodology trends
There are many biobanks already out there that cover clinical and even genomic data sets, but there are few publicly available resources that have sufficient omics data sets regarding transcriptomes, proteomes, and metabolomes. For this reason, Evotec has been continuously investing in building up its own biobanks in collaboration with academic partners to access clinical patient data sets and provide patient samples for omics analysis.
“We believe in a long-term strategy where we stratify patient populations, not according to symptoms, but rather by molecular data sets that pinpoint disease-relevant mechanisms, actions, and perturbations on the transcriptome, proteome, and metabolome levels,” says Cord E. Dohrmann, PhD, CSO, Evotec. “In the future, diseases will probably be categorized via molecular mechanisms that have gone awry rather than symptoms.”
Dohrmann puts this in the context of personalized medicine. He thinks that without an understanding of the molecular mechanisms of disease, it will be difficult to design and select the next generation of drug targets and to intervene medically in a more precise manner.
“Without access to these kinds of databases and tools to analyze these kinds of omics data sets, you are essentially flying blind in terms of understanding disease mechanisms and being able to select the next generation of drug discovery projects,” insists Dohrmann. “The more you focus on complex omics-driven disease signatures, you should be able to identify and then use disease signatures to monitor disease progression regarding potential drug targets, compounds, or drug candidates and monitor their potential efficacy and safety profiles.”
Dohrmann thinks that one of the key challenges is that many of these methods are not currently designed for higher throughput. Along these lines, he proposes that the biggest area for growth is making omics a mainstay in the analysis of patient tissues and patient samples—ideally at the single-cell level. This requires further improvement of omics technologies in terms of robustness and throughput. To address this, Evotec is building platforms, especially in the transcriptome and proteome fields, to enable higher throughput in a robust fashion.
Among the many lessons learned from the COVID-19 pandemic is that the world remains dangerously exposed to novel and unknown health risks. Biogen, along with the Broad Institute of MIT and Harvard, and Partners HealthCare, announced earlier this year a consortium that will build and share a COVID-19 biobank. The biobank will help scientists study a large collection of de-identified biological and medical data to advance knowledge and search for potential vaccines and treatments.
Biogen will connect employees who wish to volunteer with the project. The volunteers are among the first people in Massachusetts to be diagnosed with and recover from COVID-19, as well as close contacts of those individuals, including people who were not tested or who may not have had symptoms.
“The COVID-19 pandemic has had a very direct, very personal impact on our Biogen community,” said Maha Radhakrishnan, MD, chief medical officer at Biogen. “We are uniquely positioned to contribute to advancing COVID-19 science in an organized and deliberate way so we can all gain a better understanding of this virus. Many Biogen colleagues have been eager to find ways to help others during this pandemic, and it is our hope that this biobank will provide hope and essential information during this difficult time. It is an opportunity to activate and bring together our commitment to science with the needs of humanity, and we are proud to participate.”
Dependable biobank samples, data, and analytics tools can help scientists develop faster responses not only for global pandemics but also for long-term improvements in patient outcomes and quality of life for millions suffering from debilitating diseases. Ideally, there will be much more collaboration around the world so that the organizations that need biospecimens and data can interact with those that manage these resources and together advance biomedical research.