Researchers at The Genome Analysis Centre (TGAC) in Norwich, U.K., say they have developed a unique bioinformatics approach for identifying associations between molecules across a range of large data sources. Their report (“ONION: Functional Approach for Integration of Lipidomics and Transcriptomics Data”), published in PLOS ONE, describes studies that focused on measuring metabolism in tissues under varying conditions, such as genetics, diet, and environment.
In contrast to current methods, which apply statistical analysis to data sets as a whole, the proposed workflow breaks the initial data into smaller groups determined by known molecular interactions. Statistical methods can then be applied to these groups, yielding more accurate results than if the analysis had been applied to the whole dataset. In an example mouse nutritional study, the technique improved the detection of genes related to lipid metabolism by 15%, according to the scientists, increasing our understanding of biochemical fluctuations.
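The group-then-analyze idea described above can be illustrated with a minimal Python sketch. It is not the ONION implementation: the gene and lipid names, the sample values, the `groups` mapping, and the choice of Pearson correlation as the statistic are all assumptions made for illustration. The point is only that a statistic is computed within each interaction-defined group rather than across every gene-lipid pair in the dataset.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlate_within_groups(groups, gene_expr, lipid_abund):
    """Correlate genes with lipids only inside each predefined group."""
    results = {}
    for name, members in groups.items():
        pairs = product(members["genes"], members["lipids"])
        results[name] = {
            (gene, lipid): pearson(gene_expr[gene], lipid_abund[lipid])
            for gene, lipid in pairs
        }
    return results


# Hypothetical measurements across four samples (made-up values).
gene_expr = {
    "geneA": [1, 2, 3, 4],
    "geneB": [4, 3, 2, 1],
    "geneC": [1, 3, 2, 4],
}
lipid_abund = {
    "lipidX": [2, 4, 6, 8],
    "lipidY": [1, 1, 2, 2],
}

# Hypothetical groups, standing in for groups derived from known
# molecular interactions.
groups = {
    "group1": {"genes": ["geneA", "geneB"], "lipids": ["lipidX"]},
    "group2": {"genes": ["geneC"], "lipids": ["lipidY"]},
}

results = correlate_within_groups(groups, gene_expr, lipid_abund)
for group_name, correlations in results.items():
    for (gene, lipid), r in correlations.items():
        print(f"{group_name}: {gene} vs {lipid} r = {r:.3f}")
```

In this toy setup only three gene-lipid pairs are tested instead of all six possible pairs, which shows how restricting the analysis to known interactions narrows the comparisons a statistical method has to make.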
Identifying associations between metabolites and genes is crucial to understanding processes in the cell. However, uncovering these relationships is a complex task, especially when integrating data that concerns various types of molecules. Adding to this complexity is the vast quantity of data available for analysis, a result of the development of new experimental high-throughput techniques.
Initially, the workflow will be applied to research into the benefits of broccoli in prostate cancer, in collaboration with the Institute of Food Research. It will also be applied to studying the health benefits of flavonoids, plant metabolites found in a variety of fruits and vegetables, in collaboration with the University of East Anglia.
“By improving our capability to integrate data from various sources and identify links between metabolites and genes, this workflow will provide a more detailed diagnosis of cellular metabolism and gene expression in biological processes,” said study co-author Wiktor Jurkowski, Ph.D., integrative genomics group leader at TGAC. “Knowledge gathered in molecular networks can be harnessed to improve data integration and interpretation.”
Dr. Jurkowski explained that his team’s approach, integrating transcriptomics and metabolomics data, will help interpret signals measured by omics techniques to extend our knowledge of processes under specific biological conditions.
“This will benefit biologists in interpreting data, creating better hypotheses, and pinpointing genes and metabolites involved to unravel the mechanism of interest,” he continued. “This is a proof-of-concept study and we are currently working towards improving the group generation strategy for sparse areas of the interactome and less annotated species. We are applying this and other molecular network approaches to data generated in collaborative projects across Norwich Research Park.”