Over the last few years, we have witnessed the rapid evolution of software tools for functional analysis of microarray gene expression, proteomics, metabolomics, and other omics data. Integrated data-mining platforms, such as MetaCore and MetaDrug (GeneGo; www.genego.com), combine manually curated databases of protein interactions and pathways with sophisticated network-analysis tools, and are becoming mainstream in drug discovery and life science research.
The networks and pathways are generated from subsets of high-fidelity binary protein-protein, protein-compound, and protein-DNA interactions collected in the database, and are then analyzed statistically for their relevance to specific functional processes, diseases, and toxicity categories. The subsets of affected proteins, genes, and metabolites are defined in the omics experiments, which typically deal with human tissue samples in different therapeutic areas, or with human cell lines and mouse and rat data in toxicogenomics and drug-response studies.
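The text does not name the statistical test used to score relevance. As a minimal sketch, assuming a one-sided hypergeometric (over-representation) test, which is a common choice for this kind of analysis, the relevance of a set of affected genes to a functional category could be estimated as follows. The function name and its parameters are hypothetical and are not part of MetaCore or MetaDrug.

```python
from math import comb


def enrichment_p_value(background_size, category_size, sample_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    Assumption for illustration only: relevance is scored by asking how
    likely it is to see at least `overlap` category members when
    `sample_size` genes are drawn at random from a background of
    `background_size` genes, `category_size` of which belong to the category.
    """
    max_overlap = min(category_size, sample_size)
    total = comb(background_size, sample_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(category_size, i)
        * comb(background_size - category_size, sample_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Example call: 10 background genes, 4 in the category,
# 3 affected genes of which 2 fall in the category.
p = enrichment_p_value(10, 4, 3, 2)
```

A small p-value suggests that the affected genes fall into the category more often than chance would explain. In practice, a correction for testing many categories at once would also be needed.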
Reliance on the backbone of high-fidelity interactions extracted from full-text, small-experiment articles is key to the analysis of inherently error-prone omics data sets, which otherwise are poorly comparable. However, high quality comes at the cost of restricting the analysis to the subset of mammalian proteins (genes) whose function is experimentally proven and for which interactions are published in small-experiment literature. Such information is not yet available for almost half of human proteins (defined as mRNAs).
On the other hand, omics experiments, such as yeast two-hybrid assays, co-expression, pull-down immunoprecipitation, ChIP-chip, microRNA assays, and high-content screening, are a rich source of putative physical interactions and functional associations between uncharacterized and known proteins, as well as novel interactions between known proteins. Large pools of such custom interactions are accumulated at drug companies and in the public domain. Integration, alignment, and prioritization of potentially IP-rich but low-trust omics data with high-trust small-experiment data is the subject of active research in functional data-mining and network analysis.
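To make the alignment step concrete, the following sketch sorts custom interactions into the groups mentioned above by comparing them with a curated set. It is an illustration under stated assumptions, not the method used by MetaCore: interactions are treated as undirected pairs, a protein counts as "known" if it appears in any curated interaction, and the function name, labels, and protein identifiers are hypothetical placeholders.

```python
def classify_custom_interactions(curated, custom):
    """Label each custom (low-trust) interaction relative to curated ones.

    curated, custom: iterables of (protein_a, protein_b) pairs.
    Returns a dict mapping each custom pair to one of three labels.
    """
    curated = list(curated)
    # Undirected comparison: (A, B) and (B, A) are the same interaction.
    curated_edges = {frozenset(pair) for pair in curated}
    known_proteins = {protein for pair in curated for protein in pair}

    labels = {}
    for a, b in custom:
        if frozenset((a, b)) in curated_edges:
            labels[(a, b)] = "confirms_curated"
        elif a in known_proteins and b in known_proteins:
            labels[(a, b)] = "novel_between_known"
        else:
            labels[(a, b)] = "involves_uncharacterized"
    return labels


# Placeholder identifiers, not real experimental data.
curated_pairs = [("P1", "P2"), ("P2", "P3")]
custom_pairs = [("P2", "P1"), ("P1", "P3"), ("P3", "U7")]
labels = classify_custom_interactions(curated_pairs, custom_pairs)
```

A prioritization scheme could then rank the groups differently, for example by reviewing interactions that confirm curated data first, but any such ordering is a design choice that the text leaves open.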
Here, we present three novel tools that help analyze custom interactions within MetaCore and MetaDrug.
The general schema of integration and analysis of custom interactions is shown in Figure 1. The interactions can be visualized on networks using MetaLink™ and on static, interactive, canonical pathway maps using MapEditor, and/or added to the underlying MetaCore interactions database using Pathway Editor. In all cases, the interactions themselves, as well as the derivative products (networks, pathway maps, and report tables), are accessible in secure user accounts and can be shared with individual users or groups of users.
Typically, interactions are not the only type of data analyzed by users; they accompany different molecular datasets, such as gene expression levels from microarray experiments, MS proteomics data, or metabolite concentration data. Therefore, the interaction tools work in sync with the MetaCore modules for visualization and statistical analysis of molecular data.