Integration of Data
“Graphical visualization of a pathway is only the beginning of the integration trend,” comments Jordan Stockton, Ph.D., marketing manager, informatics, Agilent Life Sciences and Chemical Analysis (www.chem.agilent.com). “What everyone wants to see is the overlap of various types of measurements, such as gene expression, metabolic profiles, DNA-protein binding events, and chromatin remodeling. We know how to run these experiments, but the informatics is a real bottleneck. Software providers are still a step behind the demand for the bridging of these technologies.”
Agilent provides instrumentation and tools for various types of genomic and proteomic analysis and integrates the findings via its GeneSpring Analysis platform. The workgroup-enabled component facilitates the exchange of data between users. The plug-in modules for GX (gene expression), GT (genotyping), CGH (comparative genomic hybridization), and MS (mass spectrometry) can analyze and cross-reference data on a large scale. All modules share a similar interface, making the software easier to learn.
“There are a number of reasonable home-brew analysis programs,” contends Dr. Stockton. “However, none of them measures up to the level of processing and integration that Agilent provides.”
BioDiscovery (www.biodiscovery.com) helps companies integrate data via GeneDirector, a comprehensive microarray data-management solution. The enterprise software package covers the entire microarray workflow, from sample management and tracking through automated image analysis and results generation. The program ensures high-quality data by using an Oracle-based data-management platform that maintains and enforces relationships among all data generated in the experiment.
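As a rough illustration of what it means for a database to enforce relationships among experimental data, the sketch below uses Python's built-in sqlite3 module in place of Oracle. The table and column names are hypothetical and are not GeneDirector's schema; the point is only that a result row cannot be stored unless the sample it refers to exists.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle back end (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical two-table schema: every array result must point at a known sample.
conn.execute("CREATE TABLE sample (sample_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE array_result (
           result_id  INTEGER PRIMARY KEY,
           sample_id  INTEGER NOT NULL REFERENCES sample(sample_id),
           image_file TEXT NOT NULL)"""
)

conn.execute("INSERT INTO sample (sample_id, name) VALUES (1, 'hypothetical sample')")
conn.execute("INSERT INTO array_result (sample_id, image_file) VALUES (1, 'scan_001.tif')")


def try_insert_orphan(connection: sqlite3.Connection) -> bool:
    """Try to store a result for a sample that does not exist; report whether it was accepted."""
    try:
        connection.execute(
            "INSERT INTO array_result (sample_id, image_file) VALUES (99, 'scan_002.tif')"
        )
    except sqlite3.IntegrityError:
        return False  # the database rejected the orphaned row
    return True


orphan_accepted = try_insert_orphan(conn)
```

Because the constraint lives in the database rather than in each analysis tool, every program that writes results is held to the same rule.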
BioDiscovery provides software modules compatible with most popular microarray instruments, such as those from Agilent (www.agilent.com) and Affymetrix, and plans to come out with CGH and MS tools in the near future. “Many companies offer exploratory desktop analysis tools,” says Soheil Shams, Ph.D., founder and president of BioDiscovery. “It is a crowded space with limited potential for qualitative software improvement. In contrast, we come in with an infrastructure for systematic exploration using standardized quality control and analysis tools.
“Our upcoming ARM (Array Result Manager) System provides a novel interface, enforcing analytical SOPs. A company will be able to perform routine data analysis according to its own SOPs and store, analyze, and retrieve the analysis results in a uniform, easily traceable, and compatible format.”
BioDiscovery provides flexible licensing terms, depending on the number of modules or on the in-house-derived analysis software in use. “Many companies believe that they can write the best analysis algorithms themselves. You could say that if these algorithms were cars, we are providing the roads for them to ride on,” adds Dr. Shams.
“The key issue faced now by bioinformatics providers is integration of different algorithms and data in today’s bioinformatics environment,” says Darryl Gietzen, Ph.D., product manager for bioinformatics at SciTegic (www.scitegic.com), a wholly owned subsidiary of Accelrys (www.accelrys.com). The company’s Pipeline Pilot platform organizes streams of data coming from different sources (microarray, sequence, chemistry) and in different formats (text, database, binary, numerical).
Pipeline Pilot software enables processing, analysis, and mining of large volumes of data via a user-defined computational protocol. The data is piped in real time through a network of modular computational components. The data path can be easily changed by shuffling the computational components in the graphic interface.
Pipeline Pilot technology eliminates the boundaries of individual databases. For instance, a sequence query processed by Pipeline Pilot may contain the following modules: read Affymetrix ID, map to a gene ID, collect gene ontology (GO) information, collect KEGG (Kyoto Encyclopedia of Genes and Genomes) information, map to a chromosome, map SNPs, retrieve gene sequence, perform a BLAST against a patent database, collect best hit, and create a cumulative report. The results of the query are summarized in customizable reports.
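The idea of a query built from interchangeable modules can be pictured as a list of small functions applied in order to each record. The following is a minimal sketch in plain Python, not Pipeline Pilot's actual interface: the function names, the lookup tables, and the probe and gene identifiers are all made-up placeholders, and only three of the steps named above are represented.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Component = Callable[[Record], Record]

# Hypothetical lookup tables standing in for real annotation sources.
PROBE_TO_GENE = {"probe_001": "GENE_A", "probe_002": "GENE_B"}
GENE_TO_GO = {"GENE_A": ["example GO term 1"], "GENE_B": ["example GO term 2"]}
GENE_TO_CHROMOSOME = {"GENE_A": "chr1", "GENE_B": "chrX"}


def map_to_gene(record: Record) -> Record:
    """Attach a gene ID to a record that carries a probe ID."""
    return {**record, "gene_id": PROBE_TO_GENE.get(record["probe_id"])}


def collect_go(record: Record) -> Record:
    """Attach gene ontology terms for the record's gene ID."""
    return {**record, "go_terms": GENE_TO_GO.get(record.get("gene_id"), [])}


def map_to_chromosome(record: Record) -> Record:
    """Attach the chromosome on which the record's gene lies."""
    return {**record, "chromosome": GENE_TO_CHROMOSOME.get(record.get("gene_id"))}


def run_pipeline(records: Iterable[Record], components: List[Component]) -> Iterator[Record]:
    """Stream records one at a time through every component, in order."""
    for record in records:
        for component in components:
            record = component(record)
        yield record


def make_report(records: Iterable[Record]) -> List[str]:
    """Summarize annotated records as tab-separated lines."""
    return [
        f"{r['probe_id']}\t{r['gene_id']}\t{r['chromosome']}\t{'; '.join(r['go_terms'])}"
        for r in records
    ]


# The "protocol" is just an ordered list, so steps can be added, dropped, or rearranged.
protocol = [map_to_gene, collect_go, map_to_chromosome]
probes = [{"probe_id": "probe_001"}, {"probe_id": "probe_002"}]
report = make_report(run_pipeline(probes, protocol))
```

In this sketch, swapping `collect_go` and `map_to_chromosome` changes nothing in the output because both depend only on the gene ID, whereas `map_to_gene` has to run before either of them.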
The web client for Pipeline Pilot enables anyone on the intranet to process his or her data through the predesigned pipelines. Accelrys is implementing Pipeline Pilot to manage and integrate the data flow through its numerous software products.
The Vector NTI software package from Invitrogen (www.invitrogen.com) is a well-recognized benchtop tool for DNA analysis and manipulation. “The current research environment demands more and more integration. Software providers have to adapt their products to ensure smooth integration with the existing online tools, IT infrastructure, and data stream from various applications. Whatever the plans and results of the day-to-day bench work, they have to be managed by a single portal,” says James Caffrey, Ph.D., marketing manager, bioinformatics.