Apr 15, 2010 (Vol. 30, No. 8)

New Tools for Dealing with Data Overload

Advanced Solutions Help Organize and Optimize Vast Amounts of Information

  • Agent-Based Modeling

    Agent-based modeling is an approach that can be used to develop hypotheses and make predictions about a biological system. Like many of the tools of systems biology, it benefits from a multiscale approach. However, this creates an extremely dense dataset that can be unwieldy and difficult to integrate into a functional workflow.

    Thomas Deisboeck, M.D., associate professor, radiology, Massachusetts General Hospital, Harvard Medical School, will present his work on a hybrid discrete and continuum agent-based cancer model. The model simulates each cancer cell individually, equipped with its own cell-signaling pathways, and the cells interact on three-dimensional lattices that resemble microenvironments in tissue. This cross-scale technique allows validation of biomarkers and discovery of novel molecular targets.
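
    To make the idea of cells as agents on a three-dimensional lattice concrete, here is a toy sketch in Python. It is not Dr. Deisboeck's model: the lattice size, the nutrient units, the single integer standing in for a cell's signaling state, and the division rule are all assumptions made for illustration. A per-site nutrient value plays the role of the continuum part, and it neither diffuses nor replenishes in this simplified version.

```python
import random

# All names and numbers below are hypothetical, chosen only for illustration.
SIZE = 15               # lattice is SIZE x SIZE x SIZE sites
NUTRIENT_PER_SITE = 10  # starting nutrient units at every site
DIVIDE_AT = 5           # signal level at which a cell divides

# The six face-adjacent neighbors of a site on a cubic lattice.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def free_neighbors(site, cells):
    """Return the in-bounds neighboring sites that hold no cell."""
    x, y, z = site
    candidates = [(x + dx, y + dy, z + dz) for dx, dy, dz in NEIGHBORS]
    return [c for c in candidates
            if all(0 <= v < SIZE for v in c) and c not in cells]


def simulate(steps, seed=0):
    """Grow a toy tumor from one cell; return (cells, nutrient)."""
    rng = random.Random(seed)
    # Continuum stand-in: a nutrient amount stored for every lattice site.
    nutrient = {(x, y, z): NUTRIENT_PER_SITE
                for x in range(SIZE) for y in range(SIZE) for z in range(SIZE)}
    # Discrete part: each occupied site maps to that cell's signal level.
    center = (SIZE // 2, SIZE // 2, SIZE // 2)
    cells = {center: 0}
    for _ in range(steps):
        # Iterate over a snapshot so newborn cells first act on the next step.
        for site in list(cells):
            taken = min(1, nutrient[site])   # take up one unit if available
            nutrient[site] -= taken
            cells[site] += taken
            if cells[site] >= DIVIDE_AT:
                free = free_neighbors(site, cells)
                if free:
                    cells[site] = 0                  # mother resets
                    cells[rng.choice(free)] = 0      # daughter fills a free site
    return cells, nutrient


cells, nutrient = simulate(steps=30)
print(len(cells), "cells after 30 steps")
```

    Because each site's nutrient runs out, cells deep inside the cluster stop dividing while cells at the edge keep growing. That is only a crude analogue of the emergent patterns the article describes, but it shows how simple per-cell rules produce population-level behavior.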

    Dr. Deisboeck has been able to observe in silico patterns that emerge in multicellular populations over time, in brain and lung cancer, which can then be compared to histology samples and patient imaging data. Using a patient’s MRI as a starting point, Dr. Deisboeck simulates tumor growth, and has begun to predict patient-specific cancer progression. “We were able to predict, in a patient case study, tumor recurrence earlier than it was visible on MRI,” says Dr. Deisboeck, referring to a retrospective case study of an individual with a brain tumor.

    Normally, the resolution limits of conventional imaging technology would not allow physicians to study the cancer on a single-cell level. Dr. Deisboeck’s agent-based modeling uses simulation techniques to push the resolution beyond the natural limits of the instrument, providing a personalized computational model of an individual cancer. Another limitation exists on the computational side of the experiment, where extremely dense datasets strain the capacity of the systems.

    Dr. Deisboeck notes that multiscale, multiresolution modeling, while still at a nascent stage, will allow scientists to simulate selectively at various levels of granularity. They can choose higher resolution for areas of interest, such as the margins of the tumor where growth and invasion are more likely to occur, and lower resolution for areas that are less likely to change quickly, such as the center of the tumor. This makes the dataset more manageable and less computationally costly, which helps when simulating progression across multiple scales up to clinically relevant tumor sizes.
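
    The resolution trade-off can be sketched in a few lines of Python. The example below is an illustrative assumption, not the published method. It treats a tumor as a set of occupied lattice sites, keeps cells on the margin (any cell with an empty neighbor) at single-cell resolution, and summarizes interior cells as counts per coarse block. The function names, the block size, and the spherical test tumor are all hypothetical.

```python
from collections import Counter

# The six face-adjacent neighbors of a site on a cubic lattice.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def split_by_resolution(cells, block=4):
    """Split occupied sites into fine-grained margin cells and coarse blocks.

    cells: set of (x, y, z) occupied lattice sites.
    Returns (margin, interior_blocks), where margin is a set of sites and
    interior_blocks counts interior cells per block-sized cube.
    """
    margin = set()
    interior_blocks = Counter()
    for x, y, z in cells:
        on_margin = any((x + dx, y + dy, z + dz) not in cells
                        for dx, dy, dz in NEIGHBORS)
        if on_margin:
            margin.add((x, y, z))          # keep at single-cell resolution
        else:
            # Coarse-grain: record only how many cells fall in each block.
            interior_blocks[(x // block, y // block, z // block)] += 1
    return margin, interior_blocks


def ball(radius):
    """Hypothetical test tumor: all lattice sites within a sphere."""
    r2 = radius * radius
    span = range(-radius, radius + 1)
    return {(x, y, z) for x in span for y in span for z in span
            if x * x + y * y + z * z <= r2}


tumor = ball(10)
margin, blocks = split_by_resolution(tumor)
print(len(tumor), "cells stored as", len(margin), "margin cells and",
      len(blocks), "interior blocks")
```

    No cell is lost in the split, but the interior is described by far fewer records than one per cell, which is the kind of saving that makes larger simulated tumors tractable.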

  • Knowledge Management

    The presentation by Yuri Nikolsky, Ph.D., CEO of GeneGo, will focus on systems for knowledge management in large organizations. Pharmaceutical companies often end up with disparate databases as a result of acquiring many smaller companies, operating at many locations, or maintaining many separate databases. Scientists may all be doing the same thing but using slightly different words and organizational schemes.

    Bringing all of that data together requires the construction of a knowledge base and a common ontology. GeneGo specializes in creating these systems for an organization, using its underlying database, Metabase, as the infrastructure. GeneGo manually exchanges the contents and classifies data into ontologies, which function like file folders in the system. The company uses a controlled vocabulary to standardize the data from different sources and to make data and metadata easier to find. This is done by creating synonym libraries, which can contain as many as three million entries.

    “GeneGo is working on a lot of ontology projects for pharma where we help them with controlled vocabulary. For example, if someone calls something a Granny Smith and another one a Golden Delicious, we call them apples and add them to the database,” says Julie Bryant, vp, business development and sales. All of this structure overlays the company’s original infrastructure—rather than replacing it completely—helping to preserve the functionality that was there originally.
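
    A synonym library of this kind can be pictured as a lookup table from each variant name to one canonical term. The Python sketch below is a hypothetical illustration only. It does not reflect how Metabase is implemented, and the function names, the normalization rule (lowercasing and collapsing whitespace), and the unmapped example term "Quince" are assumptions.

```python
# Hypothetical controlled vocabulary: canonical term -> known synonyms.
# The apple entries follow the example in the article.
VOCABULARY = {
    "apple": ["Granny Smith", "Golden Delicious"],
}


def _key(term):
    """Normalize a term so case and spacing differences do not matter."""
    return " ".join(term.lower().split())


def build_synonym_library(vocabulary):
    """Flatten the vocabulary into a synonym -> canonical term lookup."""
    library = {}
    for canonical, synonyms in vocabulary.items():
        library[_key(canonical)] = canonical
        for synonym in synonyms:
            library[_key(synonym)] = canonical
    return library


def to_canonical(term, library):
    """Return the canonical term, or None if the term needs curation."""
    return library.get(_key(term))


LIBRARY = build_synonym_library(VOCABULARY)

# Records as they might arrive from different source databases.
records = ["Granny Smith", "golden  delicious", "Apple", "Quince"]
mapped = [to_canonical(r, LIBRARY) for r in records]
print(mapped)
```

    Returning None for unknown terms, rather than passing them through unchanged, leaves a visible queue of entries for a curator to review, which fits the manual classification step described above.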

    In addition to remodeling a company’s data infrastructure, the Metabase knowledge-management system is useful for creating open-access data repositories for sharing between institutions. Some of GeneGo’s current customer projects are open in nature. “Several pharmaceutical companies have come to us and said, ‘We want to put things in an open forum. So, we want the ontologies you’ve built to be industry standard and put them out there in the public domain,’” says Bryant.

    Newer technologies for data integration and analysis emphasize sleek solutions that fit into an existing workflow, rather than replace it. Significant advances include methods for removing unnecessary information from large datasets to make them more compact and tools that make it easier to obtain and share information. Advanced data-integration and analysis tools make it possible to leverage the full power of a high-content screen or a database and aid in hypothesis development, decision-making, and prediction.


