April 15, 2005 (Vol. 25, No. 8)

Harnessing the Power of New High Throughput Systems and Effectively Managing the Data Produced

In this post-genomic era, drug discovery and other basic life science research sectors are increasingly utilizing nucleic acid microarrays for comprehensive gene expression studies.

Drug discovery labs depend on microarrays to help elucidate disease pathways, measure host response to pathogens, screen for targets, and even support preclinical and clinical testing of candidate therapeutics.

The power of expression analysis can speed the movement of compounds through the drug pipeline at several points, even including clinical trials where it might be employed to measure toxicity, evaluate efficacy, and monitor patient response to therapy.

However, the name of the genomics game isn’t merely sequencing genomes and collecting massive quantities of microarray data anymore. The “sizzle” is now in the so-called transcriptome, i.e., the transcriptional-level expression phenotype during any given biological state of an organism.

Furthermore, increasingly sophisticated informatics and computational biology applications aid not only in achieving but also in amplifying the findings of focused gene expression studies.

Genome-scale snapshots of the functional state of an organism are being captured by microarrays and then analyzed in silico by an increasingly interconnected network of scientists utilizing a rapidly growing shared database of genomic and pathway information.

The result is a dynamic understanding of organismal biology and a powerful force propelling drug discovery.

Knowledge Framework

Creating efficient, uniform databasing and data-sharing systems is critical for a vibrant computational biology network. For example, Ingenuity Systems (www.ingenuity.com) has developed a structured knowledge framework, a concept that won Stanford University’s 1998 BASES Entrepreneur’s Challenge.

Core data are harvested from peer-reviewed journal articles, curated by Ph.D. scientists extracting findings based on an extensive ontology that spans molecules (genes, proteins, small molecules), biological processes at the cellular level (DNA damage, cell differentiation, adhesion, etc.) and at the organismal level (hepatotoxicity, knock-out phenotypes, disease, etc.).

Ingenuity Pathways Analysis (IPA) provides scientists with a set of tools for better understanding biological mechanisms based on genomic and proteomic experiments. IPA is intended for scientists working across the life sciences, from early research to development and clinical applications.

Megan Laurance, Ph.D., Ingenuity product research scientist, explains that “one of our customers, Paul Mischel, M.D., at UCLA, is using genome-wide expression analysis on patient samples to identify genes and proteins that are perturbed in glioblastoma. Dr. Mischel’s goal is to identify new markers of glioblastoma, as well as novel therapeutic targets.”

(For a recent review see “DNA-microarray analysis of brain cancer: molecular classification for therapy,” Nat Rev Neurosci 10:782-92.)

According to Dr. Laurance, Dr. Mischel’s use of IPA highlighted a pathway that contained known markers and targets of glioblastoma. What was unexpected was that in the middle of the pathway a gene was found that had never before been implicated as a key player in glioblastoma.

“So he went from a large set of data to confirming various aspects of the glioblastoma disease model to discovering a new potential target rather quickly,” notes Dr. Laurance.

“Armed with practical mechanistic information about that pathway, he went back into the lab and used a variety of techniques (phosphotyrosine blots, RT-PCR, siRNA) to confirm the role of that gene in glioblastoma,” says Dr. Laurance.

Whole Genome Array

Affymetrix (www.affymetrix.com) is one of the pioneers in the manufacture and application of high density microarrays for high throughput genomic analysis. The firm’s GeneChip Array technology enables scientists to connect disease states with genetic variations.

Recent analytical accomplishments include the elucidation of interactions between signaling pathways involved in development, the discovery of a new type of leukemia, and the development of new assays to track drug metabolism.

In a new collaboration with the NCI, Affymetrix is trying to build a whole-genome tiling array to interrogate the genome at resolutions approaching every nucleotide. Such a chip would have the potential to detect regions of transcription, transcription factor binding sites, sites of chromatin modification, sites of DNA methylation, and chromosomal origins of replication.

Detecting gene expression states across a genome can be critical for a complete picture of cell function. According to Affymetrix, it is estimated that between 0.2 and 10% of the 10,000 to 20,000 mRNA species in a typical mammalian cell are differentially expressed between cancer and normal tissues.

Understanding the critical relative changes among all the genes in this set would be impossible without the use of whole-genome analysis.
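To put those percentages in concrete terms, the short Python sketch below works out how many mRNA species the quoted estimate implies. The function name and the choice to pair the low fraction with the low species count (and the high with the high) are assumptions made for illustration only; the figures themselves are the ones attributed to Affymetrix above.

```python
def differentially_expressed_range(min_species, max_species, min_frac, max_frac):
    """Return (low, high) counts of differentially expressed mRNA species.

    Hypothetical helper: pairs the smallest fraction with the smallest
    species count and the largest with the largest to bound the estimate.
    """
    low = round(min_species * min_frac)
    high = round(max_species * max_frac)
    return low, high


# Figures quoted in the article: 0.2% to 10% of 10,000 to 20,000 mRNA species.
low, high = differentially_expressed_range(10_000, 20_000, 0.002, 0.10)
print(f"{low} to {high} mRNA species")
```

Under these assumptions the set could range from a few dozen to a couple of thousand transcripts, which is the scale at which gene-by-gene methods become impractical.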

Affymetrix arrays can also discern between alternatively spliced transcripts, an important capability because a large fraction of human mRNAs undergo alternative splicing. In addition, the balance between sensitivity and specificity can be easily adjusted to meet the particular requirements of a study by using the newly developed analysis software, Microarray Suite 5.0.

Furthermore, the design and manufacture of GeneChip probe arrays are highly stereotyped and consistent, eliminating the need to make arrays in individual labs, thereby significantly minimizing user setup time and providing a higher degree of reproducibility between experiments.

Several different methods can be used to measure gene expression. Those include Affymetrix-type oligonucleotide microarray chips, DNA-based microarrays, quantitative RT-PCR, and serial analysis of gene expression (SAGE).

Contract Research

Althea Technologies (www.altheatech.com) is a contract research laboratory specializing in services that accelerate gene-based drug development. Althea has developed eXpress Profiling, a patented process capable of high throughput, quantitative measurements of gene expression, according to the firm.

The eXpress Profiling method utilizes highly multiplexed PCR (20 to 100 genes per analysis) to focus on the in-depth analysis of specific classes of genes and gene pathways. Althea’s clients might use this kind of data for assessing preclinical and clinical toxicity, evaluating drug efficacy, and monitoring patient response to therapy.

Althea sees its approach as especially important for providing companion biomarkers that can enhance their customers’ understanding of how and why different patients respond positively or negatively to a new therapeutic.

Working with Beckman Coulter (www.beckmancoulter.com), Althea offers services and products in which these focused gene expression assays can be used throughout drug development and then be taken directly into the translational testing space.

Genomatix (www.genomatix.de) is in the process of launching two new software products: ChipInspector (CI) and BiblioSphere PathwayEdition (BSPE). ChipInspector applies genomic knowledge to Affymetrix CEL file data to extract an increased number of significant features while simultaneously reducing false positive rates by about an order of magnitude, explains Genomatix’ CEO and CTO, Thomas Werner, Ph.D.

The BiblioSphere PathwayEdition allows selection of the pathways most relevant to the individual experimental design, linking genes to disease mechanisms through microarray data in ways that differ from other pathway tools, he adds.

“ChipInspector uses a large database of alternative transcripts and promoters to achieve low signal-noise ratios in microarray analysis. More significant genes are detected and ‘dump’ files are utilized. Their product can eliminate statistical and gene calling errors at the single probe level, and increase accuracy during significance analysis of Affymetrix-based microarray data,” says Dr. Werner.

Döhr et al. [Nucleic Acids Res. 33, 864-872 (2005)] show how the Genomatix technology can be used to link genomics, pathways, and diseases using a paradigm of adult-onset diabetes mellitus.

“This collaborative effort shows that multifactorial diseases can be tied to molecular networks in a systematic manner,” continues Dr. Werner. “[It further illustrates that] databases as public tools and resources lacked critical information.”

High Throughput Methods

Progress in the early discovery stages of genomics research largely depends on high throughput technologies, such as mass spectrometry or high density gene chips, which are well suited for sifting through massive amounts of data from the genome of an organism.

A different set of tools, however, is needed by researchers to fine-tune research hypotheses and to determine how to apply medically relevant genomic information in clinical practice.

Nanogen (www.nanogen.com) workstations and microarrays are best known for their diagnostic applications, but Nanogen microarray chips also play a role in pharmaceutical development.

According to Graham Lidgard, Nanogen’s senior vp of R&D, “Nanogen’s system can be used for multiple DNA samples on a single array. The NanoChip is for the researcher who wants to focus on a smaller, concentrated panel.

“For example, using 10 SNPs on one pad on the chip would allow the researcher to test 400 different samples on one chip, providing higher throughput and utility for multiple samples.”

Looking to the future, success in scientific discovery and therapeutic advances will depend on our ability to effectively integrate a growing body of knowledge about regulatory networks and to apply existing molecular evidence into a systems biology approach to life science research.

Realizing the promise of the post-genomic era will increasingly depend on contemporaneously harnessing the power of new high throughput systems as well as effectively managing the data they produce, including giving any relevant scientist access to this (literally and figuratively) living body of data.
