Perturbations and Pathways
“I was fascinated by processes that have evolved to create such amazingly precise spatiotemporal gene expression patterns, and I was even more astounded to realize that information underlying these patterns is encoded in the sequence,” explained Saeed Tavazoie, Ph.D., professor of biochemistry and molecular biophysics. A major effort in Dr. Tavazoie’s lab has focused on predicting gene expression dynamics from information encoded in the sequence.
Decades ago, Dr. Tavazoie’s approach would have been difficult to implement. However, the advent of microarray technology, which allows the expression of thousands of genes to be surveyed simultaneously under a broad range of conditions, has been instrumental in making this approach a reality.
Previously, Dr. Tavazoie and colleagues developed a computational pipeline that profiles gene expression perturbations across a large number of different cellular states and then clusters the genes into co-expression modules. “We then looked at the regulatory regions for the occurrence of de novo motifs that are enriched in these genes with regard to the background of the genome,” said Dr. Tavazoie.
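To make the idea of enrichment "with regard to the background of the genome" concrete, the sketch below scores one motif in one co-expression module with a hypergeometric tail probability. This is a common way to test enrichment and is offered only as an illustration; the article does not say which statistic the pipeline used, and the function name and the gene counts in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, genome_hits, cluster_size, cluster_hits):
    """Probability of seeing at least `cluster_hits` motif-containing genes
    when `cluster_size` genes are drawn without replacement from a genome of
    `genome_size` genes, `genome_hits` of which contain the motif.

    Illustrative assumption: a hypergeometric test, not necessarily the
    statistic used in the work described in the article.
    """
    total = comb(genome_size, cluster_size)
    upper = min(genome_hits, cluster_size)
    tail = sum(
        comb(genome_hits, k) * comb(genome_size - genome_hits, cluster_size - k)
        for k in range(cluster_hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 6,000 genes, 300 carry the motif upstream,
# and 15 of the 50 genes in one co-expression module carry it.
p_value = hypergeom_enrichment_p(6000, 300, 50, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value means the motif appears in the module far more often than its genome-wide frequency would predict, which is what marks it as a candidate regulatory site.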
These motifs were predicted to function as transcriptional regulator binding sites, and the strategy provided a powerful approach to perform reverse engineering in simple organisms. “However, going from yeast to humans has been more challenging, due to the scale and the complexity of the human genome,” reported Dr. Tavazoie.
To address this challenge, Dr. Tavazoie’s lab developed new algorithms based on information theory that enable sensitive and specific detection of transcriptional and post-transcriptional regulatory elements within the human genome. In particular, TEISER (Tool for Eliciting Informative Structural Elements in RNA) is a new framework that enables the discovery of structural RNA elements by using context-free grammars and mutual information. “The application of TEISER to mammalian datasets is revealing a rich picture of mRNA stability regulation by these elements and the RNA-binding proteins that bind them,” said Dr. Tavazoie.
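As a rough illustration of the mutual-information idea, the sketch below measures how much knowing whether a transcript carries a candidate element tells us about its expression group. It is not TEISER's implementation: the context-free-grammar search for structural elements is not shown, and the function name, the binary presence vector, and the group labels are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length label sequences.

    Here `xs` might mark element presence (1) or absence (0) per transcript
    and `ys` the expression group each transcript falls into.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: element presence perfectly tracks the stability group.
presence = [1, 1, 0, 0]
group = ["unstable", "unstable", "stable", "stable"]
print(mutual_information(presence, group))
```

A value near zero means the element says nothing about the expression group, while larger values flag elements worth following up.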
In a recent genome-wide, systems-level analysis of 46 different cancers, Dr. Tavazoie and colleagues used cancer gene expression datasets to identify known pathways and processes that are perturbed. “We have identified some of the most commonly recurring elements in several cancers, and this allowed us to dig deeper into individual functional categories and pathways, such as those involved in regulating apoptosis or the mitotic cell cycle,” Dr. Tavazoie added.
Quantitative gene expression measurements and pathway analyses of these data also made it possible to explore causal links between perturbations and the pathways they alter. The approach revealed that, rather than sharing a “universal” tumor signature, perturbed pathways are very diverse, and the cellular modifications underlying the malignant state are broadly heterogeneous.
Additionally, the systematic analysis of cis-regulatory elements showed that only approximately 25% of the newly discovered motifs corresponded to known binding sites, illustrating both the complexity of the malignant state and how little is currently known about the cellular perturbations that characterize it.