October 15, 2012 (Vol. 32, No. 18)

Richard A. A. Stein M.D., Ph.D.

RNA has emerged as one of the most fascinating molecules in biology. The Human Genome Project revealed that the number of human genes is lower than predicted and comparable to that of Arabidopsis thaliana and Caenorhabditis elegans. As alternative splicing and complex combinatorial transcriptional regulation took center stage as fundamental mechanisms that diversify the proteome, significant resources began to focus on what was previously called “junk DNA”, which encompasses the majority of the human genome and is transcribed into noncoding RNA.

One class of these noncoding RNA molecules, microRNAs, recently emerged as key post-transcriptional regulators in physiological and pathological contexts.

“The work on this topic has been exciting and encouraging, and the field as a whole will be shaped as people understand the relationship between microRNAs and their targets,” says Andrew Z. Fire, Ph.D., professor of pathology and genetics at Stanford University School of Medicine and co-recipient of the 2006 Nobel Prize in Physiology or Medicine.

One of the perceived challenges in studying microRNAs is that each microRNA can apparently regulate up to hundreds of target protein-coding genes, with each target gene potentially regulated by multiple different microRNAs. In this context, the almost 2,000 microRNAs described to date in humans, some of which are present at cellular concentrations that vary by four orders of magnitude, have been viewed as part of an extremely complex and dynamic network.

“The question about the relationship between microRNA molecules and their targets is one that frustrates everyone, as for certain microRNAs it is hard to identify and study their definitive targets, but for others these have been relatively well characterized, in terms of either a broad set of targets that are regulated at modest levels or a few targets that are regulated at substantial levels,” explains Dr. Fire.

Either way, miRNAs are a key family of regulators, and understanding the source of diverse miRNA populations has become a critical part of understanding gene regulation. A recent project in Dr. Fire’s lab tested whether microRNA molecules with no direct genome match could be produced by RNA splicing.

The investigators generated intron-interrupted variants of the Caenorhabditis elegans lin-4 gene, encoding the first microRNA molecule that was discovered and, almost two decades ago, shown to be essential for the temporal control of postembryonic development.

“The intron that we used is a typical intron that a typical mRNA would use,” says Huibin Zhang, a recent Ph.D. graduate in genetics at Stanford University School of Medicine and lead author of the study.

The possibility of processing functional metazoan microRNAs by splicing an intron-interrupted precursor has multiple implications. One of them, the potential requirement for splicing for the in vivo biogenesis of certain microRNAs, would point toward an additional layer in the cellular gene expression regulatory networks.

“This also provides the ability to engineer a microRNA gene and track its expression a little more carefully, as it has to be spliced,” says Dr. Fire. The involvement of splicing in microRNA biogenesis could also help better understand factors that shape splicing in various species.

“The $64,000 question remains the one of target specificity. Once people identify the targets and learn how specific microRNAs interact with them, a lot of things will move forward, and this includes learning why a microRNA could be synthesized with or without introns in terms of the information content involved,” explains Dr. Fire.

Exploring Noncoding Genes

Central features that distinguish the biology of RNA from that of DNA are the decreased stability and increased fragility and structural dynamics that characterize RNA. Proteins that bind RNA to form ribonucleoprotein complexes are indispensable for its stability, structure, and function, and play central roles in its remodeling, an aspect that historically has made RNA biology a more challenging field experimentally.

“Only 10 years ago very little was known about the interactions between proteins and RNA in live cells. Most biochemistry was done out of the cellular context using assembled complexes and reporter constructs, and we could only guess what is going on in live cells,” says Jernej Ule, Ph.D., group leader at the Medical Research Council Laboratory of Molecular Biology.

Dr. Ule and colleagues originally developed a method known as CLIP (ultraviolet cross-linking and immunoprecipitation), which maps the RNA sites that directly contact proteins in vivo, on a genome-wide scale. Recently, Dr. Ule’s research group developed a new approach, iCLIP (individual nucleotide-resolution CLIP). iCLIP is based on the fact that reverse transcription most often stops at the crosslink site, and therefore sequencing of the truncated cDNAs provides information at nucleotide resolution.
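
The truncation principle behind iCLIP can be illustrated with a minimal Python sketch. It assumes that each aligned read is given as 0-based, half-open coordinates and that the crosslinked nucleotide lies one position upstream of the read's 5' end; the function names and the example reads are hypothetical and are not taken from the published iCLIP pipeline.

```python
from collections import Counter

def crosslink_site(read_start, read_end, strand):
    """Infer the crosslinked nucleotide for one aligned read.

    Coordinates are 0-based and half-open: the read covers
    [read_start, read_end) on the reference.
    Assumption: reverse transcription stopped at the crosslink, so the
    crosslinked nucleotide sits one position upstream of the read's 5' end.
    """
    if strand == "+":
        return read_start - 1
    if strand == "-":
        # On the minus strand, "upstream" is the next base to the right.
        return read_end
    raise ValueError("strand must be '+' or '-'")

def count_crosslinks(reads):
    """Count truncation events per (chromosome, position, strand)."""
    counts = Counter()
    for chrom, start, end, strand in reads:
        counts[(chrom, crosslink_site(start, end, strand), strand)] += 1
    return counts

# Hypothetical reads: two cDNAs truncated at the same plus-strand site.
reads = [
    ("chr1", 100, 140, "+"),
    ("chr1", 100, 135, "+"),
    ("chr1", 60, 100, "-"),
]
counts = count_crosslinks(reads)
```

Positions supported by many independent truncation events are the candidate protein-contact sites; a real analysis would also need to collapse PCR duplicates before counting, which this sketch leaves out.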

The investigators used iCLIP to examine the in vivo binding of heterogeneous nuclear ribonucleoprotein (hnRNP) particles to nascent transcripts. At nucleotide-level resolution, iCLIP revealed that hnRNP C binds uridine tracts but shows decreased binding at splice sites, pointing toward its importance in maintaining splicing fidelity.

Understanding the biology of RNA-protein interactions is an area of clinical interest, as many disease-causing mutations interrupt the function of ribonucleoprotein particles and, in this context, a focus on coding as well as noncoding RNA will be crucial.

“It is now clear that a very large proportion of mutations occur in the noncoding regions of the genome, and on the other hand most studies that tried to find disease-causing mutations have focused on the coding parts of the genome,” says Dr. Ule.

The structural and functional characterization of these regions will help define their involvement in disease. “In addition to the noncoding RNA molecules, even mRNAs contain noncoding regions, and all these harbor important regulatory sequences with relevance for disease,” explains Dr. Ule. Focusing on noncoding RNA and its involvement in disease pathogenesis promises to fill an important gap in the field.

Alternative Splicing in Alzheimer’s

According to recent estimates, 35 million individuals globally have Alzheimer’s disease, and over 4 million are newly diagnosed annually. Relatively little is known about the etiopathogenesis of neurodegenerative conditions in general, but over 90% of the patients do not appear to harbor inherited disease-causing mutations.

At the same time, an increasing body of data points to the involvement of alternative splicing, which is implicated in many diseases. Estimates that at least 30% of disease-causing mutations affect splicing, together with the recent finding that over 95% of the multiexon human genes undergo alternative splicing, point toward the clinical importance of this process.

“The question that we wanted to ask is whether alternative splicing could be involved, and this was scarcely asked before, simply because the technology was not available,” says Hermona Soreq, Ph.D., professor of molecular neurobiology at the Hebrew University of Jerusalem.

To gain insights into molecular changes that could shape the etiopathogenesis of Alzheimer’s disease, Dr. Soreq and colleagues used a technology that specifically interrogates alternative splicing, and comparatively examined transcripts from the cortical region of the brain in individuals with this condition and in matched control adults. In addition to finding genes whose expression was upregulated or downregulated, the investigators made an additional discovery.

“We found that the RNAs for approximately 400 genes were neither up- nor downregulated, but that their composition was changed by alternative splicing,” reveals Dr. Soreq. Approaches that only measure gene expression levels would not have captured this set of genes. In most instances, the changes did not occur randomly but reflected the inclusion of a gene fragment that was not expressed in the normal brain, as a result of modifications in a family of heterogeneous nuclear ribonucleoproteins that function as splicing regulators.

By using biochemical approaches in brain sections, Dr. Soreq and colleagues revealed that these proteins were missing. This was accompanied by corresponding changes in the recently discovered family of regulator microRNAs, and the changes were very specific for Alzheimer’s disease, as they were not detected in patients with other conditions such as Parkinson’s disease or epilepsy.

Mimicking this modification in cultured neurons led to a loss of synapses, and the same changes introduced into the brains of live mice led to impairments in their learning capacities. “Altogether, these findings are pointing toward a new candidate for therapeutics that nobody knew about before,” emphasizes Dr. Soreq.


Researchers at the Hebrew University of Jerusalem found that RNAs for about 400 genes were changed by alternative splicing in the brains of Alzheimer’s patients. Further investigation revealed that certain splicing regulator proteins were missing in these individuals. The team believes the findings point toward a new therapeutic candidate for the disease. [Peter Maszlen/Fotolia]

The Mystery of Alternative Transcripts

“Several years ago, we made the interesting finding that a large fraction of alternative transcripts in human cells are not translated to proteins but, instead, have early stop codons that should target them for degradation by nonsense-mediated mRNA decay,” says Steven E. Brenner, Ph.D., professor of plant and microbial biology at the University of California, Berkeley.

Nonsense-mediated mRNA decay, an mRNA surveillance pathway, is a quality-control mechanism described in all eukaryotes studied to date that was once thought to predominantly function to selectively degrade mRNA molecules harboring premature stop codons, to prevent the expression of truncated proteins. However, as Dr. Brenner and colleagues revealed, a large number of natural isoforms are also targets for this pathway. “We are trying to understand why it is that our cells are making thousands of these alternative transcripts that only get degraded,” explains Dr. Brenner.

While examining the human genes that encode SR splicing regulators, a family of structurally and phylogenetically related positive regulators of constitutive and alternative splicing, Dr. Brenner and colleagues found that all the family members were alternatively spliced. In several of these, the alternatively spliced forms harbored highly conserved elements within “poison cassette exons”, which contain premature in-frame stop codons. Genes encoding other SR family members were undergoing alternative splicing in their 3’ untranslated regions. These changes were predicted to target the respective mRNA molecules for degradation via the nonsense-mediated mRNA decay pathway. These unproductive splicing events were also described in the mouse orthologues of the respective human genes, and a distinguishing feature of the alternatively spliced elements is that they are among the most highly conserved regions between the human and the mouse genomes. “It is still a mystery what accounts for such an exceptionally high level of conservation, beyond the fact that we know that it is spatially associated with the splicing that causes the transcript to be degraded,” says Dr. Brenner.
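
How a poison cassette exon can mark a transcript for decay can be sketched in a few lines of Python. The sketch relies on a commonly used heuristic that the article does not state, namely that a stop codon lying more than about 50 nucleotides upstream of the last exon-exon junction is predicted to trigger nonsense-mediated decay. The threshold, the function names, and the toy sequences are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_position(transcript, cds_start=0):
    """Return the index of the first in-frame stop codon, or None."""
    for i in range(cds_start, len(transcript) - 2, 3):
        if transcript[i:i + 3] in STOP_CODONS:
            return i
    return None

def predicted_nmd_target(transcript, exon_lengths, cds_start=0, min_distance=50):
    """Apply the (assumed) 50-nucleotide heuristic to a spliced transcript.

    exon_lengths lists the exon sizes in transcript order, so the last
    exon-exon junction sits at the sum of all but the final exon.
    """
    if len(exon_lengths) < 2:
        return False  # no junction, so the heuristic does not apply
    stop = first_stop_position(transcript, cds_start)
    if stop is None:
        return False
    last_junction = sum(exon_lengths[:-1])
    return last_junction - (stop + 3) > min_distance

# Toy exons (hypothetical): the cassette exon begins with an in-frame stop.
exon1 = "ATG" + "GCT" * 20           # 63 nt
poison = "TAA" + "GCT" * 29          # 90 nt, carries a premature stop
exon3 = "GCT" * 30 + "TGA"           # 93 nt, carries the normal stop

skipped = predicted_nmd_target(exon1 + exon3, [63, 93])
included = predicted_nmd_target(exon1 + poison + exon3, [63, 90, 93])
```

In this toy case the isoform that skips the cassette exon keeps its stop codon in the last exon and is not flagged, while the isoform that includes it is flagged as a predicted decay target.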

Predicting Good RNAi Sites

RNA interference, the inhibition of gene expression by small double-stranded RNA, has recently emerged as a powerful strategy to dissect gene function and develop therapeutic interventions. At the technical level, one of the challenges has been to identify the specific RNA sites that are targetable.

“There are many prediction algorithms that were developed to do this, but none of them work very well,” says Kevin M. Weeks, Ph.D., professor of chemistry at the University of North Carolina at Chapel Hill.

To map RNA structures and predict sites that are most available for interactions, including sites that can be successfully targeted for therapeutic interventions, investigators in Dr. Weeks’ lab several years ago pioneered a technology known as SHAPE (selective 2´-hydroxyl acylation analyzed by primer extension).

“We focused on the naive idea that one of the main challenges in predicting good RNA interference sites is identifying the parts of the HIV genome that are free to interact with the RNAi machinery. The simple idea was that conformationally flexible regions might provide particularly good targets,” explains Dr. Weeks.

These sites are mostly those that do not have internal structures, and do not interact with themselves, whereas places where RNA folds back on itself are more hidden and less likely to represent optimal targets. SHAPE takes advantage of the fact that the nucleophilic reactivity of the ribose 2´-hydroxyl group is very sensitive to the nucleotide conformation and is modulated by its flexibility in the RNA backbone, and provides a powerful tool that allows RNA structure and dynamics to be examined under a broad range of biological environments.
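
The idea that flexible, unstructured stretches make better targets can be illustrated with a sliding-window scan over a per-nucleotide reactivity profile. This is only a sketch under stated assumptions: the window length, the threshold, and the toy profile are hypothetical, and the code is not the method used by Dr. Weeks’ group.

```python
def flexible_windows(reactivities, window=19, threshold=0.5):
    """Return (start, mean) for windows with high average reactivity.

    reactivities holds one normalized SHAPE-like value per nucleotide;
    higher values are taken to mean a more flexible nucleotide.
    window and threshold are illustrative defaults, not published values.
    """
    hits = []
    for start in range(len(reactivities) - window + 1):
        segment = reactivities[start:start + window]
        mean = sum(segment) / window
        if mean >= threshold:
            hits.append((start, mean))
    return hits

# Toy profile: a short flexible stretch flanked by structured sequence.
profile = [0.1] * 10 + [0.9] * 5 + [0.1] * 10
candidates = flexible_windows(profile, window=5, threshold=0.8)
starts = [start for start, _ in candidates]
```

Here only the window that sits entirely inside the flexible stretch passes the threshold. Windows that overlap the structured flanks are rejected, which mirrors the intuition that regions where the RNA folds back on itself are less accessible.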

“The key part of this work is that not only were we able to make good predictions, perhaps the best to date, but our calculations suggested that only approximately 2% of the HIV genome is a good target for the RNAi machinery, and we were able to identify a significant part of these targets,” says Dr. Weeks.

This approach, which can be used not only for additional pathogens, but also in other instances that require specific interactions with a target RNA, represents a promising strategy at the interface between RNA biology and therapeutics.

New Tool for Cancer Research

“For years, we identified splicing changes inside cancer cells, but it was debated whether they have causative roles or whether they are the consequence of disease-related deregulation,” says Michael C. Ryan, Ph.D., president and bioinformatics specialist at In Silico Solutions.

The general interest in alternative splicing emerged from the concept that, instead of each gene mapping to a single protein, one gene can give rise to multiple different protein products in a spatially and temporally regulated manner, a finding that required the “one gene, one polypeptide” concept to be revisited. Alternative splicing is one strategy to expand proteome diversity, and it has additional roles, such as protein quality control.

At the interface between alternative splicing and cancer research, an area of increasing interest has recently focused on a set of biological programs that is active and establishes specific splicing patterns during distinct stages of growth and development, but is turned off later on, in adult cells. “Some cancer cells appear to be able to find ways to turn these programs back on, and reactivate embryonic versions of the genes,” says Dr. Ryan.

A few years ago, alternative splicing was mostly studied by using microarrays. “While this provided interesting insights, next-generation sequencing currently offers a resolution that is an order of magnitude better,” explains Dr. Ryan.

To provide investigators with a better platform to examine and interpret alternative splicing patterns from RNA-Seq reads, Dr. Ryan and colleagues developed SpliceSeq, a free resource that is powered to capture changes during alternative splicing and explore their functional consequences.

“With SpliceSeq, one can map the reads to each individual exon or splice, so instead of examining the read for each gene, we can individually look at each element of a gene,” says Dr. Ryan. The platform opens the additional possibility to identify splicing patterns across multiple samples, perform comparative analyses, or group samples based on specific criteria, increasing the power of the analysis.
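
Counting reads per splice junction rather than per gene makes it possible to quantify how often an exon is included. One common summary is "percent spliced in" (PSI), sketched below in Python. The formula and the read counts are illustrative assumptions and do not describe SpliceSeq’s own calculations.

```python
def percent_spliced_in(inc_upstream, inc_downstream, exclusion):
    """Estimate the inclusion fraction of a cassette exon.

    inc_upstream / inc_downstream: reads spanning the two junctions that
    join the cassette exon to its neighbors.
    exclusion: reads spanning the junction that skips the exon.
    Inclusion is supported by two junctions, so their counts are averaged.
    Returns None when no junction reads are available.
    """
    inclusion = (inc_upstream + inc_downstream) / 2
    total = inclusion + exclusion
    if total == 0:
        return None
    return inclusion / total

# Hypothetical junction counts for the same exon in two tissues.
psi_brain = percent_spliced_in(40, 40, 10)
psi_heart = percent_spliced_in(10, 10, 40)
delta_psi = psi_brain - psi_heart
```

A large difference in PSI between samples flags a splicing change even when the total read count for the gene is similar, which is the kind of event a gene-level expression measurement would miss.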

“Moreover, we can translate the reads into protein sequence prediction and subsequently identify the portion of the protein that is being impacted by alternative splicing,” explains Dr. Ryan.

Dr. Ryan and colleagues are currently comparing alternative splicing patterns across The Cancer Genome Atlas (TCGA) data, one of the biggest RNA-Seq repositories. “For the first time, SpliceSeq provides the opportunity to visualize changes at an unprecedented resolution,” explains Dr. Ryan.

Discoveries linked to the biology of RNA have shaped seminal biomedical advances. One of the most intriguing aspects of RNA is its ability to fulfill diverse cellular functions, which include informational, structural, catalytic, and regulatory roles.

Many advances in this field, in addition to helping overturn old concepts and opening new perspectives, point toward a much more complex picture, one that unveils the dynamic nature of the scientific inquiry, which Carl Sagan so vividly captured in words: “There is much that science doesn’t understand, many mysteries still to be resolved. We are constantly stumbling on surprises.”


In Silico Solutions’ SpliceSeq captures changes during alternative splicing and explores their functional consequences. A SpliceSeq display for a splicing event in the SLC25A3 gene detected in a comparison of RNA-Seq data from brain and heart tissue is shown.

Molecular Basis for Schizophrenia

Even though the genetic basis of schizophrenia has been extensively studied, a detailed understanding of the condition has remained elusive, which has slowed the translation of research findings into clinical applications. However, new gene expression profiling platforms promise to provide better insights into the pathophysiological mechanisms of disease.

“We focused on the transcriptome from the superior temporal gyrus, and compared it among patients and healthy controls,” says Murray J. Cairns, Ph.D., senior lecturer at the School of Biomedical Sciences and Pharmacy, The University of Newcastle, and senior research fellow at the Schizophrenia Research Institute in Sydney. The cortical gray matter of the superior temporal gyrus is a particularly relevant area to focus on, because it contains the primary and secondary auditory cortex, and this region has been linked to auditory hallucinations.

Dr. Cairns and colleagues analyzed RNA-Seq data originating from 76 base-pair reads that achieved a 21-fold median transcript coverage. The team found that over 2,000 genes exhibited alternative promoter usage, and over 1,000 genes showed differences in alternative splicing. One of these, PLP1, encodes a transmembrane proteolipid protein that is one of the major components of myelin, which has long been implicated in this condition.

Overall, the analysis pointed toward modifications in three gene clusters, encoding proteins involved in neurotransmission, synaptic vesicle trafficking, and development. The sequencing strategy that Dr. Cairns and colleagues used confirmed many of the findings that were previously obtained with hybridization approaches but, in addition, provided higher sensitivity and specificity. “A larger study will additionally help us find the molecular mechanisms that are behind the changes in splicing,” explains Dr. Cairns.
