Expanding on a concept originally coined by Walter Gilbert in 1986, Thomas Cech recently described two RNA worlds: a hypothetical, primordial world in which the same molecule combined informational and catalytic properties, and a contemporary world forged by a spectrum of RNA-centered activities. In the early days following the discovery of DNA, RNA was thought to be the less interesting molecule, but this view has changed dramatically in recent years.
Advances in RNA biology are reshaping long-prevailing concepts. For example, in developmental biology, decisions were traditionally envisioned mostly in terms of the progressive expression of different combinations of transcription factors, induced by specific combinations and concentrations of growth and differentiation factors.
“This is still the prevailing concept in the field, but there is much more to development than transcription factors,” says Seth Blackshaw, Ph.D., associate professor in neuroscience at the Johns Hopkins University School of Medicine.
Investigators in Dr. Blackshaw’s lab focus primarily on understanding the mechanisms that allow the different neuronal cell types of the adult brain to be specified during development. In a completely unbiased screen that sought to identify genes expressed at different developmental stages in the retina, Dr. Blackshaw and colleagues found several long noncoding RNAs with prominent, dynamic, and cell-specific expression patterns. Overexpressing or knocking down these RNAs had a dramatic effect on retinal development.
“We identified and studied only three long noncoding RNAs, but this is part of a bigger picture that, in terms of our notions about the genome, is in many ways comparable to the discovery of the New World,” says Dr. Blackshaw.
“There are roughly 20,000 protein-encoding genes in the vertebrate genome, and anywhere between 5,000 and 10,000 long noncoding RNAs in mice and humans, many of which are expressed in the brain, and many of which are expressed in development and, therefore, we really need to expand our notion of the real number of genes that are important molecular players.”
Understanding the biology of long noncoding RNAs is filled with challenges. One is that even when a phenotype is observed, the mechanism of action is often elusive, which slows progress. Mechanisms are fairly straightforward for microRNAs but harder to pin down for long noncoding RNAs, and it is not clear whether these work by base pairing or through completely different means.
“RNA can perform many of the same functions that proteins do. It can fold into many different shapes, it can interact with different structures, and it can even be catalytic. Almost anything that a protein can do, a long noncoding RNA can do as well, and this makes identification of the mechanisms of action of this very large set of molecules somewhat less than straightforward,” explains Dr. Blackshaw.
A recent finding from Dr. Blackshaw’s lab is the characterization of Six3OS, a long noncoding RNA that is divergently transcribed from the distal promoter of Six3, a homeodomain transcription factor important during neurodevelopment.
Dr. Blackshaw and colleagues found that Six3OS knockdown caused a dramatic phenotype that overlapped with the one caused by manipulating the associated protein-encoding gene; similar defects in the development of different retinal cell types were observed in both instances.
“We found that we could reverse the phenotype caused by Six3 overexpression or knockdown by manipulating the expression of Six3OS, and this provided conclusive evidence that these two genes interact functionally, and also suggested that the long noncoding RNA acts in a transcript-independent manner,” explains Dr. Blackshaw.
While this revealed that Six3OS regulates Six3, the mechanisms were still elusive. To address this, Dr. Blackshaw and colleagues turned to a tool that they developed in collaboration with Heng Zhu’s lab from the department of pharmacology and the Center for High-Throughput Biology at the Johns Hopkins University School of Medicine. The tool used Cy5-labeled Six3OS RNA to probe a human protein microarray representing approximately 70% of all annotated genes, expressed as a yeast library of N-terminal GFP fusion proteins. It allowed the quick identification of the proteins that selectively interact with the fluorescently labeled RNA.
“We reasoned that any functionally important target of Six3OS would selectively interact with the human and the mouse homolog of the gene but not with other long noncoding RNAs,” says Dr. Blackshaw.
This completely unbiased screen identified five proteins that selectively interacted with both the human and mouse versions of Six3OS. These included Eya1, a known transcriptional cofactor of Six3; enhancer of zeste homolog 2, a polycomb complex component that interacts with many long noncoding RNAs; and SMARCE1, which is involved in chromatin decondensation and transcriptional activation.
These interactions were all confirmed by immunoprecipitation in cultured human cells. This suggested that Six3OS acts not by regulating Six3 expression but, rather, by cooperating as a specific Six3 transcriptional cofactor.
“The two are co-expressed, and the long noncoding RNA acts as a transcriptional scaffold that helps recruit transcriptional co-activators and co-repressors that allow Six3 to activate or repress its target genes, and this may be a common feature of these long noncoding RNAs, to modulate the activity of protein encoding genes that they are co-expressed with,” explains Dr. Blackshaw.
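The selection criterion described for the protein-microarray screen (keep proteins bound by both the human and the mouse Six3OS probes, and drop anything also bound by other long noncoding RNAs) amounts to a set intersection followed by a set difference. The following is a minimal sketch of that logic only, not the lab's actual analysis pipeline. The function name `selective_binders`, the probe labels, and the hit lists are all hypothetical; in particular, the `PROT_*` entries are made-up placeholders, and the lists are illustrative rather than real microarray data.

```python
def selective_binders(hits_by_probe, target_probes, control_probes):
    """Return proteins bound by every target probe and by no control probe.

    hits_by_probe: dict mapping a probe label to the set of proteins it bound.
    All names and data used with this function here are hypothetical.
    """
    if not target_probes:
        raise ValueError("at least one target probe is required")

    # Proteins bound by ALL target probes (e.g. human and mouse homologs).
    shared = set.intersection(*(set(hits_by_probe[p]) for p in target_probes))

    # Proteins bound by ANY control probe (e.g. unrelated lncRNAs).
    background = set().union(*(set(hits_by_probe[p]) for p in control_probes))

    # Selective binders: shared across targets, absent from the background.
    return shared - background


# Illustrative, made-up hit lists (NOT real data). PROT_* are placeholders.
hits = {
    "human_Six3OS": {"Eya1", "EZH2", "SMARCE1", "PROT_A", "PROT_X"},
    "mouse_Six3OS": {"Eya1", "EZH2", "SMARCE1", "PROT_A", "PROT_Y"},
    "control_lncRNA": {"PROT_A", "PROT_Z"},
}

result = selective_binders(
    hits,
    target_probes=["human_Six3OS", "mouse_Six3OS"],
    control_probes=["control_lncRNA"],
)
print(sorted(result))
```

In this toy example, `PROT_X` and `PROT_Y` drop out because each binds only one homolog, and `PROT_A` drops out because it also binds the control RNA. A real analysis would also need signal thresholds and replicate handling, which this sketch leaves out.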
“We first used hybridization-based approaches to discover a preponderance of large noncoding RNAs, then massively parallel sequencing approaches revealed their fine grain structures, and now a combined hybridization-sequencing approach is being used to unravel even higher resolution transcriptional dynamics,” says John L. Rinn, assistant professor of stem cell and regenerative biology at Harvard Medical School and senior associate member at the Broad Institute.
The past few decades witnessed alternating periods of hybridization-based and sequencing-based strategies, which culminated in their combined use. The first genetic loci were roughly mapped through crude hybridization-based methods. Once reliable nucleotide sequencing methods were developed 20 years later, however, a dramatic shift occurred toward sequencing-based methods. That held until the sequencing of the human genome and the ensuing effort to determine how many genes it contains.
The solution was the emergence of DNA microarrays, which allowed tens of thousands of sequences to be monitored simultaneously. More recently, the massively parallel sequencing revolution brought sequencing back to center stage for most applications. Over the past few years, however, an approach that involves targeted hybridization followed by sequencing has been developed. It allows much deeper investigation of specific genomic fragments, such as disease susceptibility regions.
“When we were in the array era, we could find and monitor more genes by array hybridization, and now in the sequencing era, many people have shifted focus to sequencing-based gene discovery. History has witnessed these dogma shifts between hybridization and sequencing, but it is really the integration of the two approaches where one can see even more synergistic power,” explains Dr. Rinn.
An important recent advance in the field was the development of RNA CaptureSeq through a collaborative effort between the labs of Gregory Hannon, Ph.D., and Richard McCombie, Ph.D., at Cold Spring Harbor Laboratory, together with investigators at Roche NimbleGen.
RNA CaptureSeq involves the use of tiling arrays across genomic regions of interest, followed by hybridization, elution, and sequencing of the corresponding cDNAs. This strategy has continued to evolve to combine the strengths of in-solution capture and exome sequencing.
Recently, Dr. Rinn and colleagues applied RNA CaptureSeq to dig into the depths of the noncoding transcriptome. The authors identified previously unknown splicing patterns and isoforms, both in extensively studied genes such as p53 and in several long noncoding RNAs. Collectively, their application of hybrid-capture technology yielded new insights while underscoring that the full complexity of the human transcriptome is still far from understood.
“This approach allows us to focus on long noncoding RNAs and, after capturing them, to visualize splice variations and gene-expression regulation differences in wild-type versus tumor samples,” explains Dr. Rinn.