By combining two different computational strategies, scientists have found that genes once thought bereft of coding potential actually give rise to small proteins. Already, hundreds of small proteins have been found to arise from long noncoding RNAs. Moreover, it appears that the proteins are functionally active.
The computational strategies involve bioinformatics tools called the ORFscore and micPDP. The ORFscore was developed in the laboratory of Antonio Giraldez, Ph.D., a professor at Yale University School of Medicine. This tool leverages the periodicity of ribosome movement on mRNA to identify actively translated open reading frames (ORFs) in ribosome profiling data. micPDP was developed by Nikolaus Rajewsky, Ph.D., and his colleagues at the Max-Delbrück Center and the Berlin Institute for Medical Systems Biology. This tool is a computational pipeline that identifies evolutionarily conserved micropeptides under negative selection across species.
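To make the periodicity idea concrete, the following is a minimal Python sketch and not the Giraldez laboratory's implementation. It assumes each ribosome footprint has already been reduced to a single transcript position, and it scores how strongly those positions favor the first reading frame of a candidate ORF over the other two. The function names, the input format, and the exact scoring formula are illustrative assumptions.

```python
import math


def frame_counts(footprint_positions, orf_start, orf_end):
    """Count footprints in each of the three frames of a candidate ORF.

    Assumption: every footprint is represented by one 0-based transcript
    position, and orf_end is exclusive.
    """
    counts = [0, 0, 0]
    for pos in footprint_positions:
        if orf_start <= pos < orf_end:
            counts[(pos - orf_start) % 3] += 1
    return counts


def periodicity_score(counts):
    """Score the deviation of frame counts from a uniform distribution.

    A chi-squared-style statistic is log-transformed, then given a negative
    sign when frame 0 is not the dominant frame. This is an illustrative
    formula, not a confirmed copy of the published one.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    expected = total / 3
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    score = math.log2(chi2 + 1)
    if counts[0] < counts[1] or counts[0] < counts[2]:
        score = -score
    return score


# Hypothetical footprints: most fall in frame 0 of an ORF spanning 0..18.
positions = [0, 3, 6, 9, 12, 15, 1, 5]
counts = frame_counts(positions, orf_start=0, orf_end=18)
print(counts, periodicity_score(counts))
```

In this sketch, a translated ORF concentrates footprints in one frame and scores positive, whereas evenly spread footprints score zero and footprints concentrated in another frame score negative.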
“micPDP revealed that the RNAs identified by ribosome profiling correspond to peptides that have been conserved over the course of evolution,” said Dr. Giraldez. “This strongly suggests that these genes encode proteins that have specific functions in these animals.”
The details of this work were described April 4 in The EMBO Journal, in an article entitled “Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation.” In this paper, the authors assert that “the combination of ORFscore and micPDP enable high-confidence prediction of many, small translated ORFs that were functionally not appreciated or previously annotated as large intergenic noncoding RNAs (lincRNAs).”
As noted by the authors, short peptides have emerged as important regulators of development and physiology, but their identification has been limited by their size. Still, examples of functionally active small peptides produced by long noncoding RNAs have started to emerge in the scientific literature. The current study goes further, significantly expanding the set of micropeptide-encoding vertebrate genes. “We have identified hundreds of open reading frames in the long noncoding RNAs of humans and zebrafish that may give rise to functional proteins.”
Until recently, long noncoding RNAs were thought to be restricted to mundane but nonetheless important structural roles that support the function of the cell. “We think the main reason that these small functional peptides have been missed in earlier studies is due to the assumptions that have to be made when assigning functions to large numbers of genes,” said Dr. Rajewsky. “Short open reading frames are so numerous that by design standard genome annotation methods have to filter out short open reading frames.”
“The peptide predictions reported in these studies are tantalizing, but this is just the first step. Things should get really interesting as the community explores the functions of the predicted peptides in vivo,” said Stephen M. Cohen, Ph.D., a professor at the Institute of Molecular and Cell Biology in Singapore who is not an author of the paper. “I imagine that we'll be hearing a lot about this new peptide world in the years to come.”