A team of bioinformaticians at the Université de Montréal (UdeM) discovered a structural alphabet that can be used to infer the 3-D structure of RNA from sequence data.
The classical approach to RNA modeling suffers from an important limitation, the UdeM team points out: it takes into account only the canonical Watson-Crick interactions A:U and G:C and excludes noncanonical Hoogsteen and sugar interactions.
In an attempt to remedy this problem, the researchers tried to assemble the structure in silico starting from motifs that combine all possible interactions between a nucleotide and its neighbors.
The researchers first implemented the MC-Fold algorithm, which systematically assigned the different motifs to each segment of the sequence and selected the most probable pair based on its frequency in known structures. Then, the MC-Sym algorithm assembled the set of selected motifs, taking into account the constraints that are found in known structures.
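The frequency-based selection step can be illustrated with a toy sketch. This is not the MC-Fold implementation: the two-nucleotide segments, the motif labels, and the counts in `KNOWN_MOTIF_COUNTS` are all invented for illustration, and the function names are hypothetical. It only shows the general idea of picking, for each segment of a sequence, the motif seen most often in a reference set.

```python
from collections import Counter

# Hypothetical counts of how often each motif label was observed for a given
# segment in a made-up reference set of known structures (illustrative only).
KNOWN_MOTIF_COUNTS = {
    "GC": Counter({"watson_crick": 90, "hoogsteen": 10}),
    "AU": Counter({"watson_crick": 70, "sugar_edge": 30}),
    "GA": Counter({"sugar_edge": 60, "hoogsteen": 40}),
}


def most_probable_motif(segment):
    """Return (motif, relative frequency) for a segment, or None if unseen."""
    counts = KNOWN_MOTIF_COUNTS.get(segment)
    if not counts:
        return None
    motif, count = counts.most_common(1)[0]
    return motif, count / sum(counts.values())


def assign_motifs(sequence, width=2):
    """Slide a window over the sequence and pick the most frequent motif per segment."""
    assignments = []
    for start in range(len(sequence) - width + 1):
        segment = sequence[start:start + width]
        assignments.append((start, segment, most_probable_motif(segment)))
    return assignments


if __name__ == "__main__":
    for start, segment, choice in assign_motifs("GCAU"):
        print(start, segment, choice)
```

A real predictor would also have to check that the chosen motifs are mutually compatible, which is the role the article attributes to the assembly step performed by MC-Sym.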
“We introduced a new first-order object to represent nucleotide relationships, the nucleotide cyclic motif (NCM),” explains François Major, Ph.D., a principal investigator at the Institute for Research in Immunology and Cancer and the department of computer science and operations research at UdeM. “We reasoned that using NCMs could allow us to arrive at better models of the 3-D structure of RNA molecules.
“Compared to the thermodynamic approach, our algorithms make fewer false positives and negatives and predict structures that are closer to the empirical data in the case of sequences for which it is available. The improvement is due to the fact that NCMs incorporate more base-pairing context-dependent information.”
The study was published in the March 6 edition of Nature.