RNA analysis involves more than determining nucleotide sequences; it also includes detecting post-transcriptional modifications (PTxMs). More than 170 PTxMs are known, and they appear to affect a range of biological processes, including translation and decoding, gene expression control, bacterial antibiotic resistance, immunomodulation, development, and human disease. Unfortunately, PTxMs have not been easy to identify or quantify.
To facilitate the study of PTxMs, scientists at Scripps Research have developed an open-source software tool called Pytheas. It automates the analysis of RNA data from tandem mass spectrometry experiments. According to the Scripps team, Pytheas is a reliable tool for analysis of the most abundant PTxMs present in any cell.
Details about Pytheas appeared in Nature Communications, in an article titled “Pytheas: a software package for the automated analysis of RNA sequences and modifications via tandem mass spectrometry.”
“The main features of Pytheas are flexible handling of isotope labeling and RNA modifications, with false discovery rate statistical validation based on sequence decoys,” the article’s authors wrote. “We demonstrate bottom-up mass spectrometry characterization of diverse RNA sequences, with broad applications in the biology of stable RNAs, and quality control of RNA therapeutics and mRNA vaccines.”
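The decoy-based false discovery rate mentioned in the quote can be illustrated with a minimal sketch of the general target-decoy idea: score matches against both real (target) and decoy sequences, then estimate the error rate above a score cutoff as the ratio of decoy hits to target hits. The `Match` structure, the function names, and the scores below are hypothetical and chosen only for illustration; this is not Pytheas's actual code or scoring scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One spectrum-to-sequence match (hypothetical structure, not Pytheas's)."""
    sequence: str
    score: float
    is_decoy: bool


def fdr_at_threshold(matches, threshold):
    """Estimate the FDR among matches scoring at or above `threshold`.

    Uses the simple decoys/targets estimate, capped at 1.0.
    """
    accepted = [m for m in matches if m.score >= threshold]
    n_decoy = sum(m.is_decoy for m in accepted)
    n_target = len(accepted) - n_decoy
    if n_target == 0:
        # No target hits: treat any decoy hit as a fully unreliable set.
        return 1.0 if n_decoy else 0.0
    return min(1.0, n_decoy / n_target)


def lowest_threshold_for_fdr(matches, max_fdr):
    """Return the lowest observed score whose estimated FDR is <= max_fdr.

    Returns None if no observed score meets the requested FDR.
    """
    for threshold in sorted({m.score for m in matches}):
        if fdr_at_threshold(matches, threshold) <= max_fdr:
            return threshold
    return None


# Made-up example data: three target matches and two decoy matches.
example_matches = [
    Match("AUGp", 0.9, False),
    Match("GUAp", 0.8, False),
    Match("UAGp", 0.7, True),
    Match("CCGp", 0.6, False),
    Match("GCCp", 0.5, True),
]
```

With data like this, calling `lowest_threshold_for_fdr(example_matches, 0.01)` would pick the score cutoff above which no decoy matches remain. Real tools typically refine this estimate, but the ratio of decoy to target hits is the core of the approach.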
In the article, the researchers showed that Pytheas can swiftly identify and quantify modified RNA molecules like those in the current Pfizer and Moderna COVID-19 mRNA vaccines.
“The analysis of RNA data from mass spectrometry has been a relatively laborious process, lacking the tools found in other areas of biological research, and so our aim with Pytheas is to bring the field into the 21st century,” said study senior author James Williamson, PhD, professor in the department of integrative structural and computational biology, and vice president of research and academic affairs at Scripps Research.
Pytheas should be useful with both natural and synthetic RNAs. Natural RNAs often have modifications that affect their functions, while the synthetic RNAs used for vaccines and RNA-based drugs are almost always modified artificially to optimize their activity and reduce side effects.
Until now, methods for processing raw mass spectrometry data on modified RNAs have been relatively slow and manual, and therefore labor-intensive, in contrast to the corresponding methods in protein analysis.
Williamson and his team developed Pytheas, which is based on the Python programming language, to greatly improve the automation of this processing. The software takes mass spectrometry data from an RNA sample as input and outputs the predicted RNA sequences and chemical modifications, in a form that also makes it easy to quantify distinct RNAs in the sample.
The team demonstrated Pytheas’ speed, accuracy, and versatility using mass spectrometry data for important bacterial and yeast RNAs, and for SARS-CoV-2 spike protein messenger RNAs like those used in the Pfizer and Moderna COVID-19 vaccines.
“We’re hoping that companies involved in manufacturing RNA vaccines and other RNA therapeutics will find Pytheas useful, for example in monitoring the quality of their products,” Williamson said.
The researchers are now using Pytheas in their studies of natural RNAs, and they are continuing to optimize the software.