“We want to conduct our investigations using as broad an approach as possible and dig deep into any direction that we could go. There is so much variation across different tissues, as well as modifications in peptides, that this gap between biological PTMs and spectral libraries is widening with massive efforts in both realms,” he explains when describing the many advantages of searching mass spec data against spectral libraries (collections of reference spectra) instead of using the more traditional database search approach.
Dr. Bandeira is currently developing an even larger compendium of proteomics mass spec data that will house PTM data from different tissues. “This MassIVE (Mass spectrometry Interactive Virtual Environment) community-wide repository we are developing will serve as a resource to converge all existing information on PTMs and mass spec data, including data from the latest types of mass spec instruments. Current databases provide limited information in that they generally do not offer the raw data associated with a mass spec report, or vice versa.”
Dr. Bandeira has also expressed his hope that researchers in the field of proteomics will engage in data sharing. “Our vision in setting up this MassIVE repository at UCSD is that researchers will be able to easily access the site, upload their data, and even search what is available there. Even in its current version, our ProteoSAFe system has already enabled the analysis of over 1 billion spectra from over 3,000 users. We are aware of how important spectral data is with regard to PTMs, and thus putting all this information together in one site, with users having free access to the information, will definitely enhance the analysis and understanding of PTMs,” he says.
For Marshall Bern, Ph.D., vp of Protein Metrics, screening for PTMs can be very challenging because it relies heavily on prior knowledge and existing databases. “The current list of reported PTMs is now larger compared to previous years, and this is due to the rapid improvement in analytical tools in proteomics. Although this may be a major advancement in the field, this also increases the need for a reliable and rapid algorithm that understands what peptides one is trying to identify.
“With the knowledge that each amino acid occurs at multiple sites along a polypeptide chain, multiplied by the modification that marks each amino acid and changes its mass state, then you end up with a larger number of protein possibilities to search in a database. Since PTMs are generally characterized by mass states, if you have a longer peptide, then you will most probably have more mass states per amino acid, thus making it harder to search through a comprehensive protein resource,” says Dr. Bern.
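The growth Dr. Bern describes can be made concrete with a little counting. The Python sketch below is only an illustration: the modification table (`MOD_STATES`) and the function name `count_modified_forms` are hypothetical, and the numbers of alternative states per residue are placeholders rather than values from any real PTM database. It counts modification patterns, not distinct masses, so it should be read as an upper bound on what a search has to consider.

```python
from itertools import combinations
from math import prod

# Hypothetical table (an assumption for illustration only):
# residue -> number of alternative modified states considered at that residue.
MOD_STATES = {"S": 1, "T": 1, "Y": 1, "K": 2, "M": 1}


def count_modified_forms(peptide, mod_states=MOD_STATES, max_mods=None):
    """Count modification patterns of a peptide, including the unmodified form.

    Each modifiable residue is either left alone or placed in one of its
    modified states. If max_mods is given, only patterns with at most that
    many modified residues are counted.
    """
    # Number of modified states available at each modifiable position.
    sites = [mod_states[aa] for aa in peptide if mod_states.get(aa, 0) > 0]

    if max_mods is None:
        # Every site independently: unmodified (1) plus its modified states.
        return prod(n + 1 for n in sites)

    total = 0
    for k in range(min(max_mods, len(sites)) + 1):
        # Choose which k sites are modified, then a state for each of them.
        for chosen in combinations(sites, k):
            total += prod(chosen)
    return total


if __name__ == "__main__":
    for pep in ("STYK", "STYKSTYK"):
        print(pep, count_modified_forms(pep), count_modified_forms(pep, max_mods=2))
```

Under these assumptions, doubling the peptide length squares the number of unrestricted patterns, which is why search engines typically cap the number of modifications allowed per peptide.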
Protein Metrics reports that it offers one of the best solutions for searching databases in protein analysis. Dr. Bern explains that the company's new proteomics search engine, Byonic, addresses the most common weaknesses of database searches.
“Byonic offers a wildcard search that assists in finding matches, including specific changes within a peptide such as a PTM on the N-terminus of a chain,” says Dr. Bern. He also explains that although this approach may be slower than known-modification searches, it decreases the chance of missing a PTM by considering every possible mass shift within a range.
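The general idea of considering any mass shift within a range can be sketched in a few lines of Python. This is not Byonic's algorithm, which the article does not describe; it is a minimal illustration that assumes a simple precursor-mass comparison, and the function names and the default shift window are hypothetical. The residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the sum of residue masses


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def wildcard_candidates(observed_mass, peptides, min_shift=-40.0, max_shift=200.0):
    """Return (peptide, mass_shift) pairs whose unexplained shift is in range.

    The shift window is an illustrative assumption, not a recommended
    setting. A known-modification search would instead accept only shifts
    that match a fixed list of modification masses.
    """
    hits = []
    for seq in peptides:
        shift = observed_mass - peptide_mass(seq)
        if min_shift <= shift <= max_shift:
            hits.append((seq, round(shift, 4)))
    # Smallest unexplained shift first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # Simulate a phosphorylated peptide (+79.96633 Da) that is absent
    # from any list of expected modifications.
    observed = peptide_mass("PEPTIDE") + 79.96633
    print(wildcard_candidates(observed, ["PEPTIDE", "GG", "WWWWWWWW"]))
```

A real search engine would go on to score the fragment spectrum and localize the shift to a residue; the sketch only shows why a wildcard window admits more candidates, and so runs slower, than a search restricted to known modifications.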
“The software also helps in finding matches that carry more than two modifications. Byonic can identify hundreds of times more PTMs, including N- and O-linked glycans, than any existing software. Byonic lets the user ask and answer more questions since it covers more types of PTMs,” he says.