Put the Top Down
When analytes are large molecules like proteins, different techniques are called for.
Proteins are more difficult than peptides to separate by standard LC methods, and more difficult to detect by MS. Thus, proteomics traditionally starts with the digestion of each protein into 10 or more peptides, which are then separated by LC and detected by MS, explains Evert-Jan Sneekes, nano LC specialist at Dionex. A typical sample yields several thousand peaks, only a portion of which can actually be detected; smart search algorithms then identify the proteins from the peptides that are found.
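The digestion step described above can be sketched in code. This is a minimal illustration of an in-silico tryptic digest, whose peptide masses could then be matched against observed peaks; the sequence is a made-up example, and real search engines do far more than this.

```python
# Sketch: in-silico tryptic digest of a hypothetical mini-sequence.
import re

# Monoisotopic residue masses (Da), abbreviated to the residues used here.
RES = {"A": 71.03711, "G": 57.02146, "K": 128.09496,
       "R": 156.10111, "S": 87.03203, "P": 97.05276}
WATER = 18.01056  # mass of H2O carried by a free peptide

def tryptic_peptides(seq):
    """Cleave C-terminal to K or R, but not when followed by P (trypsin's rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]

def peptide_mass(pep):
    return WATER + sum(RES[aa] for aa in pep)

for pep in tryptic_peptides("AGSKASRGAK"):  # hypothetical sequence
    print(pep, round(peptide_mass(pep), 3))
```

Each printed peptide mass corresponds to one peak the MS would be asked to find.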
Now suppose that two proteins differ only by a post-translational modification (PTM). After digestion there would be no way to tell that there were two different starting species unless both the modified peptide and its unmodified analog were detected within the mixture of several thousand peptides.
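This is the advantage of measuring the intact protein: the PTM shows up directly as a mass shift on the whole molecule. A minimal sketch, using the standard phosphorylation mass shift (+79.9663 Da for HPO3); the intact masses are hypothetical:

```python
# Sketch: distinguishing two proteoforms by intact mass alone.
PHOSPHO_SHIFT = 79.9663  # monoisotopic mass of HPO3 added by phosphorylation

def is_phospho_form(intact_mass, unmodified_mass, tol_da=0.05):
    """Flag a species whose intact mass matches the unmodified protein
    plus one phosphorylation, within a mass tolerance."""
    return abs(intact_mass - (unmodified_mass + PHOSPHO_SHIFT)) <= tol_da

unmodified = 11733.38   # hypothetical intact mass of the bare protein
observed   = 11813.35   # hypothetical second species in the same sample
print(is_phospho_form(observed, unmodified))  # difference is ~79.97 Da
```

In the digested (bottom-up) case this comparison is only possible if the one modified peptide happens to be detected; top-down keeps the evidence on a single measurement.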
Sneekes has developed a top-down approach to proteomics that separates intact proteins, using a two-dimensional, capillary-scale LC method that combines size exclusion with monolithic reversed-phase LC capable of handling a wide range of sizes. “This keeps all protein information within one molecule and does not multiply sample complexity.”
Of course, there are various technical challenges to doing large-scale top-down proteomics (or else everyone would be doing it), Sneekes explains. There are several steps, and all of them need to be optimized both individually and to work together.
It’s necessary to have a 2-D LC workflow that is easy to use, performs well, and has the dimensions to interface directly with the MS. The MS needs to be able to handle molecules of up to 100 kDa or even larger. Databases need to be developed, and searches optimized, for intact proteins. “And all of these bits and pieces have to be designed to work for the samples—for the proteins in this case. That’s the major development going on now.”
The Great Unknown
Identifying small molecules can be a challenging task as well. In the manufacture of pharmaceuticals, any impurities or degradants found above a certain threshold level—as low as 0.05%, depending on the daily dose and genotoxicity—must be identified.
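The dose-dependent part of that threshold comes from the ICH Q3A guideline for impurities in drug substances. The sketch below encodes its identification-threshold rule; genotoxic impurities, mentioned in the article, are governed separately by much lower TTC-based limits not captured here.

```python
# Sketch of the ICH Q3A identification-threshold rule behind the 0.05% figure.
def identification_threshold_pct(max_daily_dose_g):
    """ICH Q3A identification threshold for an impurity in a drug substance:
    dose <= 2 g/day: 0.10% or 1.0 mg/day total intake, whichever is lower;
    dose >  2 g/day: 0.05%."""
    if max_daily_dose_g > 2.0:
        return 0.05
    # Express the 1.0 mg/day intake limit as a percentage of the daily dose.
    mg_limit_pct = 100 * 1.0 / (max_daily_dose_g * 1000)
    return min(0.10, mg_limit_pct)

print(identification_threshold_pct(3.0))  # high-dose drug: 0.05%
print(identification_threshold_pct(0.5))  # low-dose drug: 0.10%
```

The higher the daily dose, the smaller the impurity fraction that must be identified, which is why the threshold can drop to 0.05%.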
Arindam Roy, a senior analytical scientist at Boehringer Ingelheim Ben Venue Laboratories, specializes in structural elucidation of impurities in pharmaceuticals. “The first thing we do is chromatography on the front end, to make sure they’re separated from any other components that we are not interested in,” he says. “Then we get this into MS.”
When dealing with potentially genotoxic compounds, the FDA wants manufacturers to be able to detect and identify down to 1–20 parts per million. Detecting such contaminants at these low limits means harnessing the power of accurate mass MS, sometimes combined with other, even more specialized techniques.
“When you are trying to elucidate structure you want to go beyond one MS—fragment your molecule and get accurate mass, again, from the fragmented parts as well.” This fragmentation can be repeated, in principle up to MS10, but after MS4 or MS5 the signal is lost, he adds.
LC-MS spectral databases are of little use in identifying compounds, because different instruments yield different qualitative spectra, “so the statistical matching is a very difficult process in the LC-MS world.” Vendor software may search for compounds with the same accurate mass and narrow the field that way. But when you have no knowledge of what that structure would be, “organic chemistry is your best friend,” Roy says.
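That accurate-mass narrowing step can be sketched as follows: given a measured neutral mass, compute the monoisotopic mass of each candidate formula and keep only those within a few parts per million. The observed mass and the candidate list here are hypothetical; the atomic masses are standard values.

```python
# Sketch: filtering candidate formulas by accurate-mass (ppm) agreement.
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}

def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())

def ppm_error(observed, theoretical):
    return 1e6 * (observed - theoretical) / theoretical

# Hypothetical candidates sharing a nominal mass of 151 Da.
candidates = {
    "C8H9NO2": {"C": 8, "H": 9, "N": 1, "O": 2},
    "C9H13NO": {"C": 9, "H": 13, "N": 1, "O": 1},
}
observed_neutral_mass = 151.0633  # hypothetical accurate-mass measurement

for name, f in candidates.items():
    err = ppm_error(observed_neutral_mass, mono_mass(f))
    if abs(err) < 5:  # a few ppm, a typical accurate-mass tolerance
        print(name, "matches within", round(err, 2), "ppm")
```

Isobaric formulas that agree at nominal mass diverge at the fourth decimal place, which is why accurate mass alone can eliminate most candidates before any chemistry-based reasoning begins.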
“You look at the process and see what structures are possible in this process, and then you get the mass specs and you see, out of all the structures, what this is matching.” The elucidation protocol may not end there. To confirm the identity of the contaminant, Roy sometimes resorts to deuterium-exchange experiments, LC-NMR, and other exotic techniques.