June 15, 2011 (Vol. 31, No. 12)

Josh P. Roberts

Emerging Solutions Aim to Bring Challenging Targets Under Control

Some proteins are easier to express than others. For some, just clone them into an E. coli vector and let the bacteria do the work. But for others, such as toxic proteins, membrane proteins, glycosylated proteins, and hydrophobic proteins, getting them to express, fold, and function or crystallize can present a challenge.

Scientists at CHI’s “PEGS” conference, held last month in Boston, were keen to talk about some of the difficulties they faced in expressing challenging proteins. Fortunately, they were just as keen to talk about the solutions they came up with.

Typically, when researchers want to express a protein they turn first to E. coli: transform the bacteria with the appropriate vector, plate them out, select colonies, and grow up cultures of the selected clones. Then the fun of purification begins.

Yet for work on an analytical scale, protein can be generated without the need for cell culture (though this, alas, doesn’t alleviate the subsequent need for biochemistry). Several in vitro translation systems based on bacterial, yeast, plant, insect, and mammalian cell extracts exist to turn DNA or RNA into protein.

These all have the added advantage of being able to express proteins that may be toxic to a cell. Each, of course, has its own pros and cons as well—bacterial- and wheat germ-based systems have higher yields, but these and the rabbit reticulocyte systems aren’t capable of glycosylating the resultant proteins, for example, while insect cell lysates introduce insect-specific post-translational modifications.

There are advantages to expressing human proteins using a human system; foremost among them is that the products will be properly folded and properly post-translationally modified, explained Brian Webb, platform manager at Thermo Fisher Scientific. Many academics have published on making a cell-free protein-expression system from human cell lysates, but to date there is only a single commercial product line, based on research from the Riken Institute in Japan and licensed by Thermo.

Thermo’s kits make use of a T7 promoter and an internal ribosome entry site, so that it is not necessary for the resultant RNA to be capped before being translated. One set of kits, based on HeLa cell extract, has been optimized for high yield, currently promising up to about 40 µg/mL in about 90 minutes. Data presented at “PEGS” indicated that yields of several hundred µg of recombinant protein per mL of reaction are possible, and Webb said that such higher-producing kits should be available in the fall. A second set, based on a hybridoma, is optimized for the expression of glycoproteins.

Multiple proteins can be expressed in the same reaction, potentially allowing the production of protein complexes that have more than one subunit, which could “allow those subunits to form and carry out their function,” Webb added.

Later this year Thermo will introduce a series of expression vectors that include C- or N-terminal FLAG, HA, or myc tags, to complement the currently available His tag vector. Fusion vectors encoding GFP are also in the making.


Dare to Compare

For production quantities, cell culture/fermentation is still the way to go. And, although biopharmaceuticals have been on the market for more than 20 years, those products have been produced by only a handful of expression systems, said Georg Klima, Ph.D., head of microbial process science at Boehringer Ingelheim Biopharmaceuticals.

Not every system can handle every protein. Dr. Klima was working on a dimeric Fab antibody fragment that he suspected would be difficult to express—the target it binds to was quite hydrophobic, and looking at the Fab’s variable regions indicated it might tend to aggregate. He decided to do a side-by-side comparison of different systems.

In one arm of the test, light and heavy chains were expressed separately with good titers as inclusion bodies and could be refolded separately, but could not be assembled together, nor could they be refolded together. He found no expression when attempting to express the Fab in the E. coli periplasm. Similarly, no expression was seen when trying to get the methylotrophic yeast Pichia pastoris to secrete the Fab into the supernatant.

Only the Pseudomonas fluorescens expression platform, developed by Pfenex, yielded a properly folded, biologically active dimer. “That confirmed that we had a difficult-to-express molecule,” he noted.

Pfenex can create over 1,000 individual strains. This particular project combined 20 different plasmids, each carrying particular genetic elements, with 50 different host strains that differ in properties such as protease deletions. The resulting strains were screened at the 96-well plate scale, and those yielding the highest titer of active Fab, as measured by target binding in bio-layer interferometry (BLI) analysis, were chosen for fermentation scouting.
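
The arithmetic behind such a screen can be sketched in a few lines of Python. This is only an illustration: the plasmid and host labels are invented placeholders, and the sketch assumes that every plasmid is paired with every host strain and that each well holds one strain with no control wells. Only the counts (20 plasmids, 50 hosts, 96-well plates) come from the account above.

```python
from itertools import product
from math import ceil

# Hypothetical labels; only the counts (20 and 50) come from the article.
plasmids = [f"plasmid_{i:02d}" for i in range(1, 21)]
hosts = [f"host_{j:02d}" for j in range(1, 51)]

# Assumption: every plasmid is paired with every host strain.
strains = list(product(plasmids, hosts))

WELLS_PER_PLATE = 96  # assumption: one strain per well, no control wells
plates_needed = ceil(len(strains) / WELLS_PER_PLATE)

print(len(strains))    # 1000
print(plates_needed)   # 11
```

Under those assumptions, a full cross of the two sets fills roughly eleven plates, which suggests why the first pass is run at the 96-well scale rather than in bioreactors.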

For this part of the study, 24-unit, 4 mL bioreactors allowed a broad range of induction conditions to be compared. The best strains and conditions from this stage were then moved into a conventional 1 L reactor, “which lets you estimate what will happen in a production bioreactor,” he noted. The process from strain construction and screening to fermentation confirmation, scale-up, and purification took about 10 weeks.

The resulting material had high levels of purity, excellent solubility, and high affinity. “Overall it was a successful study for us,” Dr. Klima remarked. While a traditional E. coli fermentation is still his first choice for protein expression, Dr. Klima thinks that the Pfenex system will be very useful with other hard-to-express proteins.

Chaperone Required

Proteins often have hydrophobic patches that can initiate a hydrophobic collapse during their synthesis, making them difficult to express in heterologous systems. When such proteins are expressed in E. coli they tend to form inclusion bodies, and various tricks need to be employed. Fusion tags are often attached to proteins to help them solubilize, for example, and cleaved off during downstream processing. Yet even the best fusion partners don’t always guarantee that the now soluble proteins are properly folded.

Pradman Qasba, Ph.D., chief of the structural glycobiology section at the National Cancer Institute, studies human glycosyltransferases. While his group has expressed and folded some members of this family in vitro, that method did not work for others. For example, inclusion bodies of Drosophila β-1,4-galactosyltransferase-7 (β4Gal-T7) could be folded in vitro, but those of human β4Gal-T7 could not. A maltose binding protein (MBP)-β4Gal-T7 fusion was expressed successfully in E. coli, yet it showed poor solubility, and the β4Gal-T7 aggregated after proteolytic release from its MBP partner.

Perhaps, Dr. Qasba reasoned, a different carbohydrate binding protein might succeed where MBP failed. “Sugar binding is similar to certain protein-protein interactions; sugars have hydrophobic surfaces, and hydrophobic surfaces bind with the hydrophobic cavity of the binding site in the lectins.” He hypothesized that the cavity in the sugar binding site of the lectins may be able to bind to and stabilize proteins with hydrophobic patches and prevent their collapse as they are folding.

Galectins are known to specifically bind β-galactoside sugars. Dr. Qasba used human galectin-1 as a fusion partner to express soluble, folded human β4Gal-T7 in E. coli. The fusion protein was captured on a lactose column, eluted, and cleaved. It was then selected by binding to a UDP column. “If it binds to that it is really folded; otherwise it would not be able to catch hold of that particular donor substrate,” he said. “This is the first time anyone has shown that galectins can act as a chaperone.”

To date, Dr. Qasba’s lab has successfully expressed all three of the galectin-glycosyltransferase fusion proteins they have attempted. The vector has also been shared with several other labs in the U.S. and Europe. He isn’t betting that it will be the universal answer to the folding problem. But because galectin-1 is only one of a family of 17 related small proteins, it allows for many possibilities. “If one doesn’t work, perhaps another will.”

Make It Glow

Many proteins of therapeutic interest are membrane bound. By definition these mostly hydrophobic molecules are normally surrounded by a lipid bilayer. Obtaining sufficient quantities of purified, properly folded protein from solubilized cellular membranes can present a great challenge.

When Genentech structural biology scientist Christopher Koth’s group is presented with a new protein target, the protein must be expressed to sufficient levels, purified, and then its high-resolution structure determined through crystallization or NMR.

The process usually involves generating a large number of constructs. This is because they typically want to determine the minimal active fragment of the protein suitable for structure studies or test mutations that may stabilize the purified protein. Often, the native full-length proteins simply won’t express. And they have many targets to evaluate.

“We run into this problem of having an early bottleneck, where we try to screen and identify those constructs that are most likely to yield us high-resolution crystal structures, or be amenable to structural studies by NMR.”

What if they could determine the suitability of an expressed protein for structural studies before purification? Perhaps it could be made visible in the context of all the other proteins in a cell, like a bright yellow four-leaf clover in a field of green clover, and in a way that also gives a measure of how amenable it might be to those structural studies.

Fusing expressed proteins to a fluorescent protein, like GFP, is one possible solution. However, Genentech uses a standard His tag, and Koth didn’t want to change any of the cloning or expression pipeline. So he and chemist Zachary Sweeney used a fluorophore that attaches to the His tag; as a result, they were able to visualize the proteins as they came through a size-exclusion chromatography (SEC) column.

Experience has shown that proteins that give a good SEC profile may crystallize. “On the other hand, proteins that give a lousy profile almost never crystallize,” he explained. “Multiple peaks, asymmetric peaks, or peaks in the wrong spot: if we see that, we drop the construct and don’t proceed to purification.”
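
Those three rejection criteria lend themselves to a simple rule-based filter. The Python sketch below is a hypothetical illustration, not Genentech's software: the `Peak` record, the function name, and every threshold (elution tolerance, asymmetry limit) are assumptions chosen only to make the logic concrete.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    elution_ml: float  # elution volume at the peak apex
    asymmetry: float   # asymmetry factor; 1.0 means perfectly symmetric


def passes_sec_triage(peaks, expected_ml, tol_ml=0.5, max_asym=1.5):
    """Return True only for one symmetric peak at the expected position.

    All thresholds are illustrative assumptions, not values from the article.
    """
    if len(peaks) != 1:  # multiple peaks (or none at all)
        return False
    peak = peaks[0]
    if abs(peak.elution_ml - expected_ml) > tol_ml:  # peak in the wrong spot
        return False
    # Asymmetric peak, whether fronting (< 1) or tailing (> 1).
    if not (1 / max_asym <= peak.asymmetry <= max_asym):
        return False
    return True


well_behaved = [Peak(elution_ml=12.1, asymmetry=1.1)]
aggregated = [Peak(elution_ml=8.0, asymmetry=1.0), Peak(elution_ml=12.0, asymmetry=1.2)]

print(passes_sec_triage(well_behaved, expected_ml=12.0))  # True
print(passes_sec_triage(aggregated, expected_ml=12.0))    # False
```

A construct that passes such a filter is only a candidate; as the researchers stress, a clean profile does not guarantee crystals.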

Koth uses a silica-based size-exclusion resin that can be run at very high flow rates. “We can do one run every approximately 20 minutes, so overnight we can screen a very large number of constructs,” he explained.
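
As a rough check on that throughput, the short calculation below assumes a single column run serially and a 12-hour unattended overnight queue. Both are assumptions, since the article gives only the 20-minute run time.

```python
RUN_MINUTES = 20       # "one run every approximately 20 minutes"
OVERNIGHT_HOURS = 12   # assumption: length of an unattended overnight queue
PLATE_WELLS = 96

runs_overnight = OVERNIGHT_HOURS * 60 // RUN_MINUTES
hours_for_full_plate = PLATE_WELLS * RUN_MINUTES / 60

print(runs_overnight)        # 36
print(hours_for_full_plate)  # 32.0
```

On these assumptions one column handles a few dozen constructs per night, and a full 96-well plate takes a little over a day.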

For some challenging targets, like membrane proteins, maybe only 10–20 constructs from a 96-well plate will behave well by the SEC test. While that may not seem like such a magic bullet assay—especially since there are no guarantees that they will indeed crystallize—it represents “a significant time savings for us,” Koth noted. It means “we can screen more constructs, and that increases our chances of coming up with something that’s going to give a high-resolution structure.”


Tips for Obtaining Integral Membrane Proteins

James C. Samuelson, Ph.D., senior scientist at New England Biolabs, is investigating protein fusion strategies and novel E. coli host strains for efficient expression of heterologous membrane proteins. He shares six tips for obtaining integral membrane proteins:

  • If using a bacterial host, the standard T7 system [BL21(DE3) strain plus T7 vector] will likely be too robust. Moderating expression with a tunable T7 expression system can facilitate proper membrane integration. Commercially available tunable T7 expression strains include BL21-AI, Single step KRX, Tuner(DE3), and Lemo21(DE3).
  • Test several inducer concentrations. Expression analysis may be aided by fusing the membrane protein to a reporter protein (e.g., GFP).
  • Use a low to medium copy vector with kanamycin or chloramphenicol selection. High-copy number plasmids (e.g., pUC origins) will almost certainly result in loss of plasmid by a fraction of the cells within the culture.
  • At the point of induction, analyze the cell population for maintenance of the expression construct (by plating cells with and without antibiotic).
  • Always determine how much of the protein is in the membrane fraction and extractable by detergent before scaling up.
  • Inducing at an early stage (OD600 = 0.4) and expressing at low temperature (20°C) for a longer time (16 hours or more) generally results in a higher yield of membrane-integrated protein.
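
The plating check in the fourth tip reduces to a simple ratio. The Python sketch below is a minimal illustration that assumes equal volumes of the same dilution are spread on a selective and a non-selective plate; the function name and the colony counts are invented for the example.

```python
def plasmid_retention(colonies_with_antibiotic, colonies_without_antibiotic):
    """Fraction of viable cells still carrying the plasmid.

    Assumes equal volumes of the same dilution were spread on both plates.
    """
    if colonies_without_antibiotic <= 0:
        raise ValueError("need at least one colony on the non-selective plate")
    return colonies_with_antibiotic / colonies_without_antibiotic


# Made-up counts for illustration only.
retention = plasmid_retention(colonies_with_antibiotic=150,
                              colonies_without_antibiotic=200)
print(f"{retention:.0%} of cells retain the plasmid")  # 75% of cells retain the plasmid
```

A ratio well below 1 at the point of induction indicates that a substantial fraction of the culture has lost the expression construct.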