The sequence of a peptide is the most critical factor influencing its biological activity. The sequence and amino acid composition also affect the synthesis outcome, purification, and peptide solubility. For a peptide project to be viable, the peptide should be easy to synthesize, which keeps costs down, and stable and soluble enough to handle conveniently in its intended applications. Most peptides of biological interest are derived from N-terminal, C-terminal, or internal sequences of native proteins. Peptides can also be designed de novo. The following tips should be taken into consideration whenever possible in the design of a peptide.
- Sequence length: Restrict the length of the sequence to around 10–15 residues to increase overall yield and purity and to reduce cost. The purity of a synthesized peptide typically decreases as the length increases.
- Secondary structure: Certain peptides form β-sheet secondary structure. During synthesis, β-sheet formation causes incomplete solvation of the growing peptide and aggregation, resulting in a high proportion of deletion sequences in the final product. To avoid such issues, design sequences that do not contain multiple adjacent residues of Val, Ile, Tyr, Phe, Trp, Leu, Gln, or Thr. If stretches of these residues are difficult to avoid, break them up by inserting a Gly or Pro at every third residue, or make conservative replacements such as Asn for Gln or Ser for Thr.
- Residues prone to oxidation: Peptides containing multiple Cys, Met, or Trp residues are prone to oxidation and side reactions that negatively impact peptide purity and solubility. Avoid or minimize such residues in the sequence, or replace them with similar residues. Norleucine can be used as a replacement for Met, and Ser is a less reactive replacement for Cys.
- Amino acid composition: The overall amino acid composition of a peptide is often overlooked during design, yet it affects synthesis, purification, and final solubility. Keep the hydrophobic amino acid (Leu, Val, Ile, Met, Phe, Trp, etc.) content below 50% and make sure that there is at least one charged residue for every five amino acids. Replacing Ala with Gly or adding polar groups (several arginine or lysine residues, or MiniPEG) to the N- or C-terminus will also improve solubility.
- Amidation and capping: For internal sequences derived from native proteins, it may be necessary to cap one or both of the N- and C-termini to avoid introducing a charge where there is none in the native sequence. The C-terminus can be capped as an amide (peptide amide, -CONH2, instead of peptide acid, -COOH) and the N-terminus with an acetyl group.
- Amino acid at the N-terminus: Avoid N-terminal glutamine, as it cyclizes to pyroglutamate under acidic conditions. If it is required, acetylate the amino group of the glutamine or use pyroglutamate in its place. Avoid N-terminal asparagine if possible, because its protecting group can be difficult to remove after synthesis.
- Amino acid at the C-terminus: Amino acids such as cysteine, proline, and glycine should be avoided at the C-terminus because of issues such as racemization, dipeptide formation, and diketopiperazine formation. However, synthesis methods are currently available to circumvent such problems. If there is an unusual amino acid, including a D-amino acid, at the C-terminus, it is advisable to amidate the C-terminus. A C-terminal modification (e.g. biotin, fluorescein) should preferably be attached via the side chain of a lysine.
- Problematic amino acids: Avoid multiple occurrences of prolines (cis/trans isomerization), aspartic acids (aspartimide formation), glycines (hydrogen bonding and gel formation), and serines (difficult coupling) in the sequence. Also, no more than 10 amino acids should follow a phosphoamino acid, counting from the C-terminus.
- Ligand attachment: Attach a spacer between the peptide and a bulky ligand to minimize the influence of the ligand on the folding of the peptide.
- Solubility: During sequence design, estimate the solubility of the peptide by counting its charged residues, including the uncapped N- and C-termini. Typically, at least one charge for every five residues improves solubility. Also make sure there are no long stretches (more than five amino acids) of uncharged residues. Conversely, a short sequence with too many hydrophilic residues can cause problems during purification because it may not be retained well on the HPLC column.
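The length, composition, and charge rules in the list above are simple counts, so they can be screened before ordering a synthesis. The following is a minimal sketch, not a validated tool. The function name `composition_report` and its parameters are hypothetical. The hydrophobic set contains only the residues named above (the "etc." is not expanded), and treating His as uncharged is an assumption.

```python
# Thresholds come from the guidelines above; the residue sets are assumptions.
HYDROPHOBIC = set("LVIMFW")  # only the residues named in the text
CHARGED = set("DEKR")        # assumption: His is counted as uncharged


def composition_report(seq, n_capped=False, c_capped=False):
    """Summarize length, hydrophobic content, and charge for a one-letter sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)

    hydrophobic_fraction = sum(r in HYDROPHOBIC for r in seq) / n

    # Each uncapped terminus is counted as one charge.
    charges = sum(r in CHARGED for r in seq) + (not n_capped) + (not c_capped)

    # Longest run of consecutive uncharged residues.
    longest = run = 0
    for r in seq:
        run = 0 if r in CHARGED else run + 1
        longest = max(longest, run)

    warnings = []
    if n > 15:
        warnings.append("longer than 15 residues")
    if hydrophobic_fraction >= 0.5:
        warnings.append("hydrophobic content is 50% or more")
    if charges * 5 < n:
        warnings.append("fewer than one charge per five residues")
    if longest > 5:
        warnings.append("uncharged stretch longer than five residues")

    return {
        "length": n,
        "hydrophobic_fraction": hydrophobic_fraction,
        "charges": charges,
        "longest_uncharged_run": longest,
        "warnings": warnings,
    }


report = composition_report("KLVEDGSRAT")
print(report["warnings"])
```

A warning here is only a prompt to look again at the sequence. It does not predict actual solubility or synthesis yield.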
In conclusion, if the above guidelines are followed during peptide design, common problems during peptide synthesis, purification, and handling can be minimized.
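The position-dependent guidelines (terminal residues, oxidation-prone residues, and runs of β-sheet-prone residues) can also be checked mechanically. The sketch below is illustrative only. `positional_warnings` is a hypothetical helper, and the run threshold `max_beta_run=2` is an assumption, since the guidelines give no exact number for "multiple adjacent" residues.

```python
BETA_PRONE = set("VIYFWLQT")  # residues listed as β-sheet prone in the guidelines
OXIDATION_PRONE = set("CMW")  # Cys, Met, Trp
BAD_C_TERMINAL = set("CPG")   # Cys, Pro, Gly


def positional_warnings(seq, n_acetylated=False, max_beta_run=2):
    """Return a list of warnings for a peptide given in one-letter code."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    warnings = []

    # N-terminal checks.
    if seq[0] == "Q" and not n_acetylated:
        warnings.append("N-terminal Gln can cyclize to pyroglutamate")
    if seq[0] == "N":
        warnings.append("N-terminal Asn can complicate deprotection")

    # C-terminal check.
    if seq[-1] in BAD_C_TERMINAL:
        warnings.append("C-terminal Cys, Pro, or Gly is best avoided")

    # More than one oxidation-prone residue.
    if sum(r in OXIDATION_PRONE for r in seq) > 1:
        warnings.append("multiple Cys/Met/Trp residues")

    # Longest run of adjacent β-sheet-prone residues.
    longest = run = 0
    for r in seq:
        run = run + 1 if r in BETA_PRONE else 0
        longest = max(longest, run)
    if longest > max_beta_run:  # threshold is an assumption
        warnings.append("run of adjacent beta-sheet-prone residues")

    return warnings


for message in positional_warnings("QVIYFC"):
    print(message)
```

Passing `n_acetylated=True` suppresses the N-terminal Gln warning, mirroring the advice to acetylate the amino group when glutamine is required at that position.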