Degenerate Codon Libraries
A more thorough sampling of individual mutations can be achieved by incorporating degeneracies into a nucleic acid sequence at specific codons. Changing a codon to, e.g., NNK (N = A/C/G/T, K = G/T) yields 32 codons that together encode all 20 amino acids (and one STOP codon, TAG), so every possible amino acid substitution is sampled at the chosen positions. Degenerate codon libraries usually have distribution biases caused by the uneven degeneracy of the genetic code.
For example, Tryptophan is encoded by a single codon, while Serine is encoded by three codons in an NNK library. For libraries in which only one or two residues are varied in each clone this may not be a serious concern. However, as the number of residues sampled increases, the bias in the library escalates. A six-codon NNK library will have ~10^9 (4^12 x 2^6) possible nucleotide sequences (including those containing STOP codons). The probability of finding 6xTrp will be equal to that of finding 6xSTOP and ~730-fold (3^6 = 729) lower than that of finding 6xSer.
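The codon counts and the library-size arithmetic above can be checked with a short script. This is a minimal sketch, assuming the standard genetic code and equal representation of every nucleotide at each degenerate position; the names CODON_TABLE, NNK_CODONS, and N_POSITIONS are illustrative only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: N = A/C/G/T at positions 1 and 2, K = G/T at position 3.
NNK_CODONS = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

# Number of NNK codons encoding each amino acid ('*' marks STOP).
counts = Counter(CODON_TABLE[codon] for codon in NNK_CODONS)

N_POSITIONS = 6  # assumption: six randomized codons, as in the example above
library_size = len(NNK_CODONS) ** N_POSITIONS

# Under equal codon representation, the probability of a given residue at every
# position scales with (codons for that residue) ** N_POSITIONS.
ser_vs_trp_fold = counts["S"] ** N_POSITIONS // counts["W"] ** N_POSITIONS

print(len(NNK_CODONS), "codons;", len(counts), "outcomes (20 amino acids + STOP)")
print("Trp:", counts["W"], "Ser:", counts["S"], "STOP:", counts["*"])
print("library size:", library_size)
print("6xSer vs 6xTrp fold difference:", ser_vs_trp_fold)
```

Real oligonucleotide pools rarely have perfectly equimolar bases, so the observed bias may differ from these idealized figures.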
One popular variation on degenerate codon libraries is a set of variants in which all 19 amino acid substitutions are enumerated at every position in a protein in turn. The library is delivered either as individually arrayed variants with defined sequences or as a mixed pool of all 20 possibilities for each position. Because each variant is represented without codon redundancy, this greatly reduces the screening burden; it also has advantages in patent filing, since it is possible to test every single amino acid substitution in a protein.
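The enumeration described above can be sketched in a few lines. This is an illustration only, not a description of how any particular supplier builds such libraries; the function name single_substitution_variants and the three-residue sequence "MKT" are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_substitution_variants(protein):
    """Yield (label, sequence) for every single amino acid substitution."""
    for index, wild_type in enumerate(protein):
        for substitute in AMINO_ACIDS:
            if substitute == wild_type:
                continue  # skip the wild-type residue: 19 substitutions remain
            # Conventional label: wild-type residue, 1-based position, new residue.
            label = f"{wild_type}{index + 1}{substitute}"
            yield label, protein[:index] + substitute + protein[index + 1:]


# Hypothetical three-residue sequence, used only to show the output shape.
variants = dict(single_substitution_variants("MKT"))
print(len(variants), "variants")
print(variants["M1A"])
```

The library grows linearly with protein length (19 variants per position), in contrast to the exponential growth of a combinatorial NNK library.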