Protein barcodes are short, information-rich stretches of amino acids. The DNA sequences that encode them can be added to the DNA sequences coding for proteins of interest. When the barcode DNA and the protein DNA are co-expressed, desired proteins can be selected for protein engineering, mRNA translation, therapeutic delivery mechanisms, and many other protein-screening applications (Figure 1). After protein selection, the distinct barcode associated with each expressed protein can be read directly and identified with single-molecule resolution via Next-Generation Protein Sequencing™ (NGPS) on the Quantum-Si Platinum® instrument.
This approach has the potential to transform proteomics research and drug discovery, much as DNA barcodes have advanced the field of genomics. Importantly, NGPS offers a significant advantage over mass spectrometry for decoding protein barcodes. Unlike mass spectrometry, NGPS involves workflows that are simple, user-friendly, and readily performed on a benchtop instrument. NGPS distinguishes peptides by recognizing specific amino acids rather than by measuring mass-to-charge ratio.
Realizing the potential of multiplexed protein characterization and screening
The combination of protein barcodes and NGPS enables robust, multiplexed functional protein screening and characterization for a number of applications, including engineering proteins, screening mRNA vaccine candidates, tracking protein subcellular localization, and studying protein-protein interactions. Table 1 highlights just a few of the possible applications and the advantages of this method over conventional approaches, which can be cumbersome and labor-intensive.
Detecting protein barcodes with high confidence
To demonstrate that a group of peptides had the kinetic and sequence properties required for use as a barcode set with NGPS, protein barcodes predicted to be highly sequenceable were empirically validated using synthetic peptides. The sequencing coverage for each individual peptide and the kinetic signature plots for eight peptides are shown in Figure 2.
Kinetic signatures from each peptide were mapped against the entire set of peptides. Each peptide was aligned to the correct peptide sequence with a maximum false discovery rate (FDR) of 0.2%. These results highlight the strength of NGPS on the Platinum instrument in generating distinct pulsing patterns, known as recognition segments, with characteristic kinetic properties that enable detection of each peptide barcode with high confidence.
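The article does not state how the FDR is calculated. As a minimal sketch, assume each synthetic peptide is sequenced on its own, so the true barcode is known, and that the FDR is the fraction of alignments assigned to any other barcode in the set. The function name, barcode labels, and alignment counts below are hypothetical and are not taken from the Platinum analysis software or from the reported results.

```python
from collections import Counter


def false_discovery_rate(alignments, true_barcode):
    """Return the fraction of alignments assigned to a barcode other than the true one.

    `alignments` is a list of barcode labels, one per aligned read.
    This definition of FDR is an assumption made for illustration.
    """
    counts = Counter(alignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alignments to evaluate")
    false_hits = total - counts[true_barcode]
    return false_hits / total


# Hypothetical single-peptide run: 1,000 alignments, one of them to the wrong barcode.
alignments = ["BC1"] * 999 + ["BC4"] * 1
fdr = false_discovery_rate(alignments, "BC1")
print(f"FDR = {fdr:.1%}")
```

Under this assumed definition, a peptide passes a 0.2% threshold when no more than 2 in every 1,000 of its alignments land on another barcode in the set.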
In practical applications, multiple protein barcodes will be translated by an expression system and cleaved before sequencing. To demonstrate barcode translation, barcodes were recombinantly expressed in Escherichia coli and enriched using affinity and cleavage tags. Eight of these barcodes were then mixed at the relative amounts indicated in Figure 3, and the resulting library was sequenced. NGPS identified the barcodes in the mixture, and the number of alignments for each barcode decreased as its relative abundance decreased.
Seven of the eight peptides in the mix yielded FDR values of less than 10%. Of these seven, six yielded FDR values of less than 1%, and one had an FDR of about 5%. This result demonstrates that peptide barcodes can be used in recombinant expression systems and facilitate the relative quantitation of protein variants.
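Relative quantitation from alignment counts can be sketched as follows. This is a simplified illustration that assumes the number of alignments scales proportionally with barcode abundance; in practice, barcodes may differ in how efficiently they are sequenced, so counts would likely need calibration against a known mixture. The function name, barcode labels, and counts are hypothetical and are not the values shown in Figure 3.

```python
def relative_abundance(alignment_counts):
    """Convert per-barcode alignment counts into fractions of the total."""
    total = sum(alignment_counts.values())
    if total == 0:
        raise ValueError("no alignments to quantify")
    return {barcode: n / total for barcode, n in alignment_counts.items()}


# Hypothetical alignment counts for a four-barcode mixture.
counts = {"BC1": 500, "BC2": 250, "BC3": 125, "BC4": 125}
fractions = relative_abundance(counts)

# List barcodes from most to least abundant.
for barcode, fraction in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{barcode}: {fraction:.1%}")
```

Ranking protein variants by these fractions is one way to read out which variant was most abundant after selection.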
Each peptide barcode was added to a different mRNA encoding the same amino acid sequence. Sequencing the barcodes on the Platinum NGPS platform allows rapid identification of which mRNA had the highest expression, resulting in the most abundant protein. These sequencing results can also be used to determine the effectiveness of different lipid nanoparticle delivery systems (Figure 5).
Asking and answering new questions
Combining protein barcodes with NGPS on the Platinum instrument enables multiplexed protein characterization, variant screening, and many other applications with unprecedented speed and throughput. In addition to accelerating workflows, processing more samples in a single assay reduces costs and labor requirements. This highly versatile method will allow scientists to ask and answer new questions and promises to advance our understanding of health and disease.
John Vieceli, PhD, is Chief Product Officer and Mathivanan Chinnaraj, PhD, is senior staff scientist at Quantum-Si. They invite readers to visit the Quantum-Si website and download an application note (Protein Barcodes for Next-Generation Protein Sequencing™) describing the criteria for peptide barcode design, methods for generating peptide barcode libraries, and practical examples of how peptide barcodes can be used for screening proteins with desired properties.