January 15, 2016 (Vol. 36, No. 2)
Angelo DePalma Ph.D. Writer GEN
Bioprocessing’s Primary Protein Production Factories in Line for Genome-Scale Retooling
When asked to put systems biology within a bioprocessing context, Nathan Lewis, Ph.D., responded as confidently as any scientist might, were he, like Dr. Lewis, the head of a laboratory that focuses on mammalian systems biology and CHO cell bioprocessing. Not content to serve only as an assistant professor of pediatrics at the UC San Diego Medical School, Dr. Lewis also runs UC San Diego’s Systems Biochemistry and Cell Engineering group.
Integrating systems biology and bioprocessing, says Dr. Lewis, is about “accounting for the activities and functions of all cellular processes influencing protein production” and leveraging the knowledge “to design better production hosts and to identify bottlenecks in recombinant protein production.”
Recombinant protein production involves pathways comprising more than 1,000 genes. Chemical reactions catalyzed by dozens of enzymes are responsible for glycosylation alone; the secretory pathway uses hundreds of genes that administer post-translational modifications or aid in translation.
“You’re dealing with many complex processes,” explains Dr. Lewis. “You want to understand and control all of them.”
Three drivers have advanced microbial fermentation for at least 15 years. First, scientists have been able to reference complete genomes. Second, they have incorporated pathways into systems biology models and interaction maps. Third, they have applied gene-editing tools. All these capabilities have been exploited in work with yeast and Escherichia coli systems.
“Those three capabilities have advanced chemical production in microbes,” Dr. Lewis tells GEN. “We finally have them for CHO as well.”
Dr. Lewis and co-workers published the CHO-K1 genome in 2011. In addition, this group, in concert with collaborators, is preparing updates.
“The CHO-K1 genome provides a parts list for everything occurring inside CHO cells,” notes Dr. Lewis. “We will soon publish data on the thousands of chemical reactions that constitute all the metabolic pathways in CHO.
“These systems biology models covering metabolic and secretory pathways will eventually allow us to understand how everything functions together. We will be able to learn what governs the different protein production capabilities of cell lines and clones.”
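To make the idea of a reaction "parts list" concrete, here is a minimal sketch of how a metabolic model is commonly represented: a stoichiometric matrix whose rows are metabolites and whose columns are reactions. The three-reaction network, its names, and the flux values are invented for illustration and are not CHO data. The check shown (whether a set of reaction rates leaves every internal metabolite balanced) is only the simplest building block of genome-scale modeling, not the method used by Dr. Lewis's group.

```python
# Toy network (hypothetical, not CHO data): three reactions, two internal metabolites.
#   R1:   -> A    (uptake)
#   R2: A -> B    (conversion)
#   R3: B ->      (secretion)
REACTIONS = ["R1_uptake", "R2_convert", "R3_secrete"]
METABOLITES = ["A", "B"]

# Stoichiometric matrix S: rows = metabolites, columns = reactions.
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]

def net_production(S, fluxes):
    """Return S*v: the net rate of change of each metabolite for flux vector v."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in S]

def is_steady_state(S, fluxes, tol=1e-9):
    """True when no internal metabolite accumulates or is depleted."""
    return all(abs(x) <= tol for x in net_production(S, fluxes))

balanced = [2.0, 2.0, 2.0]     # every reaction runs at the same rate
unbalanced = [2.0, 1.0, 1.0]   # uptake outpaces conversion, so A piles up

print(is_steady_state(S, balanced))    # True
print(net_production(S, unbalanced))   # [1.0, 0.0]
```

A genome-scale model applies the same bookkeeping to thousands of reactions, which is what makes it possible to ask where a bottleneck might sit.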
Betwixt and Between
The importance of CHO cells to industry is precisely why it took so long to achieve these capabilities in that system. “CHO is viewed mostly as an industrial cell,” explains Nathan Price, Ph.D., associate director at the Institute for Systems Biology. “There has not been much interest in public funding for the analysis of CHO, in a way that would take a long view in terms of building out a genome-scale metabolic model and the application of genome-editing techniques.”
Biotechnology companies, moreover, are more interested in releasing products than in funding basic science. (One exception is the Novo Nordisk Foundation Center for Biosustainability, which operates out of Denmark.)
Dr. Lewis also detects hesitation by some in the bioprocess community to take systems biology seriously: “Early studies only provided large sets of genes influencing bioprocess phenotypes, often summarized by general ontology terms such as ‘metabolism’ or ‘transcription regulator.’” Engineering cells based on this knowledge is difficult.
“There could be dozens or hundreds of genes in an enriched group,” Dr. Lewis continues. “But now that we have the genome sequence, and know what many of these genes do, we can drill down to the biochemical mechanisms of these genes and identify individual genes for knockout or overexpression. Today we finally have testable hypotheses to apply to CHO cell-line engineering.”
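For readers unfamiliar with what an "enriched group" means in practice, the sketch below shows one common way such a result is scored: a one-sided hypergeometric test asking how surprising it is that a hit list overlaps an ontology category as much as it does. All counts are invented, and the function name `enrichment_pvalue` is hypothetical. The article does not say which statistical method the early studies used, so this is an assumption for illustration only.

```python
from math import comb

def enrichment_pvalue(total_genes, category_size, hits, overlap):
    """One-sided hypergeometric p-value: the chance of seeing at least
    `overlap` category genes among `hits` genes drawn at random."""
    upper = min(category_size, hits)
    favorable = sum(
        comb(category_size, k) * comb(total_genes - category_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total_genes, hits)

# Hypothetical numbers: 1,000 genes measured, 50 annotated "metabolism",
# 40 genes associated with a bioprocess phenotype, 8 of them in the category.
# Random chance alone would give about 40 * 50 / 1000 = 2 overlapping genes.
p = enrichment_pvalue(total_genes=1000, category_size=50, hits=40, overlap=8)
print(f"p-value for 'metabolism' enrichment: {p:.2e}")
```

A small p-value flags the category as over-represented, but, as Dr. Lewis points out, it says nothing about which individual gene to knock out or overexpress.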
Those updates to the CHO genome, says Kate Caves, director of business development at DNA2.0, will allow investigators to explore more deeply, build pathways from sequencing, and better understand the mechanisms involved in CHO cell protein production.
DNA2.0 takes a completely different approach from Dr. Lewis’s to improving protein biomanufacturing in CHO cells. The company applies sophisticated machine-learning tools, similar to those utilized by tech companies such as Google and Netflix, to massive amounts of CHO sequencing data.
“We use pattern-matching methods such as Support Vector Machine to create supervised learning algorithms to model sequence-function relationships,” Caves details. According to Wikipedia, Support Vector Machines are supervised learning models that analyze data and recognize patterns.
“These models allow us to understand and predict how manipulations at the DNA sequence level affect function,” Caves elaborates. “We are working on coupling genome-engineering tools, such as CRISPR, with sequence-function maps to engineer better CHO cells by manipulating the CHO apoptotic, secretory, and glycosylation pathways. This allows us to efficiently push CHO cells to become better protein factories that produce higher quality therapeutics.”
DNA2.0 uses the CHO sequence and functional and experimental data to design variants of secretory, apoptotic, and glycosylation pathways likely to improve protein production or quality.
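As a rough illustration of what modeling a sequence-function relationship with a Support Vector Machine can look like, the sketch below trains a small linear SVM on one-hot encoded DNA strings. Everything in it is an assumption made for the example: the four-letter sequences, the "high" and "low" expression labels, the hidden rule (sequences starting with G score high), and the training settings. It is written in plain Python so that it runs anywhere; in practice a library such as scikit-learn would be the usual choice. It is not DNA2.0's algorithm, which the article does not describe in detail.

```python
# Toy linear SVM on one-hot encoded DNA, trained by batch subgradient descent.
# All sequences and labels are invented for illustration.

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat 0/1 vector, four slots per position."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == letter else 0.0 for letter in ALPHABET)
    return vec

def score(w, b, x):
    """Signed distance-like score: positive means predicted 'high'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + mean hinge loss with full-batch subgradient steps."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1.0:   # example is inside the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, seq):
    return 1 if score(w, b, one_hot(seq)) >= 0 else -1

# Hypothetical training set: +1 = "high expression", -1 = "low expression".
train = [("GACT", 1), ("GTTA", 1), ("GCAG", 1), ("GGGC", 1),
         ("AACT", -1), ("TTTA", -1), ("CCAG", -1), ("AGGC", -1)]
X = [one_hot(s) for s, _ in train]
y = [label for _, label in train]
w, b = train_linear_svm(X, y)

print([predict(w, b, s) for s, _ in train])   # predictions on the training set
print(predict(w, b, "GTCA"), predict(w, b, "TTCA"))   # two unseen sequences
```

The learned weights play the role of a crude sequence-function map: each one indicates how much a given base at a given position pushes the prediction toward high or low expression.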
On a more micro scale, Caves sees promise for an evolving field called vectorology, the science and engineering of expression vectors. Vectorology allows exploration of different pathway components on expression vectors toward optimal combinations for improving protein expression and quality.
DNA2.0 has developed a suite of vectors that allow exploration of multiple expression pathways with different elements, for example, promoters or secretion leaders, to uncover the optimal pathway for expressing biomolecules, including difficult targets.
“We are also applying our machine-learning tools in this space by building a service called VectorGPS,” Caves continues. “VectorGPS’ predictive algorithms make it possible to model a customer’s proprietary CHO manufacturing lines and/or conditions to build better expression pathways for therapeutic targets.”
VectorGPS is being used by DNA2.0 customers to produce new vector suites. DNA2.0 has created predictive models for relevant, commercially available cell lines such as CHO-F and CHO-S, to build better expression vectors.
“Several of our pharma clients work with different proprietary CHO lines, for which we can build custom algorithms,” Caves asserts. The company has worked with several large pharmaceutical and biotech companies, and it has collaborated with James D. Love, Ph.D., director of technology development and research assistant professor of biochemistry at the Albert Einstein College of Medicine, in designing vectors to express difficult G-protein-coupled receptors in his CHO and HEK cell lines.
The major limitation Caves notes in applying the systems approach to CHO is the instability of the CHO genome. “To model a system, we must be able to control the variables involved and it must have little to no noise,” she explains. “The CHO genomic profile is constantly changing. This makes them hard, but not impossible, to model.”
Dr. Price adds that CHO data is fractionated and proprietary: “We clearly need forums and formats where CHO data can be shared, so we can learn collectively about building out accurate models. That comes back to systems defined broadly, not just the biological systems but the social systems that drive the funding and coordination of the science.”
The need to relate systems biology data with process-level events such as cell culture and product quality may additionally constrain the potential impact of systems biology on biomanufacturing.