
Feature Articles : Jul 1, 2011 (Vol. 31, No. 13)

DNA Cloning Expands into New Horizons

Approaches Now Extend Implications of Technology Beyond Manipulation of Genes
  • Richard A. Stein, M.D., Ph.D.

One of the most indispensable techniques in all areas of the life sciences is the use of expression vectors. Cloning a gene between two restriction sites and expressing the protein are almost always taken for granted. A frequently overlooked detail, however, is that when a promoter sits upstream of a multiple cloning site, several base pairs, or even entire restriction sites, can end up appended to the 5´-untranslated region of the gene of interest. These sequences are known to influence both prokaryotic and eukaryotic translation efficiency.

Hal S. Alper, Ph.D., assistant professor in the department of chemical engineering at the University of Texas at Austin, and colleagues, recently utilized a theoretical framework to predict the effect that a multiple cloning site has on translation efficiency, and the investigators subsequently applied this concept to redesign these sites and reduce variability associated with restriction site choice. “This is the first time that a performance-based analysis of multiple cloning sites has been conducted in yeast expression systems,” says Dr. Alper.

One of the most important ideas emerging from this work, recently published in Nucleic Acids Research, is that nucleotides placed between the promoter and the gene start site during cloning may shape translation efficiency, and this is a frequently overlooked aspect when using standardized expression vectors.

Moreover, the impact of the nucleotides is dependent on the promoter used, indicating an interaction in parts. “The idea in synthetic biology is building function from parts, and in some respects, the interactions between these individually characterized parts are often very context specific,” emphasizes Dr. Alper.

In addition to ensuring the successful expression of a protein of interest, this concept has even more profound implications. When the same gene is cloned between two different restriction sites within the same expression vector, the resulting phenotypes might be different because protein-expression levels are different.

“This concept, therefore, becomes important not only to optimize the expression of proteins from genes cloned into different restriction sites, but also for the ability to conduct genotype-phenotype associations,” explains Dr. Alper. Some of the new multiple cloning sites developed in this work are devoid of this site-to-site variation, which makes them functional and flexible multiple cloning sites for a variety of applications.
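To make the restriction-site effect concrete, the following minimal Python sketch shows how the choice of cloning site changes the stretch of vector sequence left between the promoter and the insert. The recognition sequences are the standard ones for these enzymes, but the site order, the vector layout, and the function name `leader_after_cloning` are hypothetical, and the model is deliberately simplified: everything up to and including the chosen 5´ site is assumed to stay upstream of the start codon. It is not the framework used by Dr. Alper's group.

```python
# Hypothetical multiple cloning site (MCS): the recognition sequences are
# real, but their order and the vector itself are invented for illustration.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
}
MCS = "".join(SITES.values())  # the promoter is assumed to sit just upstream


def leader_after_cloning(mcs: str, site_seq: str) -> str:
    """Return the MCS bases left between the promoter and the insert.

    Simplifying assumption: all bases up to and including the chosen
    5' restriction site remain upstream of the gene's start codon.
    """
    start = mcs.find(site_seq)
    if start < 0:
        raise ValueError("restriction site not present in the MCS")
    return mcs[: start + len(site_seq)]


# Length of the extra 5' leader for each possible cloning site.
leader_lengths = {
    name: len(leader_after_cloning(MCS, seq)) for name, seq in SITES.items()
}

for name, length in leader_lengths.items():
    print(f"{name}: {length} extra bases in the 5' UTR")
```

In this toy layout, the same gene cloned at the last site carries four times as much vector-derived leader as the same gene cloned at the first site, which is the kind of site-to-site difference the redesigned multiple cloning sites aim to remove.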

An issue of considerable concern in molecular biology is the manipulation of DNA fragments that are toxic or unstable, yet their successful cloning is essential to understand their biological function. “We recently developed a new system that enabled us to clone and express toxic genes that previously were unclonable in other expression systems,” says David Mead, Ph.D., president and CEO of Lucigen.

The expression vector pJAZZ is maintained as a linear plasmid. In addition, it contains transcriptional terminators on each side of the insert to minimize interference between the cloned genes and vector sequences.

Most notable of these unclonable sequences are repetitive DNA and highly AT-rich regions. Repetitive DNA is difficult to clone because repeats often recombine out of the vector. “In a circular vector, we have some evidence as to why repetitive regions are deleted. We believe that the torsional stress that exists in a circular plasmid causes the repetitive region to recombine out during replication and transcription,” explains Ronald Godiska, Ph.D., senior scientist at Lucigen.

This torsional stress is missing in the linear pJAZZ plasmid, and this ensures the stability of the repetitive DNA regions. “This is the most striking utility of the linear vector,” explains Dr. Godiska, lead author of a recent study that describes the research applications of this new cloning system. Illustrating the superiority of this system for applications that involve AT-rich regions, which are also unstable, Dr. Godiska and colleagues described the cloning and sequencing of the 3.1 Mb genome of the Gram-negative bacterium Flavobacterium columnare, which is 69% AT.
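A figure such as "69% AT-rich" is simply the fraction of A and T bases in a sequence. The short Python sketch below computes that fraction and flags windows at or above a chosen threshold. The function names, the window size, and the default threshold are illustrative assumptions and are not taken from the Lucigen study.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def at_rich_windows(seq: str, window: int = 100, threshold: float = 0.69):
    """Return (start, fraction) for each window with AT content >= threshold.

    The window size and threshold are arbitrary illustrative defaults.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        fraction = at_fraction(seq[start:start + window])
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


print(at_fraction("ATATGC"))                                # 4 of 6 bases are A or T
print(at_rich_windows("GGGGATAT", window=4, threshold=1.0))
```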

Another advantage of the linear cloning vector is that it eliminates the size bias. When a DNA fragment is incorporated into a circular vector, two ligation reactions occur—an intermolecular ligation that joins an end of the insert to the vector, followed by an intramolecular ligation that circularizes the molecule.

The efficiency of circularization decreases as the sizes of the vector or insert increase. In linear vectors, the cloning involves two independent ligation events, so this size bias is eliminated. “This represents another major advantage of the pJAZZ vector, and it allowed us to easily clone DNA fragments up to 20 kb without size bias,” explains Dr. Godiska.

“One of the things that we are trying to find out is whether we could use shorter homology regions between DNA fragments that undergo homologous recombination,” says Kumaran Narayanan, Ph.D., senior lecturer at the Monash University Sunway Campus, Malaysia, and adjunct assistant professor in the department of genetics and genomic sciences at Mount Sinai School of Medicine.

Homologous recombination in E. coli usually requires 40 to 50 base pairs of homology between the recombining DNA molecules to be detectable. “There is some evidence that 20 to 30 base pairs might be sufficient, and this would allow investigators to design shorter primers for recombination, facilitating more flexible changes to DNA and is more economical,” explains Dr. Narayanan.
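The practical consequence of shorter homology regions can be sketched in a few lines of Python. The example builds a pair of recombination primers, each consisting of a homology arm taken from the target followed by a priming region that anneals to the cassette being inserted. The function name, the 20-base priming region, and the toy sequences are assumptions made for illustration, not Dr. Narayanan's protocol.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence made of A, C, G, and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def recombination_primers(target, position, cassette, arm_len=50, priming_len=20):
    """Design a primer pair for inserting `cassette` at `position` in `target`.

    Each primer carries a homology arm of `arm_len` bases at its 5' end and
    a priming region of `priming_len` bases (an assumed value) at its 3' end.
    """
    if position < arm_len or position + arm_len > len(target):
        raise ValueError("not enough flanking sequence for the homology arms")
    if len(cassette) < priming_len:
        raise ValueError("cassette is shorter than the priming region")
    upstream_arm = target[position - arm_len:position]
    downstream_arm = target[position:position + arm_len]
    forward = upstream_arm + cassette[:priming_len]
    reverse = reverse_complement(cassette[-priming_len:] + downstream_arm)
    return forward, reverse


# Toy sequences, invented for illustration only.
target = "ACGT" * 50          # 200 bases
cassette = "GGTTCCAA" * 10    # 80 bases

for arm_len in (50, 25):
    fwd, rev = recombination_primers(target, 100, cassette, arm_len=arm_len)
    print(f"{arm_len}-base arms: primers of {len(fwd)} and {len(rev)} bases")
```

Halving the homology arm from 50 to 25 bases shortens each primer from 70 to 45 bases under these assumptions, which is where the savings in oligonucleotide synthesis would come from.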

Dr. Narayanan and colleagues recently optimized an E. coli homologous recombination system for the in vivo modification of DNA substrates, and further perfected it to allow the cloning of repetitive DNA regions, a form of DNA known to be inherently unstable in a recombination environment. This recombination protocol uses a 10–20 minute induction time to minimize DNA exposure to the recombination enzyme, and is ideal for functional studies that involve highly repetitive regions, trinucleotide repeats, and homologous genes.

An additional effort in Dr. Narayanan’s group focuses on generating linear chromosome vectors that have emerged as promising tools for gene delivery into cells. Dr. Narayanan and colleagues accomplished this by capping the ends of linearized DNA fragments with telomeres derived from bacteriophage N15 to provide protection from nuclease degradation and to enable DNA replication as a linear plasmid.

The investigators subsequently exemplified the strength of this technique by generating a linear 100 kb BAC that expressed the human β-globin gene in human host cells.

“This approach can deliver intact chromosomal loci containing their natural regulatory elements into mammalian cells, allowing temporal and spatial gene delivery from an artificial chromosome and could potentially allow us to introduce whole segments of chromosomes into cells,” says Dr. Narayanan.

Understanding Viral Pathogenesis

As recent years have revealed, new pathogens regularly emerge in the human population, and viral infections such as SARS and the swine-origin H1N1 influenza are still vividly remembered for their medical, social, and economic impact. Understanding viral pathogenesis represents an important task for years to come.

While viruses have relatively few genes, polymorphisms and sequence variations are abundant among different viruses and often have pathological significance. Thus, gaining insight into the viral genomes and capturing sequence variations represent challenging but important tasks with medical and public health significance.

“We used the Gateway® technology [Life Technologies], which is a fantastic system to express proteins in different expression vectors,” explains Pierre-Olivier Vidalain, Ph.D., project leader at the laboratory of viral genomics and vaccinations at the Pasteur Institute. One challenge with respect to viral sequences is that many viruses were first isolated in the 1950s or 1960s and have been propagated in the laboratory since then.

As a result, the sequence from a pathogenic strain might not always correspond to the wild-type virus any longer, and in some instances virulence in certain animal models might have been lost. These small sequence variations are important to understand viral biology and host-pathogen interaction.

“Another significant challenge that we sometimes have been facing is that several of these viruses generate quasi-species,” explains Dr. Vidalain. This means that when the genomes of hundreds of different viral particles are sequenced, each of them will contain a slightly different sequence, with specific mutations. Cloning the coding sequence of such a virus sometimes requires many different isolates to be sequenced, until the one corresponding to the average sequence from GenBank is obtained.
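The idea of picking the isolate closest to an "average" sequence can be illustrated with a small Python sketch. Here the consensus is computed by majority vote at each position across pre-aligned isolates of equal length, and it stands in for the GenBank reference mentioned above. The function names and the toy sequences are hypothetical and do not describe the ViralORFeome pipeline.

```python
from collections import Counter


def consensus(seqs):
    """Majority-vote consensus of equal-length, pre-aligned sequences."""
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_to_consensus(seqs):
    """Return the isolate with the fewest mismatches to the consensus."""
    reference = consensus(seqs)
    return min(seqs, key=lambda s: mismatches(s, reference))


# Toy isolates, invented for illustration only.
isolates = ["ATGAAA", "ATGAAC", "ATGTAA", "ATGAAA"]
print(consensus(isolates))
print(closest_to_consensus(isolates))
```

Real quasi-species data would first need alignment and would contain insertions and deletions, which this sketch ignores.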

“We have generated a database to manage this collection of viral sequences.” Dr. Vidalain and colleagues recently developed ViralORFeome 1.0, an open-access database with integrated bioinformatics tools that allows viral ORFs to be cloned by using the Gateway technology and makes them available for reverse proteomics experiments. This platform is particularly suitable to keep track of the diversity and sequence variation among viral strains.

Partly as a result of the enormous diversity and high mutation rate of viruses, each sequence obtained from a clinical sample may be different from sequences found in GenBank. “I think that this is one aspect where we are original and different from similar projects conducted in the past in terms of mass cloning of yeast, humans, or C. elegans sequences.”

While the main objective in Dr. Vidalain’s group is to study the viral proteins individually, this database has many other potential uses. One of them is the ability to compare, across a single virus family, versions of a specific protein that differ subtly, and thereby learn about its function.

For example, viruses that are closely related to Chikungunya, such as the O’nyong’nyong, Sindbis, and Venezuelan equine encephalitis viruses, induce quite different diseases, but the molecular mechanisms that explain these differences are still not fully understood. “We believe that although these viruses have closely related proteins, the functions of these proteins are sufficiently different so that the disease, at the end, is different,” says Dr. Vidalain.

Dr. Vidalain and colleagues developed the idea of comparative interactomics, a strategy that can be used to compare different viruses—for example, vaccine strains with wild-type strains, or oncogenic strains with nononcogenic ones.

“We think that by comparing the function, and in particular the ability to interact with cellular proteins, we could explain, at least in part, differences in pathogenesis,” reveals Dr. Vidalain. For example, research in Dr. Vidalain’s group revealed that both the type I interferon and the NF-κB signaling pathways are inhibited by the nsP2 protein from the Chikungunya and Semliki Forest viruses, but this was not the case for a laboratory-adapted Sindbis virus strain.

Furthermore, the investigators found that the V protein from mumps, measles, and Nipah viruses blocked IFN-β signaling, but this effect was not observed for the Tioman virus, suggesting that this virus has not yet adapted to humans. “Small variations from one species to another are sufficient to change pathogenicity,” explains Dr. Vidalain.

Virulence Factors

“The major topic that we have been working on, over the past few years, is the search for genes encoding virulence factors from the Burkholderia cepacia complex,” says Jorge H. Leitão, Ph.D., assistant professor at Instituto Superior Técnico in Lisbon, Portugal. The Burkholderia cepacia complex encompasses a group of over 17 closely related bacterial species that have emerged, in recent decades, as important opportunistic pathogens, particularly among individuals suffering from cystic fibrosis and in immunocompromised patients.

Dr. Leitão and colleagues have used a random mutagenesis strategy based on plasposons, which are a class of transposons that facilitate the recovery and identification of interrupted genes. This approach relies on the generation of mutant libraries, followed by the selection of mutants that are attenuated for virulence. The phenotype is subsequently tested in a Caenorhabditis elegans model of infection.

By using this strategy, researchers in Dr. Leitão’s group recently identified several new virulence factors, including a gene cluster involved in the bacterial exopolysaccharide biosynthesis and a pleiotropic regulator involved in stress resistance and virulence.

An important finding was that this pleiotropic regulator, Pbr, appeared to be involved in biofilm formation and in shaping the fitness of the pathogen under adverse environmental conditions, a finding that promises to unveil previously unknown facets of the host-pathogen interaction. More recently, this strategy led the researchers to the identification of two RNA chaperones, Hfq and Hfq2, which are believed to be major players in post-transcriptional regulation among bacteria of the Burkholderia cepacia complex.

These and many other examples reveal that developing new cloning strategies, and optimizing the existing ones, are emerging as increasingly important facets of applications such as constructing unbiased libraries, elucidating the biology of toxic or unstable genes, and understanding host-pathogen interaction.

As a technology with implications that extend far beyond the mere manipulation of genes and gene fragments, cloning assumes an active role in the process of experimental inquiry and becomes instrumental in addressing some of the most intriguing and challenging scientific questions.