Despite recent advances in our understanding of the genome, there are still large gaps in our knowledge of the basic biology involved in gene expression. A particularly important question remains: what exactly causes a cell to make a protein?
The mechanisms thought to be involved are similar across all types of organism, including humans, and involve hundreds of regulatory proteins with multiple targets. We could improve our understanding of the mechanisms of transcription by identifying all of the different regulatory proteins' targets. Bacteria are ideal model systems for such studies because of the way they rapidly adapt to, and survive, changes in their environment.
Traditionally, transcription factor actions have been investigated by simply working through the genes one by one, but techniques in genomics, proteomics, and transcriptomics now make it possible to look at all targets simultaneously.
Transcriptomics involves comparing all the RNA transcripts made in a cell that contains the transcription factor of interest, with those made in a cell that does not have this factor. However, such studies are not very satisfactory and tend to result in extremely long lists of findings that are difficult to interpret.
Moreover, this approach cannot distinguish the direct effects of a transcription factor from indirect or downstream effects, which arise because the factor is itself likely to control other transcription factors. Transcriptomics effectively examines the consequences of having a given transcription factor in a cell, rather than determining which genes the transcription factor binds to and regulates.
Of the hundreds of transcription factors found within a given cell, a small number of master regulators are known to control expression of the majority of genes. Until recently, the obvious approach to investigating regulation of transcription involved comparing transcripts made in cells that contain different master regulators.
However, each of these master regulators binds to hundreds of targets within a given cell so, even with the arrival of transcriptomics, the method is not only highly laborious but cannot distinguish between direct and indirect effects of the master regulators.
An alternative method has now been used to investigate transcription factors and obtain more conclusive results. An established procedure, the chromatin immunoprecipitation (ChIP) assay, has been combined in a modern twist with microarray techniques to study all of the physical binding locations of a transcription factor on a living cell's DNA at the moment of the experiment, with high precision.
This ChIP-on-chip, or ChIP-chip, technique has great potential to provide much of the fine molecular detail that determines how a protein, upon binding to its target, may affect gene expression.
This article describes a novel combination of ChIP assays with bespoke, high density microarrays to locate master regulators' binding sites in bacterial cells with high precision.
ChIP is a well-established and reliable method used to identify the DNA binding sites of a protein of interest. Cells are grown, and any protein attached to the genomic DNA is cross-linked to it using formaldehyde. The cells are then lysed, and the cross-linked nucleoprotein is extracted and fragmented by sonication so that the average DNA fragment is 500 bp.
Antibodies directed against the protein of interest are then used to select protein cross-linked DNA fragments by immunoprecipitation, and these may then be identified, traditionally using PCR-based techniques.
Immunoprecipitates generated with antibodies against a specific transcription factor will contain DNA fragments corresponding to all of the DNA targets to which the transcription factor is bound at the moment of cross-linking. Therefore, ChIP assays can be used to investigate the range of different binding sites of proteins throughout genomes as well as reporting binding of proteins to specific chromosomal targets.
The new method described here uses microarray technology, developed by OGT Services (Oxford, U.K.), to analyze the full range of DNA fragments in the immunoprecipitates. OGT has developed high density, customized oligonucleotide microarrays that look at the distribution of the regulators in the cell with high precision and in a cost-effective way.
The array's design allows the analysis of small amounts of immunoprecipitated DNA without amplification and owes its increased sensitivity to the use of large probe size, 60mer oligonucleotides.
OGT fabricates the oligonucleotide microarrays using ink jet in situ synthesis, which allows DNA to be synthesized on a substrate with accuracy and precision. In the experiment described here, we used the Escherichia coli strain MG1655 and a melR-deleted derivative as an example to look more closely at this technique.
Locating the DNA Site
The E. coli melAB genes encode proteins that are necessary for transport and metabolism of the disaccharide melibiose. Expression of these genes is dependent on the transcription activator, MelR, which is encoded by the adjacent melR gene.
Previous studies have shown that transcription from the melAB promoter is activated by MelR and have focused on using biochemistry to understand the mechanism of activation.
OGT's probe design pipeline was used to fabricate the E. coli genomic DNA 22,000 feature microarray. The technology optimizes selection for base composition and minimal homology within the E. coli genome. Probes were selected with an average spacing of one probe per 230 base pairs throughout the MG1655 genome and were all 60 bases in length.
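The probe spacing quoted above can be checked with a little arithmetic. The following is a minimal sketch only, not OGT's design pipeline: it places 60-base probes at a fixed 230 bp interval and ignores the base composition and homology optimization described in the text. The function name and the genome length used in the usage note (about 4.64 Mb for MG1655) are assumptions made for illustration.

```python
def tile_probe_starts(genome_length, spacing=230, probe_length=60):
    """Return 0-based start coordinates of evenly spaced probes.

    Hypothetical helper: a real design pipeline would shift or drop
    probes to balance base composition and avoid repeated sequence.
    """
    if genome_length < probe_length:
        return []
    if spacing <= 0 or probe_length <= 0:
        raise ValueError("spacing and probe_length must be positive")
    # The last probe must end inside the genome.
    last_start = genome_length - probe_length
    return list(range(0, last_start + 1, spacing))


if __name__ == "__main__":
    # Assumed genome length for E. coli MG1655 (approximately 4.64 Mb).
    starts = tile_probe_starts(4_639_675)
    print(len(starts), "probes at a fixed 230 bp interval")
```

Under these assumptions a fixed 230 bp interval yields roughly 20,000 probes, which fits within a 22,000 feature array.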
The position of the MelR binding site (located from bases 4339356 to 4339373 of the E. coli genome) has been previously characterized using Affymetrix ChIP microarrays (1). This method required the use of PCR amplification, which we believe could introduce a bias, particularly with more complex mixtures of ChIP DNA in which other transcription factors that bind to more than one site in the genome, e.g., global regulators, may be present.
In this study, ChIP was carried out without PCR amplification (1). Briefly, in vivo cross-linking of bacterial nucleoprotein was initiated by adding 1% formaldehyde to cultures and, after 20 minutes, cross-linking was quenched by adding 0.5 M glycine. Cells were harvested by centrifugation, washed, and resuspended in lysis buffer at 37°C for 30 minutes.
Following lysis, immunoprecipitation buffer was added and cellular DNA was sheared by sonication to an average size of 500 to 1,000 bp. Samples were incubated with protein A/G beads and immunoprecipitated using antibodies to MelR; immunoprecipitation experiments without antibodies were also conducted as negative controls. Immunoprecipitated complexes were removed from the beads and the cross-links were reversed.
The ChIP DNA was labeled by Klenow random priming, incorporating Cy3-dCTP or Cy5-dCTP (GE Healthcare). ChIP DNA extracted from wild type strain MG1655 was labeled with Cy3, while ChIP DNA isolated from a melR deleted strain was labeled with Cy5.
The labeled DNA was then spun through an Autoseq column (GE Healthcare). Hybridization was carried out in an Agilent microarray chamber using OGT buffer at 55°C over 60 hours. For data collection, the arrays were scanned on an Agilent microarray scanner and signal quantitation of the spots was carried out using version 7.1 of the Image Analysis and Feature extraction software from Agilent; the background was subtracted using local background.
The Cy3/Cy5 intensity ratio (wild type signal over melR-deleted signal) was calculated for each spot and plotted against the corresponding position on the E. coli MG1655 chromosome. Data were visualized using a ChIP browser developed in-house, which displays the microarray data according to the position of each probe relative to the genes on the chromosome (Figure 1).
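The per-spot ratio calculation can be sketched as follows. This is an illustrative example under stated assumptions, not the analysis code used in the study: it takes background-subtracted intensities for the two channels, divides the wild type (Cy3) signal by the melR-deleted (Cy5) signal so that enrichment in the wild type gives values above 1, and reports the ratio on a log2 scale. The function name and the `floor` parameter, which guards against zero or negative values after background subtraction, are hypothetical.

```python
import math


def log2_ratios(wild_type, deletion, floor=1.0):
    """Per-spot log2(wild type / deletion) from background-subtracted signals.

    `floor` is an assumed safeguard: intensities at or below zero after
    background subtraction are raised to this value before dividing.
    """
    if len(wild_type) != len(deletion):
        raise ValueError("both channels must have one value per spot")
    return [
        math.log2(max(wt, floor) / max(dl, floor))
        for wt, dl in zip(wild_type, deletion)
    ]


if __name__ == "__main__":
    cy3 = [400.0, 100.0, 0.0]   # wild type channel (made-up values)
    cy5 = [100.0, 100.0, 50.0]  # melR-deleted channel (made-up values)
    print(log2_ratios(cy3, cy5))
```

Working on a log scale makes enrichment and depletion symmetric around zero, which is convenient when the ratios are plotted against chromosome position.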
Several probes on the arrays corresponding to an E. coli genomic position between 4338913 and 4339725 had a Cy3/Cy5 ratio significantly above background (Figure 2), indicating that MelR binds to this region of the genomic DNA, which is the position of the known E. coli MelR binding site.
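The article does not state how "significantly above background" was determined. One common heuristic, shown here purely as an assumption, is to flag probes whose log ratio exceeds the median by several robust standard deviations (estimated from the median absolute deviation) and then merge neighbouring flagged probes into regions. All names, the cutoff `k`, and the merge distance `max_gap` are hypothetical choices for illustration.

```python
import statistics


def enriched_regions(positions, log_ratios, k=3.0, max_gap=500):
    """Return (start, end) probe-position spans with ratios above background.

    Background is modelled as median + k * 1.4826 * MAD of all log ratios;
    flagged probes no more than `max_gap` bp apart are merged into one region.
    """
    if len(positions) != len(log_ratios):
        raise ValueError("one log ratio is required per probe position")
    if not log_ratios:
        return []
    med = statistics.median(log_ratios)
    mad = statistics.median([abs(r - med) for r in log_ratios])
    threshold = med + k * 1.4826 * mad
    hits = sorted(p for p, r in zip(positions, log_ratios) if r > threshold)
    regions = []
    for p in hits:
        if regions and p - regions[-1][1] <= max_gap:
            regions[-1][1] = p          # extend the current region
        else:
            regions.append([p, p])      # start a new region
    return [tuple(r) for r in regions]


if __name__ == "__main__":
    # Made-up data: ten probes at 230 bp spacing, three of them enriched.
    pos = [0, 230, 460, 690, 920, 1150, 1380, 1610, 1840, 2070]
    lr = [0.0, 0.1, -0.1, 0.05, 2.0, 2.5, 1.8, -0.05, 0.0, 0.1]
    print(enriched_regions(pos, lr))
```

Because several adjacent probes overlap each sheared fragment, a true binding site is expected to appear as a short run of enriched probes rather than a single spot, which is why neighbouring hits are merged.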
This study demonstrates that analyzing ChIP assays with customized high density microarrays (ChIP-chip or ChIP-on-chip) can successfully monitor the location of MelR, an E. coli transcription activator.
We chose to use OGT's high density arrays because each array has 22,000 probes representing sections of the bacterial genome that are just 100 to 200 base pairs apart, enabling the entire genome to be covered on a single slide. This meant that MelR transcription factor binding sites could be investigated with much higher precision than was previously possible, and it opens the way for the binding sites of master transcription regulators to be easily identified.
The increased sensitivity of the array is due to the use of a large probe size, 60mer oligonucleotides (2), allowing the analysis of small amounts of immunoprecipitated DNA without the need for amplification. Development of an in-house ChIP browser enables the microarray data generated to be viewed in relation to their relative gene positions, reducing the time taken to analyze the data.
The ChIP assay has been widely used to study interactions between eukaryotic proteins and their DNA targets for a number of years, but its use in bacteria has been limited until recently. Here we have shown that ChIP analysis can be extended to study the global distribution of a transcription factor; ChIP-on-chip provides a direct method to catalogue binding targets independently of their consequences on gene expression.
Amazingly, there are still 1,700 genes in E. coli whose functions are unknown, even though this is probably the best-characterized species of bacteria. Of these unidentified genes, 100 are estimated to be transcription factors.
Currently, ChIP-on-chip studies are being extended to examine multiple different strains of E. coli and should vastly improve our understanding of why some strains are so much more harmful than others; the studies may also reveal potential therapeutic or bactericidal targets against these strains.