Aug 1, 2013 (Vol. 33, No. 14)

Diving Deep with Array CGH

  • Evolutionary Applications

    TreeView representation of cDNAs that exhibit great ape or human lineage-specific (LS) aCGH signatures. Order of genes within each lineage is based on the average log2 fluorescence ratios (ordered highest to lowest) of the respective species. The dataset used for this figure was not collapsed by UniGene cluster, to minimize the chance that significant LS cDNAs would be missed. Fluorescence ratios are depicted using a pseudocolor scale (indicated). [University of Colorado Denver/Andrew Fortna, et al. PLoS Biol. 2004 July;2(7):e207.]

    Gene duplication is a major mechanism behind evolutionary change. With this in mind, Dr. Sikela, in collaboration with Dr. Jonathan Pollack at Stanford University, set out to find genes that have been highly duplicated along specific primate lineages in several species, including humans.

    “We used gene-based array CGH genome-wide to identify genes that are important for human and primate evolution,” says James M. Sikela, Ph.D., professor of biochemistry and molecular genetics at the University of Colorado Denver.

    This marked the first time genome-wide array CGH analysis was applied to perform a cross-species comparison of humans and other primates. “In the analysis that we performed, cDNAs corresponding to virtually all human genes had been deposited on the arrays, so this was a cDNA array, as compared to the typical oligonucleotide array. This had the advantage that we could survey the copy number of virtually all human genes in each experiment,” Dr. Sikela explains.

    This strategy revealed approximately 140 genes whose copy number changed specifically in the human lineage; these genes are present in more or fewer copies in humans than in nonhuman primates.

    One of the strongest signals came from a gene that encodes multiple copies of DUF1220, a protein domain of unknown function. Further analysis revealed that DUF1220 is present in many more copies in the human genome than in the genome of any other primate. Subsequently, Dr. Sikela and colleagues found that, in the brain, DUF1220 shows neuron-specific preferential expression in cerebellar Purkinje cell bodies and dendrites, in neurons from the cortical layers of the hippocampus, and in the neocortex, the region responsible for higher cognitive functions.

    More recent work from Dr. Sikela’s lab revealed a strong association between DUF1220 copy number and brain size across primates, with more copies associated with a larger brain. The same work implicated DUF1220 in certain human conditions characterized by brain-size pathology. DUF1220 copy-number variation has thus emerged as an important evolutionary factor that also contributes to physiological and pathological variation in brain size.

    “The DUF1220 story all began with our applying array CGH in a novel way. Array CGH certainly played a key role in revealing these findings,” says Dr. Sikela.

  • Experimental Design

    While array CGH has found broad applicability in clinical and research laboratories, the statistical models underlying experimental design and data analysis remain an area that has received relatively limited attention.

    One of the most frequently used protocols, the two-color labeling approach, involves separately labeling a sample of interest and a reference sample with two different fluorescent dyes. After the labeled samples are mixed and hybridized on the array, copy-number differences are calculated as the fluorescence ratio of the sample of interest to the reference sample. The additional observation that probe-specific dye biases may be a source of artifacts created the need for the dye-swap design, in which each sample is separately labeled with each of the two dyes. A significant drawback of the reference design is that half of the labeled samples are references, whose measurements are ultimately of no biological interest.
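    The ratio calculation and the dye-swap correction described above can be illustrated with a minimal sketch. It assumes that the dye bias acts as a constant multiplicative factor per probe (additive on the log2 scale), so that it cancels when the two labeling orientations are combined. The channel names Cy5 and Cy3 and the function names are illustrative assumptions, not taken from the studies discussed here.

```python
import math


def log2_ratio(sample_intensity, reference_intensity):
    """Log2 fluorescence ratio of sample to reference for a single probe."""
    return math.log2(sample_intensity / reference_intensity)


def dye_swap_log2_ratio(forward, swapped):
    """Combine the two orientations of a dye-swap pair for one probe.

    forward: (cy5, cy3) intensities with the sample in Cy5, reference in Cy3
    swapped: (cy5, cy3) intensities with the reference in Cy5, sample in Cy3

    Assumption: each array reports log2(Cy5/Cy3) = +/- true ratio + dye bias,
    so half the difference of the two arrays removes the bias term.
    """
    m_forward = math.log2(forward[0] / forward[1])  # +true ratio + bias
    m_swapped = math.log2(swapped[0] / swapped[1])  # -true ratio + bias
    return (m_forward - m_swapped) / 2


# Hypothetical probe: true sample/reference ratio of 1.5 (3 copies vs. 2),
# with Cy5 reading 20% too bright on both arrays.
forward = (1800.0, 1000.0)  # sample 1500 * 1.2 in Cy5, reference 1000 in Cy3
swapped = (1200.0, 1500.0)  # reference 1000 * 1.2 in Cy5, sample 1500 in Cy3
corrected = dye_swap_log2_ratio(forward, swapped)
```

    In this made-up example the uncorrected forward array alone would report log2(1.8), whereas the dye-swap average recovers log2(1.5), about 0.585.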

    “This approach also unnecessarily increases the cost of the study,” says Jeanette E. Eckel-Passow, Ph.D., associate professor of biostatistics at the Mayo Clinic. Dr. Eckel-Passow and colleagues recently provided experimental evidence supporting the use of an off-chip reference sample to provide a more cost-effective experimental design. This allows the sample size to be doubled, a particularly attractive option for genome-wide association studies, for which the statistical power is a crucial consideration.

    “In research settings, it is really important to have as much statistical power as possible, and in order to do that, it is necessary to analyze more samples,” Dr. Eckel-Passow says. In this approach, an averaged measurement of the reference sample, labeled with both fluorescent dyes, can serve as the reference for all clinical samples that are examined. “This would decrease the costs as well,” she adds.
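    A rough sketch of the off-chip reference idea follows. It is not the published procedure: it simply assumes that the reference sample has been measured in separate runs with each dye, that the runs are averaged per probe on the log2 scale, and that every clinical sample is then compared against that single averaged profile. All names and numbers are hypothetical.

```python
import math
from statistics import mean


def off_chip_reference(reference_runs):
    """Per-probe mean log2 intensity across reference runs.

    reference_runs: list of runs, each a list of probe intensities.
    Assumption: runs labeled with either dye are pooled into one average.
    """
    n_probes = len(reference_runs[0])
    return [
        mean(math.log2(run[i]) for run in reference_runs)
        for i in range(n_probes)
    ]


def log2_ratios(sample_intensities, reference_profile):
    """Log2 ratio of one clinical sample against the averaged reference."""
    return [
        math.log2(s) - r
        for s, r in zip(sample_intensities, reference_profile)
    ]


# Hypothetical reference measured once per dye, two probes each.
reference_profile = off_chip_reference([[800.0, 2000.0], [1250.0, 2000.0]])

# Both channels of an array can now carry clinical samples, because the
# reference is no longer hybridized alongside every sample.
patient_a = log2_ratios([1500.0, 2000.0], reference_profile)
patient_b = log2_ratios([1000.0, 1000.0], reference_profile)
```

    Because no array channel is spent on the reference, the same number of arrays covers twice as many clinical samples, which is the gain in sample size described above.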

  • Interpretation Infrastructure

    “The infrastructure that we built enables laboratories to store information on genetic variants found in genes associated with disease, along with their detailed interpretation,” says Samuel J. Aronson, executive director of IT at Partners HealthCare Center for Personalized Genetic Medicine. The center’s IT platform, GeneInsight, provides the infrastructure to manage the interpretation of genetic information used in patient care, and report the data to healthcare providers.

    Because of the dynamic nature of genetic knowledge, genetic variants of unknown significance at the time of testing may be discovered to be clinically relevant at a later point in time.

    “It is important to ensure that new information reaches treating clinicians as soon as possible after variants are reclassified. GeneInsight is designed to help laboratories meet this challenge,” Aronson explains.

    As part of the GeneInsight platform, treating clinicians receive alerts when previously identified and recorded genetic variants are updated in a manner that may be clinically meaningful. Historically, GeneInsight has focused on next-generation sequencing and single-nucleotide polymorphism data. “We are now deepening our support for structural and copy number variants,” Aronson says.

    Compared with other forms of genetic variation, characterizing copy-number variants presents different challenges, some of which stem even from the manner in which these chromosomal changes are defined and described. “Information technology is critically important to support the expanded clinical use of these data,” he adds.

    A major goal of GeneInsight is to facilitate testing processes that go beyond just capturing the state of knowledge at the moment a test is signed out. “There is no reason that clinicians should need to run a new physical test to get an updated interpretation for a previously conducted genetic assay,” Aronson says. “They should be automatically provided with an update whenever possible alerting them that our understanding of their patient’s genetic profile has evolved.”
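    The update-without-retesting idea can be sketched as a small data structure. This is a hypothetical illustration only and does not reflect GeneInsight's actual design or interfaces: it assumes a store that maps each variant to its current classification and remembers which clinician and patient each reported variant belongs to, so that a reclassification produces alerts instead of requiring a new assay.

```python
from dataclasses import dataclass, field


@dataclass
class VariantKnowledgeBase:
    """Hypothetical store of variant classifications and past reports."""

    classifications: dict = field(default_factory=dict)  # variant -> class
    reports: dict = field(default_factory=dict)  # variant -> {(clinician, patient)}

    def record_report(self, variant_id, clinician, patient_id):
        """Remember that a signed-out report mentioned this variant."""
        self.reports.setdefault(variant_id, set()).add((clinician, patient_id))

    def reclassify(self, variant_id, new_classification):
        """Update a classification and return the alerts it triggers.

        Each alert is (clinician, patient, variant, old class, new class).
        No alert is produced when the classification does not change.
        """
        old = self.classifications.get(variant_id)
        self.classifications[variant_id] = new_classification
        if old == new_classification:
            return []
        return [
            (clinician, patient_id, variant_id, old, new_classification)
            for clinician, patient_id in sorted(self.reports.get(variant_id, ()))
        ]


kb = VariantKnowledgeBase()
kb.reclassify("dup-example-1", "unknown significance")  # initial entry
kb.record_report("dup-example-1", "dr_lee", "patient-1")
alerts = kb.reclassify("dup-example-1", "pathogenic")
```

    Here `alerts` holds one entry for the treating clinician, with the old and new classification, and repeating the same reclassification yields no further alerts.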

    Copy-number variations, which occur in at least 12% of the human genome, are estimated to account collectively for more genomic diversity than all single-nucleotide polymorphisms combined. The development of array CGH unveiled a new facet of chromosomal biology, and marked a shift in understanding its link to development, health, disease, and evolution. Addressing some of the existing difficulties, including the bioinformatics, computational, and statistical challenges, will be instrumental in enabling this already established and widely used approach to undergo further development and refinement.
