Genomes assembled from shreds and patches may fail to represent all mutations, which include not only the single-nucleotide variants and small indels so ably detected by standard short-read sequencing methods, but also mutations that add, subtract, rearrange, or otherwise reconfigure genomic structures. Genomes more resplendent may be needed—genomes that are assembled from long reads. Such genomes, say scientists based at the University of California, Irvine, may embody relatively subtle structural variations that contribute so much to complex traits.
The scientists, led by J.J. Emerson, Ph.D., assistant professor of ecology and evolutionary biology at the Ayala School of Biological Sciences, assembled the first new reference-quality genome of Drosophila melanogaster since its initial sequencing. By comparing this new genome to the existing D. melanogaster assembly, they created a high-resolution structural variant map, which they analyzed to identify previously underappreciated genetic variation.
“For the first time in animals, we have assembled a high-quality genome, permitting the discovery of all the genetic differences between two individuals within a species,” said Mahul Chakraborty, Ph.D., a postdoctoral scholar in the Emerson laboratory. “We uncovered a vast amount of hidden genetic variation during our analyses, much of which affects important traits within the common fruit fly, D. melanogaster.”
Chakraborty is the first author of a new paper (“Hidden Genetic Variation Shapes the Structure of Functional Elements in Drosophila”) that appeared December 18 in the journal Nature Genetics. The paper describes how the Emerson team assembled a reference-quality genome for the D. melanogaster strain called A4 and compared it to an existing D. melanogaster genome, which represents the ISO1 strain.
“We assembled the new A4 genome using high-coverage (147×) long reads through single-molecule real-time sequencing of DNA extracted from females,” the authors reported. “The A4 assembly is more contiguous than release 6 of the ISO1 strain—which is arguably the best metazoan whole-genome sequence assembly—with 50% of the genome contained in contiguous sequences (contigs) 22.3 Mb in length or longer. As compared to the ISO1 assembly, the A4 assembly comprises far fewer sequences…while maintaining comparable completeness.”
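The contiguity figure the authors quote, the contig length at or above which half of the assembled bases sit, is commonly called the contig N50. As a rough illustration of how such a number is computed, here is a minimal Python sketch. The function name `n50` and the toy contig lengths are assumptions made up for this example; they are not the A4 or ISO1 data, and this is not the authors' pipeline.

```python
def n50(contig_lengths):
    """Return the contig N50 for a list of contig lengths.

    The N50 is the length L of the contig at which, walking from the
    longest contig downward, the running total first reaches at least
    half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig to the shortest.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

In this toy set the total is 300, and the two longest contigs (80 and 70) together reach 150, so the N50 is 70. A more contiguous assembly packs the same total into fewer, longer contigs, which pushes the N50 up.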
“By comparing this new genome to the existing D. melanogaster assembly, we created a structural variant map of unprecedented resolution and identified extensive genetic variation that has remained hidden until now,” the article’s authors wrote. “Many of these variants constitute candidates underlying phenotypic variation, including tandem duplications and a transposable element insertion that amplifies the expression of detoxification-related genes associated with nicotine resistance.”
Unlike standard approaches that rely on the same short-read sequencing technology that delivered the so-called $1,000 genome, the team's approach relies on single-molecule real-time sequencing from PacBio, which makes it possible to reconstruct a whole genome from reads that represent larger pieces of the genome. The use of such long-read sequencing enabled Chakraborty and Emerson to unravel complex changes that alter the structure of the genome.
“This study is the first of its kind in complex organisms like the fruit fly. With this unique resource in hand, we have already characterized several candidate structural variations that show evidence for phenotypic adaptation, which can function to drive species evolution,” noted Emerson.
In exploring how some of these newly identified structural genome changes might contribute to fruit fly evolution, the group was drawn to an enzyme family that has been associated with pesticide resistance and cold preference, among many other functions. They found that structural changes crank up the output of one of the genes 50-fold, suggesting how such flies attain increased nicotine resistance.
According to the researchers, the fact that so much variation escaped notice in D. melanogaster, a species with a relatively simple genome that is less likely to hide variation, suggests that our own genomes, and those of the species we eat, harbor an even larger store of medically and agriculturally important genetic variation.