Divide, Skim, and Conquer
Ending up with only 1% of the genome has the obvious advantage of needing to sequence and analyze that much less DNA.
David Edwards, Ph.D., professor at the University of Queensland, used different means to address the challenges of the large and complex wheat genome, which is six times larger than the human genome. “It’s 80–90% repetitive. It contains three genomes—it’s hexaploid (as opposed to humans that are diploid). You can imagine the challenge of trying to sequence something like that!” he exclaimed.
One of Dr. Edwards’ collaborators isolates individual chromosome arms in microgram quantities, which breaks this complex genome into manageable pieces and simplifies the assembly of sequence data. “We’ve sequenced both arms of chromosome 7 from each of the three genomes now, and each one is the size of a rice genome,” he said.
Another way they reduce complexity is by using a skim-based genotyping-by-sequencing method: sequencing at very low coverage and calling a SNP wherever a read matches a known polymorphism on the reference genome. “The advantage is that it’s essentially dial-able; you only need a very small amount of sequence data if you’re doing trait association,” he explained. “You only need very low coverage and it’s very cheap.”
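The idea of calling genotypes only at already-known polymorphic sites can be sketched in a few lines of Python. This is an illustrative sketch, not the group's pipeline: the function name `call_skim_genotypes`, the input layout (a dictionary of known SNP positions with the allele carried by each of two parents, plus a list of position/base observations taken from aligned reads), and the labels `P1`, `P2`, and `mixed` are all assumptions made for the example.

```python
from collections import Counter


def call_skim_genotypes(known_snps, observations):
    """Call parental alleles at known SNP positions from sparse read data.

    known_snps:   dict mapping reference position -> (parent1_allele, parent2_allele)
    observations: iterable of (position, base) pairs taken from aligned reads
    Returns a dict mapping each known position to "P1", "P2", "mixed",
    or None when no read supports either parental allele.
    """
    counts = {pos: Counter() for pos in known_snps}
    for pos, base in observations:
        # Reads that do not overlap a known polymorphism are simply ignored.
        if pos in counts:
            counts[pos][base] += 1

    calls = {}
    for pos, (p1_allele, p2_allele) in known_snps.items():
        n1 = counts[pos][p1_allele]
        n2 = counts[pos][p2_allele]
        if n1 == 0 and n2 == 0:
            calls[pos] = None      # no usable coverage at this site
        elif n2 == 0:
            calls[pos] = "P1"
        elif n1 == 0:
            calls[pos] = "P2"
        else:
            calls[pos] = "mixed"   # both parental alleles observed
    return calls


if __name__ == "__main__":
    snps = {100: ("A", "G"), 250: ("C", "T"), 400: ("G", "A")}
    reads = [(100, "A"), (250, "T"), (250, "T"), (999, "C")]
    print(call_skim_genotypes(snps, reads))
```

Because a site with no reads is left uncalled rather than guessed, adding more sequence data fills in more sites without changing the procedure, which is one way to read the "dial-able" property described above.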
Increasing the amount of data generated, on the other hand, yields a very high density of SNP genotypes, more than 3 million SNPs on chromosome 7 alone, which allows the group to examine the reference genome itself and compare haplotype blocks. If the haplotype block breaks down, part of the genome may have been mis-assembled or rearranged.
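One simple way to picture a haplotype block "breaking down" is to look for places along the reference where many individuals switch parental haplotype between neighboring SNPs at once. The sketch below assumes per-individual calls ordered by reference position and labeled `P1`, `P2`, or `None` for uncalled sites; the function names and the 0.5 threshold are hypothetical choices for illustration, not the method used by Dr. Edwards' group.

```python
def switch_fractions(genotypes):
    """For each pair of adjacent SNPs, return the fraction of individuals
    whose parental call changes between them.

    genotypes: list of per-individual lists of calls ("P1", "P2", or None),
               all the same length and ordered by reference position.
    Pairs where either call is missing are skipped; an interval with no
    informative individuals yields None.
    """
    n_snps = len(genotypes[0])
    fractions = []
    for i in range(n_snps - 1):
        informative = 0
        switches = 0
        for calls in genotypes:
            left, right = calls[i], calls[i + 1]
            if left is None or right is None:
                continue
            informative += 1
            if left != right:
                switches += 1
        fractions.append(switches / informative if informative else None)
    return fractions


def flag_breakpoints(genotypes, threshold=0.5):
    """Return indices i where the interval between SNP i and SNP i+1 shows a
    switch fraction at or above the (assumed) threshold."""
    return [
        i
        for i, fraction in enumerate(switch_fractions(genotypes))
        if fraction is not None and fraction >= threshold
    ]


if __name__ == "__main__":
    population = [
        ["P1", "P1", "P2", "P2"],
        ["P2", "P2", "P1", "P1"],
        ["P1", "P1", "P1", "P1"],
        ["P2", None, "P1", "P1"],
    ]
    print(switch_fractions(population))
    print(flag_breakpoints(population))
```

A genuine recombination event would be expected to affect only a few individuals at any one interval, so an interval where a large share of the population switches together is a candidate for closer inspection of the assembly.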
“So we’re using this high-density skim sequencing—that’s almost a contradiction in terms—to go through and validate and to fix genome assemblies,” Dr. Edwards said. “We’ve also used it for trait mapping, which is very straightforward and provides a physical rather than a genetic location, useful to identify candidate genes.”
They looked at the relationship among the three genomes (termed A, B, and D) to determine the impact that early farmers had on bread wheat. Most genes are conserved among the three genomes, and the differential gene loss that was found supported current theories of wheat’s evolution and domestication. The A and B genomes came together about 50,000 years ago. “The gene networks that are lost relate to it being grown in the wild,” Dr. Edwards explained.
The tetraploid wheat was then domesticated and dispersed by migration up through the Middle East into what is now southern Turkey, where it came into contact with D genome wheat about 10,000 years ago. “This was presumably growing in the same field as the domesticated tetraploid, and that formed a hybrid hexaploid wheat that became the bread wheat we eat today,” he said. “The types of genes and the gene networks that are lost are really quite different, and this tells us that the new bread wheat was under very different selective pressure.”