An international research team has generated the first truly complete sequence of a human Y chromosome, the final human chromosome to be fully sequenced. The new sequence, which fills in gaps across more than 50% of the Y chromosome’s length, uncovers important genomic features with implications for fertility, including factors involved in sperm production. The study was led by the Telomere-to-Telomere (T2T) Consortium.
The work is published in Nature in the paper “The complete sequence of a human Y chromosome.”
The Y chromosome is unusually repetitive, making it particularly difficult to sequence and assemble; it contains a complex repeat structure that includes long palindromes, tandem repeats, and segmental duplications. This paper reveals the complete 62,460,029-base-pair sequence of a human Y chromosome. The new assembly, T2T-Y, corrects multiple errors in the previous reference (GRCh38-Y) and adds more than 30 million base pairs of sequence to it, the authors noted.
“The biggest surprise was how organized the repeats are,” said Adam Phillippy, PhD, senior investigator at the National Human Genome Research Institute (NHGRI) and leader of the consortium. “We didn’t know what exactly made up the missing sequence. It could have been very chaotic, but instead, nearly half of the chromosome is made of alternating blocks of two specific repeating sequences known as satellite DNA. It makes a beautiful, quilt-like pattern.”
The complete Y chromosome sequence also reveals important features of medically relevant regions. One such section of the Y chromosome is the azoospermia factor region, a stretch of DNA containing several genes known to be involved in sperm production. With the newly completed sequence, the researchers studied the structure of a set of inverted repeats, or “palindromes,” in the azoospermia factor region.
“This structure is very important because occasionally these palindromes can create loops of DNA,” said Arang Rhie, PhD, NHGRI staff scientist. “Sometimes, these loops accidentally get cut off and create deletions in the genome.”
Deletions in the azoospermia factor region are known to disrupt sperm production, so these palindromes could influence fertility. With a complete Y chromosome sequence, researchers can now more precisely analyze these deletions and their effects on sperm production.
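For readers unfamiliar with the term, a DNA “palindrome” is a stretch of sequence that matches its own reverse complement, meaning it reads the same on both strands; an inverted repeat is a sequence "arm" followed, sometimes after a spacer, by the reverse complement of that arm. The Python sketch below illustrates the idea on toy sequences only. The function names and example sequences are hypothetical and are not taken from the T2T analysis, and real Y chromosome palindromes are far longer than these examples.

```python
# Toy illustration only: function names and sequences are hypothetical,
# not part of the T2T Consortium's methods.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, read in the same 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if seq is identical to its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def make_inverted_repeat(arm: str, spacer: str = "") -> str:
    """Build arm + spacer + reverse complement of arm."""
    return arm + spacer + reverse_complement(arm)


print(is_palindrome("GAATTC"))               # True
print(make_inverted_repeat("ACCTG", "TTT"))  # ACCTGTTTCAGGT
```

In this sketch, an inverted repeat with no spacer is a perfect palindrome, while one with a non-palindromic spacer is not, although its two arms can still pair with each other.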
The team revealed the structures of sperm-regulating gene families and discovered 41 additional genes in the Y chromosome. They also unveiled the structures of genes thought to play significant roles in the growth and functioning of the male reproductive system.
The researchers also focused on TSPY, another gene thought to be involved in sperm production. Copies of TSPY are organized in the second-largest gene array in the human genome. Like other repetitive regions, arrays of repeated genes are challenging to analyze, so while TSPY was known to exist in many copies, the specific DNA sequence and organization of this array were previously unknown. Analyzing this region, the researchers found that different individuals carried between 10 and 40 copies of TSPY.
“We completed the wiring diagram for all these genetic switches that get activated via the Y chromosome, many of which are critical to the genetic contributions to male development,” said Michael Schatz, PhD, professor of computer science, biology, and oncology at Johns Hopkins. “We are at a point where scientists can start using this map. We were previously blind to different parts of the genome and different mutations, but now that we can see the whole genome, we hope we can add new insights to the genetics of a lot of different diseases.”
In addition to the complete Y chromosome sequence, the Human Genome Structural Variation Consortium reported the sequences of 43 diverse human Y chromosomes in the same issue of Nature. These advances complement the gapless human genome sequence released by the T2T Consortium in 2022, as well as the “pangenome” released in May 2023 by the Human Pangenome Reference Consortium.