By sequencing and analyzing RNA, European scientists have measured gene activity among individuals whose full genome sequences had been published as part of the 1000 Genomes Project. The work effectively yields a map of the genetic basis of variation in human traits.
In a paper published September 15 in Nature, researchers from the University of Geneva (UNIGE) and eight other European institutes described the largest-ever dataset linking human genomes to gene activity at the level of RNA. The same researchers also published a companion study in Nature Biotechnology, exploring how multi-laboratory RNA sequencing projects could manage reproducibility issues.
The article on functional variation, “Transcriptome and genome sequencing uncovers functional variation in humans,” describes the sequencing and in-depth analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals. It reports widespread genetic variation affecting the regulation of most genes.
“The richness of genetic variation that affects the regulation of most of our genes surprised us,” says study coordinator Tuuli Lappalainen, previously at UNIGE and now at Stanford University. “It is important that we figure out the general laws of how the human genome works, rather than just delving into individual genes.”
Knowing which genetic variants are responsible for differences in gene activity among individuals can give powerful clues for the diagnosis, prognosis, and treatment of disease. Senior author Emmanouil Dermitzakis, Louis Jeantet Professor at UNIGE, emphasizes that the functional study has profound implications for genomic medicine: “Understanding the cellular effects of disease-predisposing variants helps us understand causal mechanisms of disease. This is essential for developing treatments in the future.”
In particular, as noted in the functional study, “[The] characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci.”
The functional study’s data are freely available through the ArrayExpress functional genomics archive. Open access to data and results allows independent researchers to explore and re-analyze the data in different ways.
The reproducibility study assessed variation between laboratories and found that the main laboratory effects were differences in insert size and GC content, both of which could be adequately corrected for. This study, “Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories,” concluded that distributing RNA sequencing among different laboratories is feasible, given proper standardization and randomization procedures.