The discovery of shared biological properties among independent variants of DNA sequences offers the opportunity to broaden understanding of the biological basis of disease and identify new therapeutic targets, according to a collaboration between the Perelman School of Medicine at the University of Pennsylvania, the University of Arizona Health Sciences, Vanderbilt University, and others. The group published their findings (“Integrative Genomics Analyses Unveil Downstream Biological Effectors of Disease-Specific Polymorphisms Buried in Intergenic Regions”) in npj Genomic Medicine.
Drugs can have variable effects on people depending on small natural differences in the sequence of DNA between individuals, i.e., their single-nucleotide polymorphisms (SNPs). Many SNPs have been associated with disease risk; for instance, a person with an A at a given location in the DNA sequence may have a higher risk of diabetes than someone with a G. However, these disease-related SNPs often reside in the so-called “dark matter” of the genome, which does not directly code for proteins but does include switches that control gene expression.
Over the last 10 years, researchers have conducted genome-wide association studies (GWAS) to map DNA variants across the genomes of thousands of individuals and find which variants are more frequent in people with a certain disease. For such common complex diseases as diabetes or cancer, GWAS have identified hundreds of such variants. However, GWAS have also found that many disease-associated variants do not alter the function of genes in an obvious way, which makes those variants difficult to interpret clinically.
Senior author Jason H. Moore, Ph.D., the Edward Rose Professor of Informatics and director of the Institute for Biomedical Informatics, and colleagues have developed a computational method to explore the downstream effects of variants associated with risk to reveal possible mechanisms of disease.
“Our results provide a 'roadmap' of disease mechanisms emerging from GWAS to identify candidate therapeutic targets,” Dr. Moore said.
In the current paper, the team demonstrated that variants associated with disease risk can affect such biological activities as gene expression and the function of proteins in cellular housekeeping machinery.
“Taking this all together, a more comprehensive picture of disease biology is emerging,” explained Dr. Moore. “This picture, up to now, has been blurry, especially when variants occurred between genes.”
The team used computational modeling of two million pairs of disease-associated SNPs drawn from three GWAS projects, as well as information from other genome databases that match a patient's individual genetic makeup to their outward symptoms. From this, they predicted 3,870 SNP pairs with a similar biological mechanism. These prioritized SNP pairs, which have overlapping messenger RNA targets or similar functions, were more likely to be associated with the same disease than with unrelated pathologies.
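To make the idea of "overlapping messenger RNA targets" concrete, the following minimal Python sketch scores SNP pairs by how much their downstream target sets overlap. It is an illustration only, not the team's method: the Jaccard index, the 0.5 threshold, the function names, and the SNP and gene labels are all assumptions introduced here, since the article does not describe the similarity measure the authors actually used.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prioritize_pairs(targets, threshold=0.5):
    """Return SNP pairs whose target sets overlap at or above `threshold`.

    `targets` maps a SNP label to the set of mRNA targets it is thought
    to affect. Both the labels and the threshold are hypothetical.
    """
    pairs = []
    for snp1, snp2 in combinations(sorted(targets), 2):
        score = jaccard(targets[snp1], targets[snp2])
        if score >= threshold:
            pairs.append((snp1, snp2, score))
    # Highest-overlap pairs first.
    return sorted(pairs, key=lambda p: -p[2])


# Made-up toy data: two SNPs share most of their targets, one shares none.
targets = {
    "snp_A": {"GENE1", "GENE2", "GENE3"},
    "snp_B": {"GENE2", "GENE3"},
    "snp_C": {"GENE9"},
}
prioritized = prioritize_pairs(targets)
```

In this toy input, only the pair (snp_A, snp_B) passes the threshold, because two of its three combined targets are shared. Applied to millions of candidate pairs, a filter of this kind would leave a much smaller prioritized set, which is the general shape of the result the article reports.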
Specifically, using a subset of the prioritized SNP pairs in independent studies of Alzheimer's disease, bladder cancer, and rheumatoid arthritis patient data, they showed that two variants can contribute to disease independently, but also interact genetically. “From this we determined that the precise combination of DNA variants in a patient may synergistically increase or antagonistically decrease one's relative risk of disease,” according to Dr. Moore.
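One common way to express "synergistic" versus "antagonistic" is to compare the relative risk observed when both variants are present with the risk expected if each variant acted independently. The Python sketch below uses a multiplicative scale for that comparison. This is an assumption made for illustration: the article does not say which scale or statistical test the authors used, and the relative-risk values are invented.

```python
def interaction_ratio(rr_a, rr_b, rr_both):
    """Observed joint relative risk divided by the product of the
    single-variant relative risks (multiplicative independence)."""
    return rr_both / (rr_a * rr_b)


def classify_interaction(rr_a, rr_b, rr_both, tol=1e-9):
    """Label a SNP pair by comparing the ratio with 1.

    ratio > 1: joint risk exceeds expectation (synergistic)
    ratio < 1: joint risk falls short of expectation (antagonistic)
    ratio = 1: consistent with independent effects
    """
    ratio = interaction_ratio(rr_a, rr_b, rr_both)
    if ratio > 1 + tol:
        return "synergistic"
    if ratio < 1 - tol:
        return "antagonistic"
    return "independent"


# Invented relative risks: variant A alone 1.5, variant B alone 2.0.
# Independence on the multiplicative scale predicts 1.5 * 2.0 = 3.0.
label_high = classify_interaction(1.5, 2.0, 4.5)  # joint risk above 3.0
label_low = classify_interaction(1.5, 2.0, 2.0)   # joint risk below 3.0
label_same = classify_interaction(1.5, 2.0, 3.0)  # joint risk equals 3.0
```

With these numbers, a joint relative risk of 4.5 is classified as synergistic, 2.0 as antagonistic, and 3.0 as independent. Real analyses would also need confidence intervals around each estimate before drawing such a conclusion.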
Using data sets from the Encyclopedia of DNA Elements (ENCODE), a project designed to discover the bits of DNA that have a biological function related to a gene, the team validated that the biological mechanisms of disease shared within the prioritized SNP pairs are frequently governed by matching transcription factor binding sites and by interactions between segments of chromatin that are seemingly far apart and unrelated.
Now, the team is refining their methods to identify genetic risk factors that were overlooked because earlier analyses ignored the biological context of their effects on common diseases.