Many diseases have been associated with common gene mutations, or single nucleotide polymorphisms (SNPs). But how well do SNPs or SNP combinations predict disease risk? Not very well at all, say scientists based at the University of Alberta. These scientists, led by David Wishart, PhD, the study’s senior author and a professor of biological sciences and computing science, suggest that disease risk could be better predicted by evaluating clinical, metabolite, or protein measures.
“It is becoming increasingly clear,” explained Wishart, “that the risks for getting most diseases arise from your metabolism, your environment, your lifestyle, or your exposure to various kinds of nutrients, chemicals, bacteria, or viruses.”
Notice that Wishart referred to most diseases. According to Wishart and colleagues, these include many cancers, diabetes, and Alzheimer’s. In fact, for such diseases, the genetic contribution to disease risk is just 5–10%. There are diseases, however, for which the genetic contribution is about 40–50%. These diseases include Crohn’s disease, celiac disease, and macular degeneration.
Wishart and colleagues derived these findings from the largest meta-analysis ever conducted. Basically, the scientists examined two decades of data from studies of the relationships between SNPs and different diseases and conditions. Their results appeared December 5 in the journal PLOS One, in an article titled, “Assessing the performance of genome-wide association studies for predicting disease risk.”
“One of the most commonly used approaches to assess the performance of predictive biomarkers is to determine the area under the receiver-operator characteristic curve (AUROC),” the article’s authors wrote. “We have developed an R package called G-WIZ to generate ROC curves and calculate the AUROC using summary-level GWAS data.”
G-WIZ is designed to make the most of summary-level data, which is the only genome-wide association study (GWAS) data that can be deposited in public repositories such as GWAS Central. The difficulty with summary-level data, however, is that it contains only study-wide averages, p-values, odds ratios, risk allele frequencies, and other summary statistics for the entire study population.
To get around this difficulty, G-WIZ combines population modeling with logistic regression (for risk prediction) to generate study-specific ROC curves and AUROC values. The idea is to enable the estimation of SNP heritability directly from summary-level GWAS data.
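To make the general idea concrete, here is a minimal Python sketch of how an AUROC might be estimated from summary statistics alone. It is not G-WIZ, which is an R package whose internals are not described here. The sketch assumes each SNP is reported with a risk allele frequency in controls and an allelic odds ratio, that genotypes follow Hardy-Weinberg proportions, and that SNPs are independent. It simulates cases and controls, scores each person with a sum of log odds ratios (standing in for a fitted logistic regression), and computes the AUROC from ranks. All function names and example numbers are hypothetical.

```python
import numpy as np


def case_allele_freq(control_freq, odds_ratio):
    """Risk allele frequency in cases implied by the control frequency
    and an allelic odds ratio (assumed summary statistics)."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def auroc(case_scores, control_scores):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    scores = np.concatenate([case_scores, control_scores])
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    # Average 1-based rank shared by all copies of each distinct score.
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_case, n_ctrl = len(case_scores), len(control_scores)
    u = ranks[:n_case].sum() - n_case * (n_case + 1) / 2.0
    return u / (n_case * n_ctrl)


def simulated_auroc(control_freqs, odds_ratios, n_per_group=20000, seed=0):
    """Simulate a case/control population from summary statistics and
    return the AUROC of a simple multi-SNP log-odds risk score."""
    rng = np.random.default_rng(seed)
    control_freqs = np.asarray(control_freqs, dtype=float)
    odds_ratios = np.asarray(odds_ratios, dtype=float)
    case_freqs = case_allele_freq(control_freqs, odds_ratios)
    shape = (n_per_group, len(control_freqs))
    # Genotype = number of risk alleles (0, 1, or 2) under Hardy-Weinberg.
    case_geno = rng.binomial(2, case_freqs, size=shape)
    ctrl_geno = rng.binomial(2, control_freqs, size=shape)
    # Weight each risk allele by its log odds ratio (an assumption of this sketch).
    weights = np.log(odds_ratios)
    return auroc(case_geno @ weights, ctrl_geno @ weights)


if __name__ == "__main__":
    # Ten hypothetical SNPs, each with a modest effect.
    print(simulated_auroc([0.3] * 10, [1.2] * 10))
```

With modest odds ratios like these, the simulated AUROC stays fairly close to 0.5, the value expected from random guessing.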
“[We] used the summary level GWAS data from GWAS Central to determine the ROC curves and AUROC values for 569 different GWA studies spanning 219 different conditions,” the article’s authors detailed. “Using these data, we found a small number of GWA studies with SNP-derived risk predictors that have very high AUROCs (>0.75). On the other hand, the average GWA study produces a multi-SNP risk predictor with an AUROC of 0.55.”
In general, the study showed that the links between most human diseases and genetics are shaky at best. These findings fly in the face of many modern gene testing business models, which suggest that gene testing can accurately predict someone’s risk for disease.
“Simply put, DNA is not your destiny, and SNPs are duds for disease prediction,” declared Wishart. “The bottom line is that if you want to have an accurate measure of your health, your propensity for disease, or what you can do about it, it’s better to measure your metabolites, your microbes, or your proteins—not your genes.”
“This research also highlights the need to understand our environment and the safety or quality of our food, air, and water,” he added.
Wishart and colleagues pointed out that unlike most GWA studies, clinical, proteomic, and metabolomic studies involving marker identification rarely achieve p-values of <10⁻⁶ and they infrequently report the performance of their markers in terms of odds ratios. Instead, they elaborated, most marker-based clinical, proteomic, or metabolomic studies tend to combine multiple clinical or omic measures to generate multimarker risk predictors.
“In these studies, multimarker sensitivity, specificity, ROC curves, and/or AUROCs are routinely reported,” the article’s authors explained. “One way for the different omics communities to better understand the predictive or discriminative ability of GWAS data is to convert the reported SNP data into a more conventional biomarker reporting format. In particular, if multi-SNP biomarker data could be combined and converted into ROC or AUROC data, then a more direct comparison could be performed regarding the performance of SNPs for predicting disease risks relative to clinical, metabolite or protein biomarkers for the same conditions.”