The results of a large biobank study by Mount Sinai researchers could one day help clinicians better assess the true risk of disease associated with pathogenic variants. The study, headed by Ron Do, PhD, associate professor of genetics and genomic sciences and a member of The Charles Bronfman Institute for Personalized Medicine at Icahn Mount Sinai, analyzed the DNA sequences and electronic health records data of thousands of individuals stored in two biobanks.
Overall, they discovered that the chance a pathogenic genetic variant actually causes a disease is relatively low, about seven percent. Nonetheless, they also found that some variants, such as those associated with breast cancer, carry disease risks that span a wide range. The results, published in JAMA (“Population-Based Penetrance of Deleterious Clinical Variants”), could alter the way the risks associated with these variants are reported and may eventually help guide the way physicians interpret genetic testing results.
Imagine getting a positive result on a genetic test. The doctor tells you that you have a pathogenic genetic variant, effectively a DNA sequence that is known to raise the chances for getting a disease like breast cancer or diabetes. What exactly are those chances—10 percent? 50 percent? Or even 100 percent? Currently, that is not an easy question to answer.
Over the past 20 years, scientists have discovered hundreds of thousands of variants that could cause a variety of diseases. However, due to the nature of these discoveries, it has been difficult to estimate, or provide statistics on, the true risk of disease for each gene variant. So far, most estimates have been based on studies involving a small number of subjects, who were either part of a family with a history of a disease or were recruited at disease-specific clinics. But studies like these, which do not draw on large, randomly chosen populations, may overestimate the risk posed by variants.
And as the authors reported, while identification of pathogenic variants in disease predisposition genes, including 73 genes recommended by the American College of Medical Genetics & Genomics (ACMG73), informs clinical diagnosis and actions, “this genotype-first approach in medicine is feasible only if pathogenicity is known … A database of genetic variation, ClinVar, classifies variant pathogenicity (eg, pathogenic, benign),” they continued. “However, most variants have uncertain clinical significance and misclassified variants have inflated pathogenicity.”
For their newly reported study, researchers at the Icahn School of Medicine at Mount Sinai set out to evaluate the population-based disease risk of clinical variants in known disease predisposition genes. “A major goal of this study was to produce helpful, advanced statistics which quantitatively assess the impact that known disease-causing genetic variants may have on an individual’s risk to disease,” said Do.
The team searched large-scale DNA sequencing data of 72,434 individuals in two major biobanks for 37,780 known variants, and then scanned each individual’s health records for a corresponding disease diagnosis. The extensive search involved 29,039 participants in Mount Sinai’s BioMe® Biobank program and 43,395 participants who were part of the UK Biobank.
The study was led by Iain S. Forrest, an MD-PhD candidate in Do’s lab, who found inspiration from prior clinical experience he had as part of a post-baccalaureate fellowship at the National Institutes of Health. “The idea for the study came out of a brainstorming session,” said Forrest. “Dr. Do and I discussed the need to have a better system for classifying disease risk. Currently, variants are categorized by broad labels such as ‘pathogenic’ or ‘benign.’ As I learned in the clinic, there’s a lot of grey area with these labels. That’s when we realized that the biobanks which link DNA sequence data to electronic health records are an unparalleled opportunity to address this need.”
Initial results from the team’s analysis showed that 157 diseases in their data set could be linked to 5,360 variants that were defined either as “pathogenic” by ClinVar, an NIH-supported public library, or as “loss-of-function,” as predicted by bioinformatic algorithms. On average, the penetrance (the chance that a variant was linked to a disease diagnosis) was low, specifically 6.9%. Likewise, the average risk difference, which describes the increase in disease risk for an individual who has the variant over an individual who does not have it, was also low.
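To make the two measures concrete, here is a minimal Python sketch based only on the definitions given above: penetrance as the share of variant carriers with a matching diagnosis, and risk difference as that share minus the corresponding share among non-carriers. The class name, function names, and all counts are hypothetical and chosen purely for illustration; they are not taken from the study, and the authors' actual statistical methods may differ.

```python
from dataclasses import dataclass


@dataclass
class VariantCounts:
    """Hypothetical tallies for one variant-disease pair in a biobank."""
    carriers_with_diagnosis: int
    carriers_total: int
    noncarriers_with_diagnosis: int
    noncarriers_total: int


def penetrance(counts: VariantCounts) -> float:
    """Share of carriers whose health record shows the linked diagnosis."""
    if counts.carriers_total <= 0:
        raise ValueError("need at least one carrier")
    return counts.carriers_with_diagnosis / counts.carriers_total


def risk_difference(counts: VariantCounts) -> float:
    """Disease risk in carriers minus disease risk in non-carriers."""
    if counts.noncarriers_total <= 0:
        raise ValueError("need at least one non-carrier")
    baseline = counts.noncarriers_with_diagnosis / counts.noncarriers_total
    return penetrance(counts) - baseline


# Made-up numbers for illustration only: 7 of 100 carriers and
# 300 of 10,000 non-carriers have the diagnosis on record.
example = VariantCounts(7, 100, 300, 10_000)
print(f"penetrance: {penetrance(example):.1%}")
print(f"risk difference: {risk_difference(example):.1%}")
```

In this invented example, the penetrance is 7% while the risk difference is 4 percentage points, because 3% of non-carriers also have the diagnosis. The gap between the two figures shows why a variant's penetrance alone can overstate how much extra risk it adds.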
“In this comprehensive assessment of variant penetrance in 72,434 individuals from two biobanks, most pathogenic variants in the population had low disease risk, consistent with overestimated disease risk for variants reported as pathogenic,” the authors wrote. “In these biobanks, the estimated penetrance of pathogenic/loss-of-function variants varied, but was generally associated with a small increase in the risk of disease.”
“At first I was quite surprised by the results. The risks we discovered were lower than I expected,” said Do. “These results raise questions about how we should be classifying the risks of these variants.” Alongside these findings, the study also indicated that the risks associated with some genetic variants remained high. For example, pathogenic variants of the breast cancer genes BRCA1 and BRCA2 both averaged 38% penetrance, with individual variants ranging from zero to 100%.
Further results demonstrated other advantages of using biobank data. In one example, the researchers were able to calculate the risks of individual variants that are associated with age-related disorders, such as some forms of type 2 diabetes, breast cancer, and prostate cancer. On average, the penetrance of these variants was about 10% for individuals over 70 years of age, whereas it was about 8% for those who were older than 20 years.
The team also found that the presence of some variants could depend on an individual’s ethnicity, and they identified more than 100 variants that are specifically found in individuals of non-European descent. The authors acknowledged several potential ways the study itself could have under- or overestimated the risks reported.
Nevertheless, Do said, “While more research is needed to be done, we feel that this study is a good first step towards eventually providing doctors and patients with the accurate and nuanced information they need to make more precise diagnoses.” The authors concluded, “These findings raise the question of whether variants reported as pathogenic but empirically shown to have low penetrance should be classified differently or whether categorical systems of disease risk (pathogenic vs nonpathogenic) should be complemented with a quantitative system of penetrance … Further research of population-based penetrance is needed to refine variant interpretation and clinical evaluation of individuals with these variant alleles.”