Researchers at the University of Edinburgh and the European Bioinformatics Institute at the European Molecular Biology Laboratory (EMBL-EBI) have developed diagnostic genome-wide sequence analysis software designed to improve the evaluation of genetically heterogeneous clinical presentations. Findings from the use of their platform, published today in Nature Communications in an article titled “Flexible and scalable diagnostic filtering of genomic variants using G2P with Ensembl VEP,” show that the approach is efficient, flexible, and scalable.
“We have developed this software to help improve access to safe, speedy, and accurate diagnosis of serious genetic disease throughout the world,” notes senior study investigator David Fitzpatrick, PhD, a professor in the Medical Research Council Institute of Genetics and Molecular Medicine at the University of Edinburgh.
The new tool can pinpoint disease-causing genetic changes among the three billion base pairs of DNA code that make up the human genome. Moreover, the development will make it easier to integrate genetic testing into health care systems such as the U.K.’s National Health Service, which cares for around three million people affected by genetic diseases.
The new software works by linking to a database of clinical information from people with genetic diseases to pinpoint DNA changes that are known to cause illness. Additionally, the program predicts the consequences of DNA changes, helping to identify disease-causing differences that are not already linked to a known condition.
“This development means researchers don’t have to go through, for example, 300 variants to identify which are relevant for that specific patient,” states lead study investigator Anja Thormann, an Ensembl software developer at the European Bioinformatics Institute. “This pipeline means they may only need to look at three or four variants.”
“We present G2P (www.ebi.ac.uk/gene2phenotype) as an online system to establish, curate, and distribute datasets for diagnostic variant filtering via association of allelic requirement and mutational consequence at a defined locus with phenotypic terms, confidence level, and evidence links,” the authors add.
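To make the structure the authors describe more concrete, the following is a minimal Python sketch of what one gene-disease entry and an allelic-requirement check might look like. It is an illustration only, not the G2P data model or the VEP-G2P code: the class name, field names, the placeholder gene "GENE_A", and the reduction to just two allelic requirements ("monoallelic" and "biallelic") are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class G2PEntry:
    """Hypothetical, simplified record for one gene-disease association."""
    gene: str
    allelic_requirement: str      # assumed values: "monoallelic" or "biallelic"
    mutational_consequence: str   # e.g. "loss of function"
    phenotypes: tuple = ()        # phenotypic terms
    confidence: str = "probable"  # confidence level of the association
    evidence_links: tuple = ()    # e.g. publication identifiers


def genotype_satisfies(entry, alt_allele_counts):
    """Return True if the candidate variants in a gene meet the entry's
    allelic requirement.

    alt_allele_counts holds one number per candidate variant in the gene:
    1 for a heterozygous call, 2 for a homozygous call.
    Simplification: a biallelic entry needs two altered alleles in total
    (homozygous or two heterozygous variants); a monoallelic entry needs one.
    Phasing of heterozygous variants is ignored here.
    """
    required = 2 if entry.allelic_requirement == "biallelic" else 1
    return sum(alt_allele_counts) >= required


# Usage with placeholder data
recessive = G2PEntry("GENE_A", "biallelic", "loss of function")
print(genotype_satisfies(recessive, [1]))     # a single heterozygous variant
print(genotype_satisfies(recessive, [1, 1]))  # two heterozygous variants
```

A single heterozygous variant fails the check for the biallelic placeholder entry, whereas two heterozygous variants or one homozygous variant pass it. A real pipeline would also need to handle phasing, X-linked inheritance, and the predicted consequence of each variant.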
G2P also scans databases of genetic information from healthy people to rule out DNA differences that look as though they may cause disease but are harmless—minimizing the risk of false diagnoses. Experts say the system is particularly useful for diagnosing disorders that may be caused by many different genes, such as severe intellectual disabilities in children.
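The idea of ruling out variants that are common in healthy people can be sketched as a simple allele-frequency cutoff. The Python example below is a hedged illustration rather than the tool's actual logic: the function name, the dictionary of thresholds, and the threshold values themselves are assumptions chosen for demonstration and are not taken from the study.

```python
# Illustrative maximum population allele frequencies (assumed values).
# A stricter cutoff is used for monoallelic (dominant) disorders because a
# single copy of a truly disease-causing variant should be very rare in
# healthy reference populations.
ILLUSTRATIVE_MAX_AF = {
    "monoallelic": 0.0001,
    "biallelic": 0.005,
}


def is_rare_enough(population_af, allelic_requirement, thresholds=None):
    """Return True if a variant is rare enough to remain a candidate.

    population_af is the variant's allele frequency in a reference panel of
    healthy individuals, or None if the variant was never observed there.
    """
    limits = thresholds if thresholds is not None else ILLUSTRATIVE_MAX_AF
    if population_af is None:
        # Never seen in the reference panel: keep it as a candidate.
        return True
    return population_af <= limits[allelic_requirement]


# Usage: a variant seen in 1% of healthy people is discarded for a
# dominant disorder, while an unobserved variant is kept.
print(is_rare_enough(0.01, "monoallelic"))
print(is_rare_enough(None, "monoallelic"))
```

Using a separate cutoff per inheritance pattern reflects the reasoning that carriers of recessive variants can be healthy, so a somewhat higher population frequency is tolerable for biallelic conditions.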
“An extension to Ensembl Variant Effect Predictor (VEP), VEP-G2P was used to filter both disease-associated and control whole exome sequence (WES) with Developmental Disorders G2P (G2PDD; 2044 entries),” write the researchers. “VEP-G2PDD shows a sensitivity/precision of 97.3%/33% for de novo and 81.6%/22.7% for inherited pathogenic genotypes respectively. Many of the missing genotypes are likely false-positive pathogenic assignments. The expected number and discriminative features of background genotypes are defined using control WES.”
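For readers unfamiliar with the two metrics quoted above, sensitivity is the share of known pathogenic genotypes that the filter retains, and precision is the share of reported genotypes that are truly pathogenic. The short Python sketch below shows the standard definitions. The counts in the usage example are hypothetical and are not taken from the paper.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly pathogenic genotypes that the filter reports."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Fraction of reported genotypes that are truly pathogenic."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts, for illustration only
tp, fn, fp = 30, 10, 90
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"precision   = {precision(tp, fp):.1%}")
```

Read this way, a precision of 33% means that roughly one in three genotypes reported by the filter corresponds to a confirmed pathogenic genotype, while a sensitivity of 97.3% means that very few known pathogenic genotypes are filtered out.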
Using genetics to diagnose diseases moved a step closer when advances in DNA sequencing technology made it possible and affordable to decode a person’s genome within a few days. However, the sheer volume of data produced, together with a shortage of expertise, has hampered efforts to analyze it and generate meaningful results. G2P, which is freely available online, will help to overcome this bottleneck and make it easier to diagnose genetic conditions in clinical practice and in research programs.
“Using only human genetic data VEP-G2P performs well compared to other freely-available diagnostic systems and future phenotypic matching capabilities should further enhance performance,” the authors conclude.