Scientists at Children’s Hospital of Philadelphia (CHOP) and affiliated with the CHOP Epilepsy Neurogenetics Initiative (ENGIN) say they have combined clinical information with large-scale genomic data to successfully link characteristic presentations of childhood epilepsies with specific genetic variants. The findings, “Semantic Similarity Analysis Reveals Robust Gene-Disease Relationships in Developmental and Epileptic Encephalopathies,” appear in the American Journal of Human Genetics.
Developmental and Epileptic Encephalopathies (DEE), a group of severe brain disorders that can cause difficult-to-treat seizures, cognitive and neurological impairment, and, in some cases, early death, are known to have more than 100 underlying genetic causes. However, matching characteristic clinical features and outcomes with specific genetic mutations can be especially daunting given the large number of genetic causes, each of which is very rare.
When genetic information is collected, a person’s phenotype is typically also documented. However, while genetic information is collected in a standardized manner, clinical symptoms are not described in a standardized way, which makes it difficult to pinpoint whether certain genetic mutations are responsible for specific clinical features.
Building upon their previous work, researchers from CHOP used the Human Phenotype Ontology (HPO), which provides a standardized format to characterize a patient’s phenotypic features and allows clinical data to be analyzed as systematically as genetic data.
“For this study, we used phenotypic and genetic information that had been collected in several important cohorts for more than a decade,” said Ingo Helbig, MD, attending physician at ENGIN, director of the genomic and data science core of ENGIN, and lead investigator of the study. “In this study alone, we found associations of 11 genetic causes with specific phenotypes. Without methods to systematically analyze clinical data, we could not have possibly done this previously, even with this robust cohort of patients.”
In total, the study team analyzed 31,742 HPO terms in 846 patients with existing whole exome sequencing data. Examples of causative DEE genes identified in this study include SCN1A, which was associated with complex febrile seizures and focal clonic seizures; STXBP1, which was associated with absent speech; and SLC6A1, which was associated with EEG with generalized slow activity. Overall, 41 genes carried de novo variants in at least two individuals, and for 11 of those genes, patients with variants in the same gene had significantly similar phenotypes. Statistical analysis showed that this degree of similarity was greater than would be expected by chance.
“Gene-specific phenotypic signatures included associations of SCN1A with ‘complex febrile seizures’ (HP: 0011172; p = 2.1 × 10⁻⁵) and ‘focal clonic seizures’ (HP: 0002266; p = 8.9 × 10⁻⁶), STXBP1 with ‘absent speech’ (HP: 0001344; p = 1.3 × 10⁻¹¹), and SLC6A1 with ‘EEG with generalized slow activity’ (HP: 0010845; p = 0.018). Of 41 genes with de novo variants in two or more individuals, 11 genes showed significant phenotypic similarity, including SCN1A (n = 16, p < 0.0001), STXBP1 (n = 14, p = 0.0021), and KCNB1 (n = 6, p = 0.011). Including genetic and phenotypic data of control subjects increased phenotypic similarity for all genetic etiologies, whereas the probability of observing de novo variants decreased, emphasizing the conceptual differences between semantic similarity analysis and approaches based on the expected number of de novo events,” the investigators wrote.
“We demonstrate that HPO-based phenotype analysis captures unique profiles for distinct genetic etiologies, reflecting the breadth of the phenotypic spectrum in genetic epilepsies. Semantic similarity can be used to generate statistical evidence for disease causation analogous to the traditional approach of primarily defining disease entities through similar clinical features.”
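For readers who want a concrete picture of this kind of test, the following Python sketch shows one way to ask whether patients who share a gene resemble each other more than randomly chosen patients do. It is an illustration only, not the investigators' method: the toy ontology, the patient records, the Jaccard-style similarity measure, and all function names are assumptions made up for this example, and the published analysis may use a different similarity measure and a different statistical procedure.

```python
import random
from itertools import combinations

# Hypothetical toy ontology (child -> parents); NOT the real HPO structure.
PARENTS = {
    "seizure": [],
    "febrile_seizure": ["seizure"],
    "complex_febrile_seizure": ["febrile_seizure"],
    "focal_seizure": ["seizure"],
    "focal_clonic_seizure": ["focal_seizure"],
    "speech_abnormality": [],
    "absent_speech": ["speech_abnormality"],
}

# Made-up patients and their annotated terms (illustrative data only).
PHENOTYPES = {
    "p1": {"complex_febrile_seizure", "focal_clonic_seizure"},
    "p2": {"complex_febrile_seizure"},
    "p3": {"focal_clonic_seizure", "febrile_seizure"},
    "p4": {"absent_speech"},
    "p5": {"speech_abnormality"},
    "p6": {"focal_seizure"},
    "p7": {"absent_speech", "seizure"},
    "p8": {"speech_abnormality"},
}


def with_ancestors(terms):
    """Return the terms together with all of their ancestors."""
    expanded = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in expanded:
            expanded.add(term)
            stack.extend(PARENTS[term])
    return expanded


def similarity(a, b):
    """Jaccard overlap of ancestor-expanded term sets (a simple stand-in)."""
    ea, eb = with_ancestors(a), with_ancestors(b)
    return len(ea & eb) / len(ea | eb)


def mean_pairwise_similarity(patients, phenotypes):
    """Average similarity over all pairs of patients in the group."""
    pairs = list(combinations(patients, 2))
    return sum(similarity(phenotypes[p], phenotypes[q]) for p, q in pairs) / len(pairs)


def permutation_p_value(group, phenotypes, n_perm=10000, seed=0):
    """Compare the group's similarity with random groups of the same size."""
    rng = random.Random(seed)
    observed = mean_pairwise_similarity(group, phenotypes)
    everyone = sorted(phenotypes)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(everyone, len(group))
        if mean_pairwise_similarity(sample, phenotypes) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Pretend p1, p2 and p3 all carry a variant in the same (hypothetical) gene.
observed, p_value = permutation_p_value(["p1", "p2", "p3"], PHENOTYPES)
print(f"mean similarity = {observed:.3f}, permutation p = {p_value:.4f}")
```

Expanding each term to include its ancestors is what lets two patients with different but related terms, such as two distinct seizure types, count as partially similar. A small p-value here means that few random groups of the same size are as phenotypically alike as the group that shares the gene.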
“Traditionally, many of the genetic epilepsies that we now develop treatments for were described because of a specific set of clinical features that stood out. However, this type of traditional description of new diseases requires patients to be seen by the same provider or within the same center. What we have done with this study is re-engineered the cognitive process that goes on when clinicians discover a new syndrome,” Helbig said. “We have developed a computational mechanism to replicate this type of discovery from large, de-identified clinical data. As the amount of deep phenotypic data available to us increases, we now have the ability to identify novel genetic causes of particularly severe forms of epilepsy that are targets for new treatments.”