For most of us, the word schizophrenia brings to mind the movie Sybil, in which Sally Field plays a woman with multiple personalities. Whether Sybil had the disorder is debatable, but the 1% of the world’s population diagnosed with schizophrenia suffers from hallucinations, delusions, and cognitive deficits.

“Schizophrenia is a devastating disease,” said Robert Waterland, PhD, professor of pediatrics-nutrition at Baylor College of Medicine. “Although genetic and environmental components seem to be involved in the condition, current evidence only explains a small portion of cases, suggesting that other factors, such as epigenetic, also could be important.”

Robert Waterland, PhD, professor of pediatrics-nutrition at Baylor College of Medicine, is senior author on the study.

Waterland and his colleagues at the Baylor College of Medicine have now developed an innovative strategy that promises to enable early diagnosis of schizophrenia. The in silico computational strategy analyzes specific regions of the human genome using a machine learning algorithm called SPLS-DA (sparse partial least squares discriminant analysis).

The findings are reported in an article titled, “A machine learning case–control classifier for schizophrenia based on DNA methylation in blood,” published in the journal Translational Psychiatry. Financial support for the work came from NIH/NIDDK, the Cancer Prevention and Research Institute of Texas, and USDA/ARS.

The researchers used blood samples to profile DNA methyl (CH3) groups—a type of epigenetic marker—that differ between people diagnosed with schizophrenia and normal individuals. Based on this profile, the authors developed a model that can assess an individual’s chances of having schizophrenia.

The SPLS algorithm partially separates cases from controls based on its first two dimensions, said Waterland. Using these two dimensions, the authors calculated a vector they call “risk distance” that identifies individuals with the highest probability of schizophrenia.

Waterland, corresponding author on the current study, said, “At higher risk distance cutoffs, we can improve the accuracy of classification [of schizophrenia cases versus normal individuals], but fewer cases are classified. To compare different models, we determined the risk distance that allowed us to achieve a positive predictive value (the probability that an individual with a positive test result is actually a case) of 80%.”
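The trade-off Waterland describes can be illustrated with a minimal sketch. It assumes each individual already has a single numeric risk score and a known case/control label; the function names, the example scores, and the rule "score at or above the cutoff counts as a positive call" are hypothetical and are not taken from the study, which does not spell out how risk distance is computed.

```python
from typing import Optional, Sequence, Tuple


def ppv_at_cutoff(
    scores: Sequence[float], is_case: Sequence[bool], cutoff: float
) -> Optional[Tuple[float, float]]:
    """Return (PPV, fraction of all cases called positive) at a cutoff.

    Assumption: an individual is called positive when score >= cutoff.
    Returns None when nobody is called positive, since PPV is undefined.
    """
    called = [case for score, case in zip(scores, is_case) if score >= cutoff]
    if not called:
        return None
    true_positives = sum(called)
    total_cases = sum(is_case)
    ppv = true_positives / len(called)
    fraction_classified = true_positives / total_cases if total_cases else 0.0
    return ppv, fraction_classified


def lowest_cutoff_for_ppv(
    scores: Sequence[float], is_case: Sequence[bool], target: float = 0.8
) -> Optional[float]:
    """Return the smallest observed score that reaches the target PPV."""
    for cutoff in sorted(set(scores)):
        result = ppv_at_cutoff(scores, is_case, cutoff)
        if result is not None and result[0] >= target:
            return cutoff
    return None


# Made-up scores and labels, purely for illustration.
example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
example_labels = [True, False, True, False, False, True, True, False, True, True]

cutoff = lowest_cutoff_for_ppv(example_scores, example_labels, target=0.8)
print(cutoff, ppv_at_cutoff(example_scores, example_labels, cutoff))
```

Raising the cutoff in this toy data raises the PPV but leaves some true cases below the threshold, which mirrors the point that higher accuracy comes at the cost of fewer cases being classified.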

Rather than altering the genetic code itself, epigenetic markers modify gene expression, for instance through the strategic placement of methyl groups on DNA that turn genes on or off in different cell types. Consequently, epigenetic markers can vary between different tissues in the same individual, making it challenging to assess whether a specific epigenetic change contributes to a disease.

In earlier work, Waterland and his colleagues had identified a set of specific genomic regions in which DNA methylation differs between people but is consistent across different tissues in one person. This key finding helped the group circumvent the problem of tissue-specific differences common among epigenetic markers.

They call these genomic regions CoRSIVs for “correlated regions of systemic interindividual variation” and propose that studying CoRSIVs is a new way to uncover epigenetic causes of diseases.

“Because methylation patterns in CoRSIVs are the same in all the tissues of one individual, we can analyze them in a blood sample to infer epigenetic regulation on other parts of the body that are difficult to assess, such as the brain,” said Waterland.

“Remarkably, our final model based on CoRSIV methylation in blood correctly classified 85% of patients in the independent test set. An analogous SPLS model based on PRS (polygenic risk score), by comparison, correctly classified only 32% of patients in the independent test set. This is especially impressive when you consider that geneticists have been conducting GWAS (genome-wide association studies) for schizophrenia for about 10 years,” said Waterland.

Many previous studies have analyzed methylation profiles in blood samples with the goal of identifying epigenetic differences between individuals with and without schizophrenia, the researchers explained.

“Our study is innovative in various ways,” said first author Chathura Gunasekara, PhD, bioinformatics analyst in the Waterland lab at Baylor. “We focused on CoRSIVs and also applied for the first time the SPLS-DA machine learning algorithm to analyze DNA methylation. As a scientist interested in applying machine learning to medicine, our findings are very exciting. They not only suggest the possibility of predicting risk of schizophrenia early in life, but also outline a new approach that may be applicable to other diseases.”

Another innovation in the study is that it accounts for major confounding factors that earlier studies did not. For instance, methylation patterns in blood can be affected by smoking and by antipsychotic medications, both of which are common among schizophrenia patients.

“Here, we took various approaches to evaluate whether the methylation patterns we detected at CoRSIVs were affected by medication use and smoking. We were able to rule that out,” said Waterland. “This, together with the fact that DNA methylation at CoRSIVs is established very early in life, indicates that the epigenetic differences we identified between schizophrenia patients and healthy individuals were there before the disease was diagnosed, suggesting they may contribute to the condition.”

This novel approach to predicting schizophrenia achieves much stronger associations of epigenetic markers with schizophrenia than have ever been reported before, the authors said, establishing a proof-of-principle that focusing on CoRSIVs could make epigenetic epidemiology possible.

“We believe our findings hold promise for early detection of individuals at high risk for schizophrenia, as early as at birth. In 2019, we identified nearly 10,000 CoRSIVs in the human genome. Our current results are based on only the 10% of these that are informative on the Illumina HM450 methylation array. It is not unreasonable to imagine that a screening tool based on all 10,000 CoRSIVs will provide much greater sensitivity and specificity,” said Waterland.

The researchers have already developed and validated a target-capture bisulfite sequencing approach for high-throughput analysis of targeted methylation at CoRSIV regions in the human genome. “As soon as we are able to obtain funding, we would like to perform prospective studies in humans to determine if we can accurately identify individuals with high risk for schizophrenia, years before they are diagnosed. This would allow development of targeted preventive interventions.”