The reading of tea leaves won’t provide more reliable predictions if the reader simply brews more cups of tea, so as to peruse more and more leaves. A similar problem can arise in other pursuits, even genomics analysis. Granted, tasseography is far from genomics, but one may still encounter narrative bias, the tendency to find meaning where none is to be found, when one is trying to perceive mutation-disease patterns against a noisy sequencing background. Simply analyzing more data may not be a solution.
To find ways to suppress narrative bias, leading genomics researchers participated in a workshop convened by the National Human Genome Research Institute. The workshop, entitled “Implicating Sequence Variants in Human Disease,” took place September 12–13, 2012. Afterwards, many of the workshop’s participants continued working together. Ultimately, they produced an article, which was published April 23 in Nature, outlining how researchers and clinicians may ensure the quality of genomics data and avoid false assignments of pathogenicity, particularly in investigations of rare genetic variants, that is, uncommon changes detected in a person’s genome.
The article, entitled “Guidelines for investigating causality of sequence variants in human disease,” examines how the flood of genome sequence data can be handled. In particular, it explains how analysts can go about confidently distinguishing between variants that seem likely to contribute to disease and variants that don’t (so far as anyone is currently aware).
Recommendations in the article focus on several key areas, such as study design, gene- and variant-level implication, databases, and implications for diagnosis.
“Several of us had noticed that studies were coming out with wrong conclusions about the relationship between a specific sequence and disease, and we were extremely concerned that this would translate into inappropriate clinical decisions,” said Chris Gunter, Ph.D., one of the article’s 27 authors and associate director of research at Marcus Autism Center and associate professor of pediatrics at Emory.
Potentially, based on flawed results, physicians could order additional testing or treatments that are not truly supported by a link between a genetic variant and disease. “This paper,” added Dr. Gunter, “could help prevent such inappropriate decisions.”
The group of 27 researchers proposed two steps for claiming that a genetic variation causes disease:
- Detailed statistical analysis.
- An assessment of evidence from all sources supporting a role for the variant in that specific disease or condition.
In addition, the authors highlight priorities for research and infrastructure development, including added incentives for researchers to share genetic and clinical data.
One case cited in the paper relates to autism. Researchers found four independent variations in a gene called TTN when they compared genomes between individuals with and without autism. However, the TTN gene encodes titin, a muscle protein that is the largest known; because the gene is so long, variations are simply more likely to be found within its boundaries than within those of other genes. Without applying the proper statistical corrections, researchers might have falsely concluded that TTN was worthy of further investigation in autism studies.
The authors note that many DNA variants “may suggest a potentially convincing story about how the variant may influence the trait,” but few will actually have causal effects. Thus, using evidence-based guidelines such as the ones in the Nature paper will be crucial.
“We believe that these guidelines will be particularly useful to scientists and clinicians in other areas who want to do human genomic studies, and need a defined starting point for investigating genetic effects,” Dr. Gunter concluded.