A recent study reveals that genetic information may not be as private as people have been led to believe.
Yaniv Erlich, Ph.D., and his research team at the Whitehead Institute hoped to start a conversation when they published a study in Science showing that a computer, an Internet connection, and publicly accessible online resources were all they needed to identify nearly 50 individuals who had submitted personal genetic material for genomic studies.
That conversation has raised many of the right questions about the privacy of genetic information. Answers, however, remain elusive despite the study and other recent developments in addressing genetic privacy, including the publication last fall of a report by the U.S. Presidential Commission for the Study of Bioethical Issues.
The commission’s report, Privacy and Progress in Whole Genome Sequencing, stressed that individual interests in privacy must be respected and secured to realize the promise of whole genome sequencing in advancing clinical care and the greater public good.
One key recommendation calls for Washington to join states in hammering out “a consistent floor of protections” ensuring security for whole-genome sequence data, rather than the limited approach of the Genetic Information Nondiscrimination Act of 2008 (GINA), which bars discrimination in health insurance and employment based on genetic information. The commission also urged that access to and permissible uses of whole genome sequence data be clearly defined, so that research participants know what they are agreeing to when they sign on the dotted line.
As with all presidential commission reports, the proof of effectiveness will be how many of its recommendations become reality. Lisa M. Lee, Ph.D., the presidential commission’s executive director, told GEN on Tuesday that her office has been meeting with the U.S. Office for Human Research Protections on informed consent.
Creating Consent Guidance
“They are beginning work on creating some guidance documents around what should be included in these kinds of consent forms,” Dr. Lee said. “These would be consent forms that regulate research. But we’re hoping that they will also apply to other situations where people have their genome sequenced, including potentially clinical situations, or other nonresearch situations.”
Washington’s next moves are highly anticipated, and will be closely watched by Dr. Erlich and others.
“What we wanted to do is to shift the discussion forward from whether genetic information can identify you and how to prevent that, to maybe to how to prevent misuse of the data,” Dr. Erlich, the study’s principal investigator and a Whitehead Fellow, told GEN on Monday. “So maybe we need to replace GINA. This is one option. Or maybe we need new rules to protect the misuse of genetic data.”
GINA and other current U.S. laws don’t go far enough to protect genetic privacy, necessitating comprehensive national rules, one longtime advocate correctly argues.
“What we lack in this country is a comprehensive genetic privacy law. We have GINA; GINA is a very important and strong first step,” Jeremy Gruber, J.D., president and executive director of the Council for Responsible Genetics, told GEN. “GINA has significant limitations in terms of its comprehensiveness, and I think we need to have an appropriate national discussion to talk about how we’re going to actually govern the access to and use of genetic information across a variety of platforms.”
Inferences from Surnames
Gruber spoke days after Dr. Erlich joined four co-authors in publishing their study, Identifying Personal Genomes by Surname Inference. The group analyzed short tandem repeats on the Y chromosomes (Y-STRs) of men whose genetic material was collected by the Center for the Study of Human Polymorphisms (CEPH), and whose genomes were sequenced and made publicly available as part of the 1000 Genomes Project.
Researchers found that a strong correlation can be made between surnames and the DNA on the Y chromosome, since both are traditionally transmitted from father to son. Erlich’s group was able, through surname inference, to discover the family names of the men by submitting their Y-STRs to publicly accessible databases maintained by genealogists and genetic genealogy companies, which store the Y-STR data by surname.
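The matching step described above can be illustrated with a minimal sketch. It is not the Erlich team's actual pipeline: the database records, surnames, repeat counts, the `infer_surname` helper, and the one-mismatch threshold are all invented for illustration. Only the marker names (DYS19, DYS390, DYS391, DYS393) correspond to real Y-STR loci.

```python
from collections import Counter

# Hypothetical toy records: (surname, Y-STR profile as marker -> repeat count).
# Marker names are real Y-STR loci; every surname and value here is invented.
GENEALOGY_DB = [
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
    ("Jones", {"DYS19": 15, "DYS390": 23, "DYS391": 10, "DYS393": 12}),
    ("Brown", {"DYS19": 16, "DYS390": 25, "DYS391": 11, "DYS393": 14}),
]


def mismatches(a, b):
    """Count shared markers whose repeat counts differ.

    Returns None when the two profiles share no markers at all,
    so that an empty comparison is never treated as a perfect match.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return None
    return sum(1 for marker in shared if a[marker] != b[marker])


def infer_surname(query, db, max_mismatches=1):
    """Return the most common surname among close matches, or None.

    max_mismatches is an assumed tolerance, not a published threshold.
    """
    hits = Counter()
    for surname, profile in db:
        diff = mismatches(query, profile)
        if diff is not None and diff <= max_mismatches:
            hits[surname] += 1
    if not hits:
        return None
    return hits.most_common(1)[0][0]


# A Y-STR profile read from a publicly posted male genome (invented values).
query = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
print(infer_surname(query, GENEALOGY_DB))
```

In this toy data both "Smith" records fall within the tolerance, so the sketch suggests that surname. A real inference would still need the kind of independent validation the researchers describe.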
The team identified nearly 50 American men and women who had participated in CEPH, after validating their inferences with Internet record search engines, obituaries, genealogical websites, and public demographic data from the National Institute of General Medical Sciences (NIGMS) Human Genetic Cell Repository at New Jersey’s Coriell Institute.
Researchers concluded that the posting of genetic data from a single individual can reveal deep genealogical ties, as well as help identify distant relatives who may have no acquaintance with the person releasing genetic data. Dr. Erlich said his team focused on male genomes in the collection of Utah Residents with Northern and Western European Ancestry (CEU), since their informed consent did not definitively guarantee their privacy, and explicitly stated that they may be identifiable by future techniques.
Question of Harm
The Erlich study sparked plenty of mainstream news coverage, as well as commentary from the PHG Foundation, a genomics and health policy think tank in Cambridge, U.K. PHG noted that the researchers “had to undertake considerable effort and analysis to identify specific individuals.”
“Even if an individual can be identified from their DNA sequence, it is not clear that this will cause them harm,” observed Simon Leese, writing for PHG.
Speaking with GEN this week, Leese said the argument that identifying DNA sequence owners did not necessarily cause clear harm “was in large measure based on the fact that—except for a few rare circumstances such as highly penetrant single gene disorders or gross chromosomal abnormalities—it discloses next to nothing about an individual's current medical status.”
Leese also noted that, in Dr. Erlich’s study, at least some individuals made little effort to preserve their anonymity, having participated along with other family members in both research projects and publicly available online genetic genealogy databases.
An American Issue?
Philippa Brice, PHG’s communications director, told GEN that sharing genetic data may be less of an issue in the U.K. than in the U.S., likely a consequence of U.K. healthcare being overseen by the state-run National Health Service. It remains to be seen, she added, whether that view will change once results emerge from the Wellcome Trust Sanger Institute’s online survey of public concerns about genomic data. The survey is part of a larger five-year Genome Ethics study being conducted through 2015.
In the U.S., by contrast, the presidential commission spent much of its time weighing privacy concerns—understandable in a nation whose Constitution’s Fourth Amendment declares “the right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated.”
Where the commission, Dr. Erlich, and others can be most helpful is in upending the roadblock to effective privacy protection resulting from piecemeal existing laws. GINA covers only bias in health insurance and employment. A patient’s genome sequenced in a doctor’s office is covered by the Health Insurance Portability and Accountability Act. Sequenced in a research lab, however, that genome is subject to the “Common Rule,” or Federal Policy for the Protection of Human Subjects.
Those laws, plus differences in state laws, create a crazy quilt of rules that serves as another roadblock, leaving researchers and others unclear on how to carry out the privacy protection everyone says they want. The commission favors a single national standard, to its credit. But since the commission is only advisory, it will be up to stakeholders to spell out what form that standard should take (a federal law? a multistate agreement? another guidance document?), then hammer out the substance of protections for genetic material that balance respect for privacy with the advancement of research.