Malorye Allison Branca, Contributing Editor

Geisinger’s MyCode Genomics Data Brings Power to Precision Medicine and Research

With its 150,000th patient recently enrolled, Geisinger Health System’s MyCode project is helping set the standard for big health data projects around the world. The program has a system to easily enroll patients, process both their genomic and clinical data, return relevant results to them, and spur research. It’s what many biobank developers are dreaming of.

Genomics databases now abound: there is the Million Veteran Program, the China Kadoorie Biobank Study, Vanderbilt’s BioVU, the U.K.’s 100,000 Genomes Project, and many more. MyCode is unusual because of its rapid growth and the fact that it has not only a robust sequencing database but also more than 20 years’ worth of electronic health record (EHR) data. Most importantly, Geisinger is already using information gleaned from analysis of this data to guide care of its patients for almost 30 conditions, including breast and ovarian cancers, Lynch syndrome cancers, and hereditary high cholesterol.

“There’s an immense amount of information in clinical records,” said Jeremy Rotter, Ph.D., who is helping build a biobank based at UCLA. “And people are starting to learn how to efficiently mine that data, including billing records as well as doctors’ and nurses’ notes.” Rotter is director of LA BioMed’s Institute for Translational Genomics and Population Sciences, Harbor-UCLA Medical Center.

Another development that has helped MyCode and similar projects is the rise of cloud computing, which Geisinger and most biobanks use. “When you are dealing with a terabyte of data it costs a lot to store it,” explains Andrew Gruen, communications officer at Seven Bridges, which provides genomics-related data analysis and management services to multiple big data projects, including the Cancer Genomics Cloud, the Million Veteran Program, and the U.K.’s 100,000 Genomes Project. Jeff Reid, Ph.D., agrees. He is executive director of Genome Informatics at the Regeneron Genetics Center (RGC), which provides sequencing, analytics, and related services to dozens of organizations, ranging from small to large, including Geisinger. “The idea of being able to spin up as much storage as you need, as you want it, is revolutionary,” he said. “It’s very different from figuring out how many computers you need.”

“We were one of the first health systems to install an EHR; we now have an average of 14 years of data on each patient in the system,” said Andrew Faucett, one of MyCode’s principal investigators, a professor, and director of policy & education at Geisinger. Normalizing and integrating that data with the genomic data, he explains, is one of the key challenges in this field.

Any Geisinger patient can enroll in MyCode, either online or during a visit to one of the system’s facilities. After they agree to donate a blood sample and have their EHR data mined (with privacy protection for research purposes), that sample undergoes exome sequencing.

In terms of patient health, MyCode focuses on just 76 “actionable” genes related to 27 conditions. Only the results from these particular genes are shared with patients and their doctors. But results from other genes can be used for research.

If a potentially harmful mutation is detected in an actionable gene, it is verified by an outside lab and the patient’s primary care physician (PCP) is alerted. A few days later the patient will receive a notice that there were findings from their exome analysis and an invitation to discuss these with their PCP. Patients can also meet with the Geisinger genomics group, if they would like.

The decision to return some of the results from the exome analysis is part of what Faucett calls Geisinger’s commitment to being “a learning healthcare system.” The company reports MyCode has already helped identify some patients with cancers and heart disease even before symptoms developed. Faucett said they are reporting findings to about 3.5% of patients, which is “more than we expected, and other biobanks are starting to find the same thing.” Most recently, the group reported actionable results for more than 300 patients.

Most patients do follow up. “We establish contact with 80% to 85% of individuals who have findings,” said Marc S. Williams, M.D., director of Geisinger’s Genomic Medicine Institute. Those who do not respond to the initial messages are sent a certified letter. “But the majority of people want to know about their results, and almost half are also interested in helping us do research.”

MyCode is a rich source for new genetic findings. A 2015 report published in NEJM found that particular mutations in a gene associated with cholesterol levels can be highly protective against coronary artery disease, a finding that could lead to a new treatment. Using the MyCode database, the study identified seven heterozygous carriers of an inactivating variant in NPC1L1 (R406X), none of whom had coronary artery disease, whereas 1,001 of 15,886 noncarriers had the condition.

And there is plenty of room for MyCode to grow. The project was launched in early 2014 in collaboration with the Regeneron Genetics Center, which does the sequencing. The initial aim was to recruit 100,000 participants over five years, but the project quickly surpassed that goal and reset it to 250,000. Faucett believes part of that is because patients are highly engaged with the system. Not many of them move away, and once they are connected to Geisinger they tend to stay.

Further, Geisinger serves approximately 4 million patients, about 1 to 1.5 million of whom are considered “active” because they are seen regularly. Geisinger also recently acquired AtlantiCare in New Jersey. That acquisition will not only add more patients to the database but also increase diversity. “Our patients in Pennsylvania are about 96% of northern European descent,” explains Dr. Williams. “AtlantiCare’s patients are much more diverse.” Recruitment in New Jersey has been brisk: within just the first 8 months, more than 8,000 patients signed up for MyCode.

The issue of diversity will be a growing concern for biobanks such as Geisinger’s. The largest genetic studies in many diseases have so far largely comprised people of northern European descent. “Most big studies in diabetes have been done in Caucasians,” said Dr. Rotter. LA BioMed has specifically worked to make sure its biobank is diverse. “We have already found novel variants in diabetes that are specific to people of Chinese descent,” he said.

Geisinger hasn’t shared how much the system has invested in the project. Faucett said the sample collection costs about $100 per person. Re-entering has helped raise funding for the biobank and the recruitment process. “Clinical interpretation is all funded with research dollars,” he said.

One of the keys to the success of this project, said Dr. Williams, is that “we have spent a lot of time engaging and listening to the patient voice, so that we can deliver back to them what they consider valuable.” Faucett adds, “I think one reason we’ve gotten so many is that it’s open to all our patients. There are many projects focused on one condition or another, but ours lets everyone participate.”

Dr. Reid and Gruen, who both work for companies supplying the infrastructure behind such efforts, see some key trends ahead for biobanks such as MyCode. “We are seeing more localities and governments diving in,” said Dr. Reid. Gruen, meanwhile, said, “The market is exploding. But it is all well and good to buy a large sequencer and use it; the hump is managing and analyzing the data.”

“Large-scale genetics is on the cusp of transforming medicine,” he added.

This article was originally published in the July/August 2017 issue of Clinical OMICs.
