In this exclusive interview, Stephen Hsu (Michigan State University and co-founder of Genomic Prediction) discusses the application of polygenic risk scores (PRS) for complex traits in pre-implantation genetic screening. Interview conducted by Julianna LeMieux (GEN).

GEN: What motivated you to start Genomic Prediction?

Stephen Hsu: It has a very long history. Laurent Tellier is the CEO and we’ve known each other since 2010. We’d been working on the background science of how to use machine learning to look at lots of genomes and then learn to predict phenotypes from that information.

We were betting on the continuing decline in cost for genotyping, and it paid off because now there are millions of genotypes available for analysis. We’d always thought that one of the best and earliest applications of this would be embryo selection because we can help families have a healthy child.

GEN: How did you first get interested in genomics in general, given your educational background in physics?

Hsu: I was interested in genetics and evolution, molecular biology, since I was a kid. I grew up in the ’70s and ’80s and already at that time there was a lot of attention focused on the molecular biology revolution, recombinant DNA. We were always told physics is a very mature subject and biology is the subject of the future and it will just explode eventually with these new molecular techniques.

When I got to college and I took some classes in molecular biology, I realized that a lot of the deep questions—like how do you actually decipher a genome and figure out which pieces of the genetic code have direct consequences in phenotypes or complex traits?—would not be answerable with the technology of that time. So I put it aside and did theoretical physics, but got re-interested around the time I met Laurent. I became aware of the super exponential cost curve for genotyping, sequencing in particular. I realized, if this continues for another ten years or so, we’re going to be able to answer all these interesting questions I’ve been thinking about since I was a kid.

GEN: What kind of research do you do currently at Michigan State that assists Genomic Prediction?

Hsu: The work we do at my academic lab is applying AI and machine learning to large genomic datasets. We have a big supercomputing cluster here and people who are trained in physics with strong computational backgrounds. Our interest on the academic side is to figure out which pieces of the genome are influencing which traits and in what way.

Humans aren’t really smart enough to do that; you really need AI to do it. We publish results on traits ranging from height and bone density, which are quantitative traits, to disease risk predictions for common diseases, ranging from diabetes to breast cancer to hypothyroidism.

Once there is a sufficient amount of data—where that threshold lies depends on the details of the specific trait we’re studying—the algorithms now work really well. We’re able to build interesting predictors where, for example, you could identify the top couple percent of the population in risk score, who have, say, 5 or 10 times the risk of a typical person of getting the disease.

You validate the predictor on a different population, not the training population, and you find that the people who are in the top few percent of risk according to your predictor are maybe 5 or 10 times more likely to have the disease than a typical person. That is already crossing into the domain of clinical applicability, not just for embryos but for actual adults in regular health systems. Myriad [Genetics] is already selling a polygenic breast cancer predictor.
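
As a rough illustration of the validation step described above, the sketch below ranks a held-out cohort by risk score and compares disease prevalence in the top slice with prevalence in the whole cohort. The function name, the 2% cutoff, and the toy cohort are assumptions made for this example, not details from the interview, and "typical person" is approximated here by the cohort-wide prevalence.

```python
def top_percentile_relative_risk(scores, is_case, top_fraction=0.02):
    """Compare prevalence in the highest-scoring slice with the whole cohort.

    scores   -- risk scores for a held-out cohort (not the training set)
    is_case  -- parallel list of booleans, True if the person has the disease
    Returns (top_prevalence, overall_prevalence, relative_risk).
    """
    if len(scores) != len(is_case) or not scores:
        raise ValueError("scores and is_case must be non-empty and equal length")

    # Rank people from highest to lowest predicted risk.
    ranked = sorted(zip(scores, is_case), key=lambda pair: pair[0], reverse=True)

    # Keep at least one person in the high-risk group.
    n_top = max(1, round(len(ranked) * top_fraction))
    top = ranked[:n_top]

    top_prevalence = sum(case for _, case in top) / n_top
    overall_prevalence = sum(is_case) / len(is_case)
    if overall_prevalence == 0:
        raise ValueError("no cases in cohort; relative risk is undefined")

    return top_prevalence, overall_prevalence, top_prevalence / overall_prevalence


# Toy held-out cohort (invented): 100 people with scores 0..99.
toy_scores = list(range(100))
toy_cases = [s >= 98 or s % 10 == 0 for s in toy_scores]
print(top_percentile_relative_risk(toy_scores, toy_cases))
```

A real evaluation would also report uncertainty on the ratio, since the top few percent of a cohort can be a small group.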

GEN: How do you generate a polygenic risk score for different diseases? Of the eight diseases listed on the Genomic Prediction website, are those diseases that your lab has basically generated that data for?

Hsu: Many of them were produced by my research group, but the current best-performing breast cancer predictor actually comes from a large international consortium that works on breast cancer…

We use the same data that people would use for GWAS [genome-wide association studies]. For example, we might have 200,000 controls and 20 or 30,000 cases of people in their 50s and 60s who are old enough that they would have been diagnosed for diabetes (or something) if they had it. The algorithm knows which ones are the cases and which are the controls, and it also has about 1 million SNPs from each person, typically what you get from an Affymetrix or an Illumina array.

It is a learning algorithm that tries to tune its internal model so that it best predicts whether someone is actually a case or a control. There’s a bunch of fancy math involved in this—a high-dimensional optimization. You are basically finding the model that best predicts the data.
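
The interview does not spell out the form of the model. One common form, assumed here purely for illustration, is an additive score: each SNP gets a learned weight, and a person's score is the weighted sum of their allele counts (0, 1, or 2 per SNP). The logistic link, the function names, and the toy weights below are likewise assumptions.

```python
import math


def polygenic_score(genotypes, weights, intercept=0.0):
    """Additive score: intercept + sum(weight_i * allele_count_i).

    genotypes -- allele counts per SNP (0, 1, or 2), one entry per SNP
    weights   -- learned per-SNP weights, same length as genotypes
    """
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must have the same length")
    return intercept + sum(w * g for w, g in zip(weights, genotypes))


def case_probability(score):
    """Map a score to a case probability with a logistic link (assumed choice)."""
    return 1.0 / (1.0 + math.exp(-score))


# Toy example with four SNPs; a weight of 0.0 means that SNP is not used.
toy_weights = [0.5, -0.25, 0.0, 1.0]
toy_genotypes = [2, 1, 0, 1]
score = polygenic_score(toy_genotypes, toy_weights, intercept=-1.75)
print(score, case_probability(score))
```

Training then amounts to choosing the weights and intercept so that these probabilities match the known case and control labels as closely as possible.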

It is different from GWAS because GWAS is very simple—you look at a particular gene or SNP and you say is there statistical evidence that this particular SNP is associated with whether you have diabetes? You get a yes/no answer. If the P value is significant enough then you say we found a hit.

That problem mathematically is very different from the problem we solved. We are actually doing an optimization in a million-dimensional space to find simultaneously all the SNPs that should be activated in our predictor. This is all in the technical weeds but it is just different mathematics…
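
The contrast between testing one SNP at a time and fitting all SNPs at once can be sketched on a toy scale. The interview does not name the optimization method, so the L1-penalized regression (lasso, solved by coordinate descent) used below is an assumed stand-in for a sparse joint fit, and a plain per-SNP correlation stands in for a GWAS-style test. The data are invented: three centered SNPs and a quantitative trait driven only by the first one.

```python
import math


def pearson(xs, ys):
    """Marginal association of one SNP with the trait, ignoring all other SNPs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def soft_threshold(value, lam):
    """Shrink toward zero; anything within lam of zero becomes exactly zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_fit(columns, y, lam, sweeps=100):
    """Fit all SNP weights jointly by coordinate descent.

    Minimizes (1/2n) * ||y - X w||^2 + lam * sum(|w_j|).
    Assumes each SNP column and y are already centered (mean zero).
    """
    n = len(y)
    weights = [0.0] * len(columns)
    residual = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            z = sum(x * x for x in col) / n
            if z == 0:
                continue  # SNP with no variation carries no information
            # Covariance of SNP j with the residual, adding back its own term.
            rho = sum(x * (r + x * weights[j]) for x, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam) / z
            # Update the residual for the change in this one weight.
            delta = new_w - weights[j]
            residual = [r - delta * x for r, x in zip(residual, col)]
            weights[j] = new_w
    return weights


# Centered allele counts (count minus 1) for six people and three SNPs.
snp0 = [-1, -1, 0, 0, 1, 1]   # the only SNP that affects the trait
snp1 = [1, -1, 1, -1, 1, -1]  # unrelated SNP
snp2 = [0, -1, 0, 0, 1, 0]    # correlated with snp0 but has no effect itself
trait = [2 * g for g in snp0]

print([round(pearson(s, trait), 3) for s in (snp0, snp1, snp2)])
print(lasso_fit([snp0, snp1, snp2], trait, lam=0.1))
```

In this toy case the one-at-a-time view shows a sizable correlation for the third SNP simply because it travels with the first, while the joint fit gives it a weight of exactly zero. Real predictors work with roughly a million SNPs and case/control labels, which this sketch does not attempt to reproduce.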

We think we can actually predict risk by doing this high-dimensional optimization. Initially, people just thought we were crazy. We wrote theoretical papers predicting how much data would you need to be able to accurately predict height or something like that.

We wrote those papers five or six years ago, and the GWAS people did not really get it. But then we actually produced predictors that worked and people had to take notice. Now, most of these major GWAS collaborations are turning towards genetic prediction as the main activity, not just discovering the 5,000th SNP associated with the disease, but actually seeing whether you can use all 5,000 SNPs together to predict risk.

There’s been a huge spike in these papers in the last year or two and the results are really impressive, to the point where a lot of health systems are becoming aware of it and realizing that, sooner rather than later, they may be deploying widespread, inexpensive genetic testing to predict risk for adults…

All this has been known to animal and plant breeders for a long time. The milk you drink comes from cows that have been created through artificial insemination… The U.S. Department of Agriculture maintains a database of a million cows from which the people in that field [of animal breeding] do the same kind of high-dimensional optimization that we do…

As Laurent and I and others were laboring to lay the foundations of this kind of work in human genetics, we were always shocked that I could talk to some geneticist and they would literally have no idea what people in the cattle or chicken or corn breeding world were doing. They already had functioning predictors a few years ago.

GEN: Are any of your methods proprietary or patented? Could any clinic follow your lead?

Hsu: To some degree, anybody can, and we have published all our algorithms. All these methods are available to outside groups. Some conceptual understanding of what is happening during the high-dimensional optimization is useful to have to do the analysis properly. But the basic algorithm and code are all out there.

GEN: Does Genomic Prediction offer the PRS testing as a first level test or is it only going to couples who are concerned about aneuploidy and other rare monogenic diseases?

Hsu: Our basic pipeline uses the same biopsy that is used for aneuploidy today, and that is done about a million times a year worldwide. We designed the process so that we can accept exactly that biopsy, so that no clinic would have to change any operations.

GEN: Who is going to decide what diseases and/or traits should be included in your panel?

Hsu: Good question. For diseases, it is pretty clear cut. There is a set of guidelines from ASRM (American Society for Reproductive Medicine). They have an ethics group and guidelines. The basic issue is whether there is a serious life impact from these diseases.

If there is a reliable, effective predictor for a disease that has a significant impact on your life, we’ll probably include it. You could say, how about psoriasis or eczema, acne, are those worthy to be included? We’ve not encountered it yet because those are not the “serious” diseases that we are focused on right now.

Away from diseases, we can predict cosmetic traits really well, so we can figure out who is blonde, redhead, blue eyes; who has light skin, dark skin… But even though there is high demand for it, we don’t plan to do cosmetic stuff at Genomic Prediction. The non-disease traits, like height, I think we only warn parents if the child is in danger of idiopathic short stature, which is a medical condition.

We follow the disease model for everything. Anything that we are putting warnings in the report about are things that are medically classified as diseases. We are just informing the physician or the genetic counselor that this embryo is at elevated risk for a particular disease or normal risk. But we don’t go beyond that.

GEN: Is that something the company is going to stick with forever?

Hsu: It is a tough question because obviously it is uncharted territory. For a company, there are all kinds of drivers—the science, ethics, medical health—that would impact the decisions, so I cannot really describe exactly what decision we would make.

But my feeling is the company should not get too far out ahead of what society is comfortable with… we want there to be a broad discussion in society about what people think is appropriate.

ASRM is very clear about what they think is appropriate. The U.K. has HFEA, a government-run regulatory board that regulates genetic testing in IVF, so every country has a different regulatory scheme. The difficult question may arise if some country decides that they are comfortable with something further out on the edge. For example, cosmetics are totally allowed in the United States. We chose not to do them.

In the future, it could be we have some huge customers, like big clinics in Korea or something and they are demanding that they want to be able to do cosmetics. They really want to know who has lighter colored skin and who has darker colored skin.

It will be a tough decision for us—suppose we have this huge customer that is ordering 100,000 tests a year from us and they really want this feature, which we can do and which is 100% legal in South Korea. What are we going to do? I’m not going to prejudge what we’re going to do but we certainly will not do things if they are not broadly acceptable in that society and obviously legal.

GEN: Your lab in New Jersey has CLIA certification but is there a regulatory body that would prevent you from offering enhancement selection for intelligence or other traits?

Hsu: Currently in the U.S., I think it is unregulated. There are the ASRM guidelines but they’re just guidelines, it is not a government body. In that sense, the U.S. might be less regulated than most other countries.

GEN: Even for common diseases like type 1 diabetes (T1D), some people would argue that the polygenic risk scores are not sound enough to be used on an individual basis—fine for population studies but not for one person. How would you respond?

Hsu: That is a basic question in statistics. You might say the fact that you got three speeding tickets last year does not imply that you’re at higher risk of an accident this year. Surely, we can only conclude that at a population level. The insurance actuary is concluding that three speeding tickets last year mean they should charge you 10% more for your policy.

That is population-level analysis. But when they issue the policy and they charge you 10% more, they’re making an individual decision. That jump is made all the time. That’s what statistics are used for.

Similarly, if I’m prescribing a certain cancer drug for you and I say, “I only have population-level data that says that people with this mutation do better with drug A than B; therefore, I am giving you drug A.” Some could say, “Hey, that’s crazy, man. You’re making an individual decision for an individual patient based on population-level data. That’s insane.” I just think it is a misunderstanding of statistics to make that argument.

What you should look at is the strength of the statistical evidence. In other words, if I can take the T1D predictor and go to a population that is totally different from the training population (maybe they were born in different decades or on a different continent) but I find very similar predictive power in the second population as in the training population, that is very strong evidence that the thing actually works…

To give you another example, the current best breast cancer predictor is a polygenic score that uses between 500 and 1,000 SNPs. A woman who scores in the top 10% on that predictor has a 33% lifetime probability of getting breast cancer. If you’re in the top 5% in risk score, you have over 50% of lifetime risk of breast cancer.

Now, you could say, “Wait, this is population-level data. Where did you get that 50% number?” Well, I looked at a group of women that were not involved in the training set and I looked at only the ones who had the top 5% risk score and then I calculated that yeah, half of them or two-thirds of them actually had breast cancer by the end of their lives. So that is how I got that probability estimate.

GEN: On the list of conditions screened by Genomic Prediction is low cognitive ability. How can one use a PRS to predict low cognitive ability?

Hsu: There was a very large GWAS involving 1.1 million people, published in Nature Genetics. Here is an interesting statistical result: When you test their predictor, or maybe a slightly improved version of their predictor, the correlation between the predicted IQ and the actual IQ of the person is about 0.3 to 0.4. It is mathematically very similar to the SAT’s ability to predict your performance in college.
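
To put a correlation of 0.3 to 0.4 in perspective, a short worked calculation may help. It assumes, as a simplification not stated in the interview, that predicted and actual scores are standardized (mean 0, standard deviation 1) and jointly normal; the symbols r, z, and \hat{z} are introduced here for illustration.

```latex
% r: correlation between predicted score \hat{z} and actual score z
% Share of variance in actual scores accounted for by the predictor:
r = 0.3 \;\Rightarrow\; r^2 = 0.09, \qquad r = 0.4 \;\Rightarrow\; r^2 = 0.16

% Expected actual score, and remaining spread, given a predicted score:
\mathbb{E}[\, z \mid \hat{z} \,] = r\,\hat{z}, \qquad
\operatorname{Var}[\, z \mid \hat{z} \,] = 1 - r^2
```

Under these assumptions, a predicted score two standard deviations below the mean corresponds to an expected actual score 0.6 to 0.8 standard deviations below the mean, with 84% to 91% of the variance between individuals left unexplained by the predictor.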

It is not perfect but in the same way that [a college] dean would have to have his arm twisted to admit some kid whose SAT score is way below average into the engineering college, in the same way the parents may deserve a warning if we find that an embryo has super elevated risk of intellectual disability… Intellectual disability is a well-defined medical condition and it is related to low IQ score.

GEN: Are there any scenarios under which you would screen for high IQ scores?

Hsu: We feel like society is not ready for it. I could be wrong. The Government of Singapore might come to us and say, “We want you to operate in Singapore and we want to alert parents if one of their embryos is likely to be well above average. We even took a poll and 86% of Singaporeans would do this if they were using IVF…”

Suppose they come to us and they say, “We are a society, we’ve thought about this and we want to do it.” You could imagine a scenario where we say okay, I guess in Singapore it is okay but we do not feel like Americans are ready for it.

That is a total hypothetical. We do not know how it is going to evolve. At the moment, we do not offer it.

GEN: Another concern is that as genetic screening becomes more and more entrenched in IVF in general, the wealthy would be getting rid of diseases, whereas people of lower socioeconomic status would not. Do you think that diseases are going to become more of a burden on the lower end of the spectrum?

Hsu: Yes. I am a center left person politically so I’m not against redistribution of wealth or income or genetic resources. In Israel and in the U.K., if you have fertility problems, IVF is covered under the national healthcare plan. Those countries have made different decisions about inequality than Americans have. There, I would hope someday that Genomic Prediction is a licensed vendor for the national healthcare system of those countries and we can provide this health risk screening for people undergoing free IVF in Israel and the U.K.

It is not necessarily going to exacerbate inequality. It depends on the country and the decisions people in those countries make about their healthcare systems. In the U.S., for sure in the short run it is probably going to exacerbate, to some degree, inequality. I don’t like that… But it is a much broader discussion. You’re talking about this corner case of people who are going through IVF, meanwhile we have people with no health insurance in the United States.

GEN: Who should be involved in these discussions?

Hsu: This is a very good question. Who in society should have a voice and how should it be decided how this new technology should be deployed? One model, prevalent in countries that have a single-payer national healthcare system—they are doing this right now in the U.K.—is to appoint a commission to look at standard of care genotyping for everyone.

And the commission will hear from the public and it will have a bunch of famous professors and doctors and, you know, clergymen, ethicists, philosophers appointed. After that discussion, the committee should make some recommendations to the government and then the system, the NHS (National Health Service) CEO and the health minister and others should decide whether to adopt those recommendations. I think that is how it should go.

In the U.S., I have no idea because you have a patchwork of private healthcare systems. You could get Congress to pass some laws regulating it. But the system I described, that is ongoing in Finland and the U.K. and other countries, is to me the right way to get to a rational decision on these things. I think that will happen in the next few years, both for IVF-related things and also, more broadly, use of genomics in healthcare.

This interview was edited for clarity and length. To read more about Genomic Prediction, please see the article “The Risky Business of Embryo Selection.”
