Although big data is a foundation of today’s healthcare industry, not all data analyses lead to actionable conclusions. Nonetheless, advanced analysis is crucial, especially given the increasing diversity of data and the growing volume of information from preclinical work to after-market studies.

Sema4, a patient-centered health intelligence company with 1,200 employees, including hundreds of biomedical researchers, data scientists, and clinicians, is developing ways to make the best use of that information and to transform healthcare through data-driven insights.

Sema4 collects and generates a wide range of health-related data. “We work with de-identified structured, unstructured, phenotypic, and genetic data from our four health system partners. We also generate a significant volume of genomic data ourselves through our genetic testing capabilities,” says Krish Ghosh, PhD, Sema4’s Chief Analytics Officer and General Manager of Biopharma Solutions.

However, just collecting and generating data is not enough. “It’s our mission to help find the signal within all the noise in our swaths of data,” says Aviva Beckmann, PhD, Principal Scientist at Sema4.

Although this kind of analysis can be applied to any disease, it is particularly useful for autoimmune diseases, which affect about 24 million people in the United States. One challenge in diagnosing these diseases is the lack of reproducible, actionable biomarkers associated with specific conditions. “There is a need to define groups of patients who are better suited to specific types of therapies, such as biologics or small-molecule inhibitors,” Beckmann explains.

Applying Sema4 Capabilities to Immune & Inflammatory Disease

Digging deep in the data

Finding clinically actionable insights in health-related information requires enormous amounts of data. “We have 46-plus petabytes of data, and that grows by more than one petabyte a month,” says Ediz Calay, PhD, Scientific Collaborations Manager. “Plus, we’ve already curated more than 12 million patients’ worth of data and more than 500,000 of these patients suffer from autoimmune and inflammatory diseases.”

Along with genomic and clinical data, Sema4’s technology analyzes electronic medical notes. “Using our automated abstraction engine and natural language processing, we can extract information from traditionally really messy information—those notes about patients,” says Phillip Comella, PhD, a research scientist at Sema4.

Among all these data, Sema4 scientists search for interactions represented by networks. These networks capture the complex molecular interactions underlying the data and can be used to better understand the cause of a disease and to develop improved treatments.

Directing data at treatments

With Sema4’s Centrellis® health information platform, biotechnology and pharmaceutical companies can use data-driven insights from drug discovery through clinical trials and even after-market studies. For example, these insights can be used to improve the standard of care.

Improving that standard of care, however, requires accurate data and analysis. “There, you cannot have an 80/20 rule because 80% accuracy is kind of fuzzy,” says Ghosh. “Our goal is to drive precision medicine–centered care by taking this dense data and pushing it to 95% accuracy or more.”

High accuracy is especially important in autoimmune diseases. For these chronic conditions, Sema4 curates longitudinal patient data. “Using that data, we can refine our models to better predict responses—not just outcomes, but also the disease onset and progression,” Beckmann explains. The results could supplement the entire drug development process by making sense of clinical and molecular data and by showing how to integrate multi-omics data. The outcome could be new treatments for autoimmune and many other diseases.

As Sema4’s work shows, creating those treatments depends on collecting large amounts of diverse data, from omics to a physician’s notes, and producing insight from that information with AI-driven analysis. Ghosh summarizes the process: “We bring all this data together and meaningfully connect it.”


