As good as the original? How AI-generated synthetic data can revolutionize health care

As you read this, someone you know may be in the hospital for an acute illness. Treatments for life-threatening illnesses are often based on a combination of existing protocols and staff experience.

And that’s great when hospitals are running smoothly and are adequately staffed.

But too often these days, hospitals are under-resourced. Staff are overworked and, despite these roadblocks, under pressure to provide the best care possible.

The answer to improving patient care often comes down to throwing money at the problem, a core cause of rising health care costs.

What can be done to lower the expense of patient care? One way is to use patient data to build AI models that better predict how a patient will respond to treatments, narrowing the choices to the best options for that patient.

Enter synthetic data. Think of synthetic data as artificial data generated by algorithms rather than real-world events. By creating this artificial body of data that is closely (or exactly) like the real thing, data scientists can develop digital twins – virtual models that behave like their real-world counterparts.

Synthetic data may be a lifesaver for hospitals and patients

A 2023 SAS Hackathon team from Syntho approached the problem by creating an artificial version of an original data set from a leading hospital to assess heart treatment effectiveness.

But is synthetic data as good as the real thing? That’s the question the team wanted to answer. They looked at the original data, identified the important variables and generated the synthetic data.

It turns out that the synthetic data was essentially identical to the original.
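
The post does not say how the team measured "essentially identical." One common way to check a single variable is to compare its distribution in the original and synthetic sets. The sketch below is only an illustration under loud assumptions: the numbers are made up, the "generator" is a toy that fits a normal distribution and samples from it (not Syntho's method), and the two-sample Kolmogorov-Smirnov statistic is just one possible similarity measure.

```python
import random
import statistics
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical cumulative distribution functions (0 = identical)."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


rng = random.Random(0)

# Hypothetical "original" variable, e.g. a lab measurement (made-up numbers).
original = [rng.gauss(120.0, 15.0) for _ in range(2000)]

# Toy generator (an assumption): fit a normal distribution, then sample from it.
mu = statistics.fmean(original)
sigma = statistics.stdev(original)
synthetic = [rng.gauss(mu, sigma) for _ in range(2000)]

# A deliberately poor "synthetic" set for contrast: same spread, shifted mean.
shifted = [rng.gauss(mu + 20.0, sigma) for _ in range(2000)]

ks_good = ks_statistic(original, synthetic)
ks_bad = ks_statistic(original, shifted)
print(f"KS original vs. synthetic: {ks_good:.3f}")
print(f"KS original vs. shifted:   {ks_bad:.3f}")
```

A value near 0 means the two distributions are hard to tell apart on that variable; the shifted set should score noticeably higher. A real comparison would also look at relationships between variables, not just one column at a time.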

With the synthetic data secured, the team created a model to predict real-world behavior. The team created more synthetic data sets for other hospitals and combined them to generate even better predictions.
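
The post gives no details about the model or how it was evaluated. A common sanity check for this kind of workflow is "train on synthetic, test on real": fit a model on synthetic records only, then score it on real records it has never seen. The sketch below assumes everything for illustration: the data are made up, `make_real` and `synthesize` are hypothetical helpers, and the "model" is a one-feature threshold rather than anything the team built.

```python
import random
import statistics


def make_real(rng, n_per_class):
    """Made-up 'real' records: one numeric feature and a binary outcome."""
    rows = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    rows += [(rng.gauss(2.0, 1.0), 1) for _ in range(n_per_class)]
    return rows


def synthesize(rng, rows, n_per_class):
    """Toy generator: fit a normal distribution per outcome class, then sample."""
    out = []
    for label in (0, 1):
        values = [x for x, y in rows if y == label]
        mu, sigma = statistics.fmean(values), statistics.stdev(values)
        out += [(rng.gauss(mu, sigma), label) for _ in range(n_per_class)]
    return out


def fit_threshold(rows):
    """Simplest possible model: cut halfway between the two class means.
    Assumes outcome 1 has the higher average feature value."""
    mean0 = statistics.fmean([x for x, y in rows if y == 0])
    mean1 = statistics.fmean([x for x, y in rows if y == 1])
    return (mean0 + mean1) / 2.0


def accuracy(threshold, rows):
    """Share of rows where 'feature above threshold' matches the outcome."""
    hits = sum((x > threshold) == (y == 1) for x, y in rows)
    return hits / len(rows)


rng = random.Random(42)
real_train = make_real(rng, 1000)
real_test = make_real(rng, 1000)  # held-out real records, never used for fitting
synthetic_train = synthesize(rng, real_train, 1000)

acc_real = accuracy(fit_threshold(real_train), real_test)
acc_synth = accuracy(fit_threshold(synthetic_train), real_test)
print(f"Trained on real, tested on real:      {acc_real:.3f}")
print(f"Trained on synthetic, tested on real: {acc_synth:.3f}")
```

If the two accuracies are close, the synthetic data has preserved the signal the model needs. A large gap would suggest the generator lost something important.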

Now the team wants to include data from more hospitals and expand the number of use cases for a variety of illnesses and conditions.

Bring your ideas and join us

Interested in other innovations from the SAS Hackathon?

Come for a visit to see what teams have been able to create – from concepts to fully working solutions. And think about how you can create the next big thing in your industry by joining us in 2024.

1 Comment

Jason DiNovi on September 5, 2023 1:48 pm

This is very exciting stuff. I'm curious how well synthetic data sets will do specifically when it comes to extreme class imbalance in labeled data. This could have a big impact on so many fields where the event being predicted is typically rare, from radiology interpretation to fraud detection on the claims adjudication side. So much effort has been put into undersampling and oversampling methodologies (as well as the addition of synthetic observations) over the years to tackle this. It's very possible we see a big leap in prediction accuracy in these domains through AI-generated data. Here's hoping.
