As you read this, someone you know may be in the hospital for an acute illness. Treatments for life-threatening illnesses are often based on a combination of existing protocols and staff experience.

And that’s great when hospitals are running smoothly and are adequately staffed.

But too often these days, hospitals are under-resourced. Staff are overworked and, despite these roadblocks, under pressure to provide the best care possible.

The answer to improving patient care often relies on throwing money at the problem – a core cause of rising health care costs.

What can be done to lower the cost of patient care? One way is to use patient data to build AI models that better predict how patients respond to treatments, narrowing the choices to the best options for each patient.

Enter synthetic data. Think of synthetic data as artificial data generated by algorithms rather than collected from real-world events. By creating an artificial body of data that closely resembles (or exactly matches) the real thing, data scientists can develop digital twins – virtual models that behave like their real-world counterparts.
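To make the idea concrete, here is a minimal sketch of an algorithm that generates artificial records. It is not the method any particular vendor uses: it assumes each numeric column can be described by a mean and a standard deviation, and it ignores relationships between columns. The column names (`age`, `systolic_bp`), the sample values and the helper functions are all made up for illustration.

```python
import random
import statistics

def fit_marginals(rows):
    """Estimate (mean, standard deviation) for every numeric column."""
    columns = rows[0].keys()
    return {
        col: (statistics.mean([r[col] for r in rows]),
              statistics.stdev([r[col] for r in rows]))
        for col in columns
    }

def sample_synthetic(params, n, seed=0):
    """Draw n artificial records, sampling each column independently."""
    rng = random.Random(seed)
    return [
        {col: rng.gauss(mu, sigma) for col, (mu, sigma) in params.items()}
        for _ in range(n)
    ]

# Hypothetical "real" records; the values are invented for this example.
real = [
    {"age": 54, "systolic_bp": 130},
    {"age": 61, "systolic_bp": 142},
    {"age": 47, "systolic_bp": 118},
    {"age": 70, "systolic_bp": 155},
    {"age": 58, "systolic_bp": 136},
    {"age": 66, "systolic_bp": 149},
]

params = fit_marginals(real)
synthetic = sample_synthetic(params, n=1000)
print(synthetic[0])
```

No synthetic record corresponds to an actual patient, yet the columns follow roughly the same distributions as the originals. Production tools go much further and also model how the columns relate to each other.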

Synthetic data may be a lifesaver for hospitals and patients

A 2023 SAS Hackathon team from Syntho approached the problem by creating an artificial version of an original data set from a leading hospital to assess heart treatment effectiveness.

But is synthetic data as good as the real thing? That’s the question the team wanted to answer. They looked at the original data, identified the important variables and generated the synthetic data.

It turns out that the synthetic data was essentially identical to the original.
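One common way to check a claim like this is to compare the distribution of each variable in the original and synthetic data. The sketch below is an assumption about how such a check could look, not the team's actual procedure. It computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distributions, where 0 means identical and 1 means no overlap) on simulated numbers.

```python
import bisect
import random
import statistics

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

rng = random.Random(1)

# Simulated stand-in for one variable in the original data set.
original = [rng.gauss(60, 8) for _ in range(500)]
mu, sigma = statistics.mean(original), statistics.stdev(original)

# A faithful synthetic copy, and a deliberately distorted one for contrast.
faithful = [rng.gauss(mu, sigma) for _ in range(500)]
shifted = [rng.gauss(mu + 10, sigma) for _ in range(500)]

ks_good = ks_statistic(original, faithful)
ks_bad = ks_statistic(original, shifted)
print("faithful copy:", round(ks_good, 3))
print("shifted copy: ", round(ks_bad, 3))
```

A small statistic for every important variable is one piece of evidence that the synthetic data mirrors the original. A fuller check would also compare the relationships between variables.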

With the synthetic data secured, the team created a model to predict real-world behavior. The team created more synthetic data sets for other hospitals and combined them to generate even better predictions.
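A simple way to test whether a model built on synthetic data predicts real-world behavior is to train on the synthetic records and score on held-out real ones. The sketch below illustrates that idea under loud assumptions: the "real" data is simulated, there is a single made-up feature, and the model is a basic nearest-centroid classifier rather than whatever the team actually built.

```python
import random
import statistics

def make_real(n, rng):
    """Simulated stand-in for real records: (feature, responded) pairs."""
    rows = []
    for _ in range(n):
        responded = rng.random() < 0.5
        # Made-up numbers: responders cluster near 3.0, others near 0.0.
        rows.append((rng.gauss(3.0 if responded else 0.0, 1.0), responded))
    return rows

def synthesize(rows, n, rng):
    """Fit one Gaussian per outcome class, then sample new labelled rows."""
    by_class = {True: [], False: []}
    for x, y in rows:
        by_class[y].append(x)
    p_true = len(by_class[True]) / len(rows)
    stats = {y: (statistics.mean(xs), statistics.stdev(xs))
             for y, xs in by_class.items()}
    out = []
    for _ in range(n):
        y = rng.random() < p_true
        mu, sigma = stats[y]
        out.append((rng.gauss(mu, sigma), y))
    return out

def fit_centroids(rows):
    """Nearest-centroid model: the mean feature value of each class."""
    return {y: statistics.mean([x for x, label in rows if label == y])
            for y in (True, False)}

def accuracy(centroids, rows):
    """Share of rows whose nearest centroid matches the true label."""
    def predict(x):
        return min(centroids, key=lambda y: abs(x - centroids[y]))
    return sum(predict(x) == y for x, y in rows) / len(rows)

rng = random.Random(42)
real_train = make_real(400, rng)
real_test = make_real(200, rng)
synthetic = synthesize(real_train, 400, rng)

acc_synth = accuracy(fit_centroids(synthetic), real_test)
acc_real = accuracy(fit_centroids(real_train), real_test)
print("trained on synthetic, tested on real:", acc_synth)
print("trained on real, tested on real:     ", acc_real)
```

If the two scores are close, the synthetic data has kept the signal the model needs. In this toy setup, combining synthetic sets from several hospitals would amount to concatenating the lists before calling `fit_centroids`.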

Now the team wants to include data from more hospitals and expand the number of use cases for a variety of illnesses and conditions.

Bring your ideas and join us

Interested in other innovations from the SAS Hackathon?

Come for a visit to see what teams have been able to create – from concepts to fully working solutions. And think about how you can create the next big thing in your industry by joining us in 2024.


About Author

Jeff Alford

Principal Editor

Jeff is a Principal Editor on the Thought Leadership, Editorial and Content team at SAS. He's a former journalist with more than 30 years of experience writing on a variety of topics and industries for companies in the high-tech sector. He has a master's degree in technical and professional writing and loves helping others improve their writing chops.

1 Comment

  1. This is very exciting stuff. I'm curious how well synthetic data sets will do specifically when it comes to extreme class imbalance in labeled data. This could have a big impact on so many fields where the event being predicted is typically rare - from radiology interpretation to fraud detection on the claims adjudication side. So much effort has been put into undersampling and oversampling methodologies (as well as the addition of synthetic observations) over the years to tackle this. It's very possible we'll see a big leap in prediction accuracy in these domains through AI-generated data. Here's hoping.
