Experimentation is the engine of innovation. Whether optimizing manufacturing processes, testing new materials, or simulating policy outcomes, the ability to run controlled experiments is essential.

Design of experiments (DOE) is a well-established statistical methodology that helps organizations systematically explore the relationships between variables and outcomes. However, traditional DOE has limitations, especially when real-world data is scarce, expensive or sensitive. This is where synthetic data can transform how we experiment, simulate and innovate.

What exactly is DOE?

Think of DOE as a structured approach to planning, conducting and analyzing experiments. Rather than testing one factor at a time, DOE enables simultaneous variation of multiple variables, allowing teams to uncover not only which inputs matter but also how they interact.
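To make this concrete, here's a minimal Python sketch of the idea: a two-level full factorial design in two factors, fit with a model that includes an interaction term. The factor names (temperature, pressure) and response values are invented purely for illustration.

```python
# Minimal sketch: a 2x2 full factorial design with two factors,
# fit with an interaction term. Factors and data are illustrative.
import itertools
import numpy as np

# Coded factor levels (-1 = low, +1 = high) for temperature and pressure
design = np.array(list(itertools.product([-1, 1], repeat=2)))

# Hypothetical measured responses for the four runs
y = np.array([50.0, 54.0, 57.0, 70.0])

# Model matrix: intercept, temperature, pressure, temperature x pressure
X = np.column_stack([np.ones(4), design[:, 0], design[:, 1],
                     design[:, 0] * design[:, 1]])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
for name, b in zip(["intercept", "temperature", "pressure", "interaction"], coef):
    print(f"{name:12s} {b:6.2f}")
# A nonzero interaction coefficient signals that the factors' effects
# are not simply additive, something one-factor-at-a-time testing misses.
```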

It’s widely applied across various sectors, including manufacturing, pharmaceuticals and the public sector, to support R&D efforts, optimize processes, enhance product quality and reduce costs.

Traditional DOE versus DOE with synthetic data

While traditional DOE remains valuable, it typically relies on real-world data collected through physical experiments or historical records. This approach introduces several challenges, including:

  • Experiments can be expensive and time-consuming.
  • Data may be incomplete, biased or unavailable.
  • Ethical or regulatory constraints may limit data collection.
  • Limited ability to simulate rare or extreme scenarios.

By contrast, DOE powered by synthetic data overcomes these limitations. Synthetic data is artificially generated to reflect the statistical properties of real data, as sketched below, and can be used to:

  • Generate large, diverse datasets that reflect real-world complexity.
  • Simulate edge cases and rare events.
  • Preserve privacy and support regulatory compliance.
  • Accelerate experimentation without requiring physical trials.
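As a simple illustration of the points above, the sketch below estimates the mean and covariance of a small "real" dataset, draws synthetic runs that share those statistics, and oversamples an invented edge case. A multivariate normal is the most basic stand-in for the deep generative models used in practice; every number here is made up.

```python
# Minimal sketch: generate synthetic data that preserves the mean and
# covariance of a small "real" dataset, then oversample an edge case.
import numpy as np

rng = np.random.default_rng(42)

# Pretend these are 30 real experimental runs with 3 measured variables
real = rng.normal(loc=[5.0, 100.0, 0.3], scale=[0.5, 8.0, 0.05], size=(30, 3))

mu = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Draw 5,000 synthetic runs with the same first- and second-order statistics
synthetic = rng.multivariate_normal(mu, cov, size=5000)

# Simulate a rare scenario by shifting the first variable three standard
# deviations high, a region barely represented in the real data
edge_cases = rng.multivariate_normal(mu + [3 * cov[0, 0] ** 0.5, 0, 0], cov, size=500)

augmented = np.vstack([synthetic, edge_cases])
print(real.shape, augmented.shape)                         # (30, 3) (5500, 3)
print(np.allclose(synthetic.mean(axis=0), mu, atol=0.5))   # means roughly match
```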

"Synthetic data is a game-changer for companies implementing AI solutions, especially in sectors with strict privacy regulations like health care and finance."
– Kathy Lange, Research Director, AI Software, IDC

From vision to innovation: A patented approach

The fusion of synthetic data with DOE isn't just conceptual; it's patent-pending.

At SAS, we’ve developed a novel framework for integrating deep learning with DOE to simulate broader design spaces using both historical and synthetic data. The method addresses real-world challenges, such as the inability to test all combinations physically or to access balanced datasets.

The core innovation lies in dynamically generating synthetic data that meets specific experimental needs, improving efficiency, reducing cost and expanding analytical reach. The framework, sketched in simplified form after the list below, enables:

  • Synthetic augmentation of sparse experimental data to improve statistical power.
  • Deep learning models trained to simulate response surfaces across complex design spaces.
  • Adaptive DOE algorithms that refine themselves in real time as new synthetic scenarios are analyzed.
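The sketch below illustrates the general pattern only, not SAS's patented implementation: an ensemble of simple surrogate models (standing in for deep learning models) is refit as data accumulates, and each new run is placed where the models disagree most. The test function, design space and all settings are invented.

```python
# Minimal sketch of an adaptive-DOE loop: a bootstrap ensemble of cubic
# surrogate models is refit as runs accumulate, and the next experiment
# is placed where the ensemble is most uncertain. The "true" process is
# a hidden test function, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

def process(x):
    """Hidden ground truth the surrogate tries to learn (illustrative)."""
    return np.sin(3 * x) + 0.5 * x**2 + rng.normal(0, 0.05, size=np.shape(x))

def features(x):
    return np.column_stack([np.ones_like(x), x, x**2, x**3])

# Start with a handful of "expensive" physical runs
x_obs = np.array([-2.0, -0.5, 1.0, 2.0])
y_obs = process(x_obs)

candidates = np.linspace(-2.5, 2.5, 201)   # design space to explore

for step in range(5):
    # Bootstrap ensemble: refit the surrogate on resampled data
    preds = []
    for _ in range(30):
        idx = rng.integers(0, len(x_obs), len(x_obs))
        coef, *_ = np.linalg.lstsq(features(x_obs[idx]), y_obs[idx], rcond=None)
        preds.append(features(candidates) @ coef)
    preds = np.array(preds)

    # Acquisition: run the next experiment where model uncertainty peaks
    next_x = candidates[np.argmax(preds.std(axis=0))]
    x_obs = np.append(x_obs, next_x)
    y_obs = np.append(y_obs, process(next_x))
    print(f"step {step}: next run at x = {next_x:+.2f}")
```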

This approach is especially impactful in industries like semiconductors, energy storage and precision manufacturing, where physical testing is costly and variable interactions are highly nonlinear.

By embedding AI into the experimental cycle, SAS empowers organizations to move quickly and confidently from idea to insight.

Applying DOE to a real-world example

Heat-Assisted Magnetic Recording (HAMR) is a next-generation data storage technology that uses localized heating to increase recording density on hard drives. It’s a leap forward for the industry, but it comes with a tough engineering puzzle.

To work reliably, HAMR requires precise control over the recording head's thermal profile. Too much heat in the wrong spot can destabilize the magnetic layer. Too little, and the density gains disappear. Engineers must also maintain magnetic stability, reduce thermal-induced stress and ensure consistent performance at high areal densities.

Traditionally, engineers run physical experiments to test different combinations of materials, laser powers and cooling mechanisms. These tests are costly, time-consuming and often insufficient for modeling rare failure modes or understanding complex interactions.

Where synthetic data proves valuable

To tackle this, engineers generated synthetic datasets that simulate the thermal behavior of HAMR systems under a wide range of conditions. These datasets were statistically representative of real-world measurements but included edge cases that would be difficult or impossible to capture physically.

These datasets were used to augment limited physical data, significantly enhancing model training and stability. Advanced predictive models built on this synthetically enhanced dataset demonstrated a 15% improvement in the overall desirability score – a metric that balances competing performance objectives such as thermal margin, write fidelity and device lifespan.
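For readers unfamiliar with desirability scoring, here is a minimal sketch in the Derringer-Suich style: each response is mapped onto [0, 1] and the individual scores are combined with a geometric mean. The response names echo the objectives above, but every value, target and limit is invented; this is not the model from the study.

```python
# Minimal sketch of an overall desirability score: map each response
# onto [0, 1], then combine with a geometric mean. All numbers are
# invented stand-ins for thermal margin, write fidelity and lifespan.
import numpy as np

def d_larger_is_better(y, low, high, s=1.0):
    """0 below `low`, 1 above `high`, smooth ramp in between."""
    return np.clip((y - low) / (high - low), 0, 1) ** s

responses = {
    "thermal_margin_C": (12.0, d_larger_is_better(12.0, low=5.0,  high=20.0)),
    "write_fidelity":   (0.96, d_larger_is_better(0.96, low=0.90, high=0.99)),
    "lifespan_khours":  (55.0, d_larger_is_better(55.0, low=30.0, high=80.0)),
}

ds = np.array([d for _, d in responses.values()])
overall = ds.prod() ** (1 / len(ds))   # geometric mean of individual scores

for name, (y, d) in responses.items():
    print(f"{name:18s} value={y:6.2f}  desirability={d:.2f}")
print(f"overall desirability = {overall:.2f}")
```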

This approach also revealed true variable importance and identified more accurate optimal set points through response surface optimization – insights that traditional DOE methods would likely miss. This means faster innovation cycles, lower testing costs and improved product reliability.
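And here is a minimal sketch of response surface optimization itself, assuming a full quadratic model in two coded factors: fit the surface to (invented) augmented data, then grid-search it for the optimal set point. In practice the surface would be fit to the synthetically augmented dataset, and the response could be the desirability score above.

```python
# Minimal sketch of response surface optimization: fit a full quadratic
# model in two coded factors, then locate the setting that maximizes
# the predicted response on a grid. The dataset is invented.
import numpy as np

rng = np.random.default_rng(7)

def quad_features(x1, x2):
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

# Invented augmented dataset over two coded factors in [-1, 1]
x1 = rng.uniform(-1, 1, 400)
x2 = rng.uniform(-1, 1, 400)
y = (1.0 + 0.8 * x1 - 0.5 * x2 - 0.9 * x1 * x2
     - 1.2 * x1**2 - 0.7 * x2**2 + rng.normal(0, 0.05, 400))

coef, *_ = np.linalg.lstsq(quad_features(x1, x2), y, rcond=None)

# Grid search the fitted surface for the optimal set point
g = np.linspace(-1, 1, 201)
g1, g2 = np.meshgrid(g, g)
pred = quad_features(g1.ravel(), g2.ravel()) @ coef
best = np.argmax(pred)
print(f"optimal set point: x1 = {g1.ravel()[best]:+.2f}, "
      f"x2 = {g2.ravel()[best]:+.2f}, predicted y = {pred[best]:.2f}")
```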

The future of experimentation

DOE remains a powerful methodology for structured experimentation, but its potential is far greater when paired with synthetic data. By enabling faster, safer and more comprehensive experimentation, synthetic data is unlocking a new frontier of innovation across industries, one where engineers and scientists can explore possibilities that were once too costly, risky or time-consuming to attempt. The result: better experiments and better products, delivered faster.

Learn more about synthetic data at SAS


Nate Cox also contributed to this blog post.

Nate Cox
Senior Solutions Architect, SAS

Nate has worked for SAS for nearly a decade, assisting customers with the technical use and adoption of the analytics lifecycle. He has supported a wide range of business verticals, including high-tech, aerospace and manufacturing, work that has cultivated a desire to apply sophisticated statistical techniques to traditional processes for maximum efficiency.


About Author

Nassim Rahimi

Principal Industry Consultant and Solutions Architect

Nassim Rahimi has vast expertise in semiconductors, optoelectronics, photonics and manufacturing analytics. She leads strategic initiatives across the product lifecycle – from design and R&D to manufacturing and testing. With more than a decade of experience helping global companies innovate, Nassim applies data science and simulation to solve complex engineering challenges and accelerate digital transformation.

