In my article on Buffon's needle experiment, I showed a graph that converges fairly nicely and regularly to the value π, which is the value that the simulation is trying to estimate. This graph is, indeed, a typical graph, as you can verify by running the simulation yourself. However, notice
Search Results: simulation (461)
Buffon's needle experiment for estimating π is a classical example of using an experiment (or a simulation) to estimate a probability. This example is presented in many books on statistical simulation and is famous enough that Brian Ripley in his book Stochastic Simulation states that the problem is "well known
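The estimate described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the article's own SAS program: the function name `estimate_pi_buffon`, its parameters, and the seed are hypothetical, and the sketch assumes the needle is no longer than the spacing between the lines, so that the crossing probability is 2L/(πd).

```python
import math
import random


def estimate_pi_buffon(n_needles, needle_len=1.0, line_gap=1.0, seed=12345):
    """Estimate pi by simulating Buffon's needle (assumes needle_len <= line_gap)."""
    if needle_len > line_gap:
        raise ValueError("this sketch assumes needle_len <= line_gap")
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_needles):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0.0, line_gap / 2.0)
        # Acute angle between the needle and the lines.
        # Note: sampling the angle uses math.pi, a known shortcut in simulations.
        theta = rng.uniform(0.0, math.pi / 2.0)
        if center <= (needle_len / 2.0) * math.sin(theta):
            crossings += 1
    if crossings == 0:
        raise ValueError("no crossings observed; use more needles")
    p_hat = crossings / n_needles
    # P(cross) = 2L / (pi * d), so pi is estimated by 2L / (d * p_hat).
    return 2.0 * needle_len / (line_gap * p_hat)


if __name__ == "__main__":
    print(estimate_pi_buffon(200_000))
```

Because the estimate is a ratio that depends on a sample proportion, it varies from run to run; changing the seed gives a sense of that variability.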
Last week I presented the GSR algorithm, a statistical model of a riffle shuffle. In the model, a deck of n cards is split into two parts according to the binomial distribution. Each piece has roughly n/2 cards. Then cards are dropped from the two stacks according to the number
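The shuffle model in this excerpt can be sketched as follows. The excerpt is cut off before it states the drop rule, so the sketch assumes the standard Gilbert-Shannon-Reeds rule: the next card comes from a packet with probability proportional to the number of cards remaining in that packet. The function name `gsr_riffle` is hypothetical, and Python is used here only for illustration.

```python
import random


def gsr_riffle(deck, rng):
    """Return one riffle shuffle of `deck` under the (assumed) GSR model."""
    n = len(deck)
    # Cut point ~ Binomial(n, 1/2): each packet has roughly n/2 cards.
    k = sum(rng.random() < 0.5 for _ in range(n))
    left, right = deck[:k], deck[k:]
    i = j = 0
    shuffled = []
    while i < len(left) or j < len(right):
        a = len(left) - i    # cards remaining in the left packet
        b = len(right) - j   # cards remaining in the right packet
        # Take from the left packet with probability a / (a + b).
        if rng.random() < a / (a + b):
            shuffled.append(left[i])
            i += 1
        else:
            shuffled.append(right[j])
            j += 1
    return shuffled


if __name__ == "__main__":
    rng = random.Random(2011)
    print(gsr_riffle(list(range(1, 11)), rng))
```

Each packet keeps its internal order, so a single riffle of a sorted deck interleaves at most two increasing runs.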
If you haven't signed up for SAS Global Forum 2011 in Las Vegas, you'd better get moving: February 28 is the last day for early registration and the discounted hotel prices. You should also sign up for the pre-conference statistical tutorials, which are filling up fast! I was tempted to
When the SAS Global Forum 2020 conference was cancelled by the global COVID-19 pandemic, I felt sorry for the customers and colleagues who had spent months preparing their presentations. One presentation I especially wanted to attend was by Bucky Ransdell and Randy Tobias: "Introducing PROC SIMSYSTEM for Systematic Nonnormal Simulation".
Data science continues to be a pivotal force driving innovation across industries. From enhancing customer experiences to optimizing operational efficiencies, the role of data science is expanding, bringing with it new challenges and opportunities. This article explores the emerging trends and technologies that are shaping the future of data science
A remarkable result in probability theory is the "three-sigma rule," which is a generic name for theorems that bound the probability that a univariate random variable will appear near the center of its distribution. This article discusses the familiar three-sigma rule for the normal distribution, a less-familiar rule for unimodal
This article shows how to simulate data from a Poisson regression model, including how to account for an offset variable. If you are not familiar with how to run a Poisson regression in SAS, see the article "Poisson regression in SAS." A Poisson regression model is a specific type of
An article published in Nature has the intriguing title, "AI models collapse when trained on recursively generated data." (Shumailov, et al., 2024). The article is quite readable, but I also recommend a less technical overview of the result: "AI models fed AI-generated data quickly spew nonsense" (Gibney, 2024). The Gibney
A SAS analyst ran a linear regression model and obtained an R-square statistic for the fit. However, he wanted a confidence interval, so he posted a question to a discussion forum asking how to obtain a confidence interval for the R-square parameter. Someone suggested a formula from a textbook (Cohen,
Batch manufacturing involves producing goods in batches rather than in a continuous stream. This approach is common in industries such as pharmaceuticals, chemicals, and materials processing, where precise control over the production process is essential to ensure product quality and consistency. One critical aspect of batch manufacturing is the need to manage and understand inherent time delays that occur at various stages of the process.
A SAS statistical programmer recently asked a theoretical question about statistics. "I've read that 'p-values are uniformly distributed under the null hypothesis,'" he began, "but what does that mean in practice? Is it important?" I think data simulation is a great way to discuss the conditions for which p-values are
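A small simulation makes the quoted claim concrete. The sketch below is an assumption-laden illustration rather than the article's own code: it uses a two-sided z test with known unit variance on normal data, and the function name `simulate_null_pvalues` and its defaults are hypothetical. If the null hypothesis is true and the test's assumptions hold, about 5% of the p-values should fall below 0.05 and about half below 0.5.

```python
import math
import random
from statistics import NormalDist


def simulate_null_pvalues(n_samples=5000, n=20, seed=1):
    """Simulate p-values of a two-sided z test when H0: mu = 0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pvals = []
    for _ in range(n_samples):
        # Draw a sample of size n from N(0, 1), so H0 holds exactly.
        xbar = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = xbar * math.sqrt(n)  # known sigma = 1
        pvals.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return pvals


if __name__ == "__main__":
    p = simulate_null_pvalues()
    for cutoff in (0.05, 0.25, 0.5):
        frac = sum(v < cutoff for v in p) / len(p)
        print(f"fraction of p-values below {cutoff}: {frac:.3f}")
```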
In a recent article, I graphed the PDF of a few Beta distributions that had a variety of skewness and kurtosis values. I thought that I had chosen the parameter values to represent a wide variety of Beta shapes. However, I was surprised to see that the distributions were all
The moment-ratio diagram is a tool that is useful when choosing a distribution that models a sample of univariate data. As I show in my book (Simulating Data with SAS, Wicklin, 2013), you first plot the skewness and kurtosis of the sample on the moment-ratio diagram to see what common
A SAS programmer wanted to simulate samples from a family of Beta(a,b) distributions for a simulation study. (Recall that a Beta random variable is bounded with values in the range [0,1].) She wanted to choose the parameters such that the skewness and kurtosis of the distributions varied over a range of
Imagine a not-so-distant future where quantum computing reshapes our approach to solving some of businesses’ and society’s most pressing issues. This isn’t fantasy. Just as nuclear energy, 3D printing and gene therapy transitioned from science fiction to scientific reality, quantum computing is on the brink of becoming the next transformative
As the old saying goes, “You wait ages for a bus and then two [or possibly three] come along at once.” This saying can be updated to reflect life in our increasingly digital world: “You wait ages for a genuine disruptive technology and then two [or possibly three] arrive simultaneously.” This phrase
Careers in risk management can be rewarding. The disciplines are key to a broad range of industries. Risk management teases the analytical side of the brain and there is a clear line of contribution between the work and the organization's performance. Careers in risk management are also shrouded in mystery
Rare diseases, often called orphan diseases, affect a small percentage of the population. Despite their rarity, these diseases collectively impact millions worldwide. As a health care professional who cares deeply about overall patient care, I find that the challenges in diagnosing and treating rare diseases resonate profoundly with me. Limited data availability, dispersed
In statistical quality control, practitioners often estimate the variability of products that are being produced in a manufacturing plant. It is important to estimate the variability as soon as possible, which means trying to obtain an estimate from a small sample. Samples of size five or less are not uncommon
I read a journal article in which a researcher used a formula for the probability density function (PDF) of the sample correlation coefficient. The formula was rather complicated, and presented with no citation, so I was curious to learn more. I found the distribution for the correlation coefficient in the
Organizations continuously search for innovative ways to optimize their operations and elevate efficiency. One promising frontier is the integration of digital twins for predictive maintenance. However, the true potential of this technology often remains untapped, with many organizations settling for what can be described as “digital shadows.” In this exploration,
The realm of augmented reality and mixed reality (AR/MR) is on the brink of a significant evolution, promising to reshape how we engage with technology. Augmented reality involves the overlay of digital information onto the real world, enriching our perception of the environment by adding virtual elements. This technology enhances
Statistical software provides methods to simulate independent random variates from continuous and discrete distributions. For example, in the SAS DATA step, you can use the RAND function to simulate variates from continuous distributions (such as the normal or lognormal distributions) or from discrete distributions (such as the Bernoulli or Poisson).
Many executives may be feeling supply chain anxiety. The pace of disruptions – weather events, aggressive marketing techniques, transportation bottlenecks, and more – remains high, and there are already signs of consumer uncertainty in 2024. To manage the unknown and keep supply chains strong, some organizations are employing intelligent, real-time
A previous article shows how to use Monte Carlo simulation to approximate the sampling distribution of the sample mean and sample median. When x ~ N(0,1) are normal data, the sample mean is also normal, and there are simple formulas for the expected value and the standard error of the
An elementary course in statistics often includes a discussion of the sampling distribution of a statistic. The canonical example is the sampling distribution of the sample mean. For samples of size n that are drawn from a normal distribution (X ~ N(μ, σ)), the sample mean is normally distributed as
The world’s largest rugby tournament returns for the knockout stages. This blog post explores how probability and simulation can be used to predict likely winners in each of the knockout stages. Team sports are dynamic, time-varying and complex topics to model. When modeling regular competitions, such as domestic leagues, it
Learn how the %FiniteHMM macro can automatically pre-process input data as well as post-process output tables for finite Hidden Markov Models (HMMs) using PROC HMM.
Could lithium, copper, nickel and magnesium become more valuable than oil and gas? The World Bank expects the demand for these materials to increase by 500% by 2050. Known as critical raw materials (CRMs), they are hard to replace and are essential in our transition to renewable energy. Solar panels