I often use the SAS/IML language for simulating data with certain known properties. In fact, I'm writing a book called Simulating Data with SAS. When I simulate repeated measurements (sometimes called replicated data), I often want to generate an ID variable that identifies which measurement is associated with which subject
No matter what statistical programming language you use, be careful of testing for an exact value of a floating-point number. This is known in the world of numerical analysis as "10.0 times 0.1 is hardly ever 1.0" (Kernighan and Plauger, 1974, The Elements of Programming Style). There are many examples
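As a minimal sketch of the pitfall described in this excerpt (not code from the original post), the following DATA step adds 0.1 ten times and then compares the result to 1 in two ways: with an exact test and with a tolerance. The variable names and the tolerance of 1e-12 are assumptions chosen for illustration.

```sas
/* Sketch: exact comparison versus tolerance-based comparison.
   The names total, exact, close and the tolerance 1e-12 are
   hypothetical choices for this illustration. */
data _null_;
   total = 0;
   do i = 1 to 10;
      total = total + 0.1;   /* 0.1 has no exact binary representation */
   end;
   exact = (total = 1);                /* 1 if exactly equal, else 0 */
   close = (abs(total - 1) < 1e-12);   /* 1 if equal within the tolerance */
   put total= best32. exact= close=;   /* write the results to the log */
run;
```

On platforms that use IEEE double-precision arithmetic, the accumulated rounding error usually makes the exact test fail while the tolerance test succeeds. A suitable tolerance depends on the scale of the values being compared.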
A reader wrote for help with a computational problem. He has a vector of length N and the vector contains integer values in the range [1, 120], which represent months for which events occurred over a 10-year period. The question is: what is the 24-month period for which the most
This is a third post on newspaper stories that I recently read. Today's post deals with science, politics, and rising sea levels. Incidentally, the title is a blatant reference to John Allen Paulos's brilliant book, A Mathematician Reads the Newspaper. Senate approves law that challenges sea-level science The NC legislature
This is my second post on some newspaper articles that I recently read. Today's post deals with academic fraud. Questions linger in academic fraud case Over the past year, the News and Observer has occasionally reported on a scandal at the University of North Carolina at Chapel Hill in which
This past weekend was Father's Day, so I took some time to relax and read the newspaper. I found several stories that suggested interesting statistical questions. Unfortunately, the data are not available for analysis. Nevertheless, the stories are worth sharing. Over the next few days, I'll post my thoughts on
To celebrate special occasions like Father's Day, I like to relax with a cup of coffee and read the newspaper. When I looked at the weather page, I was astonished by the seeming uniformity of temperatures across the contiguous US. The weather map in my newspaper was almost entirely yellow
A colleague who works with time series sent me the following code snippet. He said that the calculation was overflowing and wanted to know if this was a bug in SAS: data A(drop=m); call streaminit(12345); m = 2; x = 0; do i = 1 to 5000; x = m*x
"Help! My simulation is taking too long to run! How can I make it go faster?" I frequently talk with statistical programmers who claim that their "simulations are too slow" (by which they mean, "they take too long"). They suspect that their program is inefficient, but they aren't sure why.
I recently read a blog post in which a SAS user had to rename a bunch of variables named A1, A2, ..., A10, such as are contained in the following data set: /* generate data with variables A1-A10 */ data A; array A[10] A1-A10 (1); do i = 1 to 10;