Learn SAS
Jim Simon
Random Sampling: What's Efficient?

Suppose you wish to select a random sample from a large SAS dataset. No problem. The PROC SURVEYSELECT step below randomly selects a 2 percent sample:

    proc surveyselect data=large out=sample
         method=srs   /* simple random sample */
         n=1000000;   /* sample size */
    run;

Do you have a SAS/STAT license? If not,

Internet of Things
Stuart Rose
Flipping the data equation

Big Data has become a technology buzzword. But how is Big Data changing insurance? Historically, insurance companies have used SMALL data to make BIG decisions. Today, insurers are using BIG data for SMALL decisions. What does this mean? Traditionally, insurance companies have aggregated data to group risks into broad categories

Data Visualization
Sanjay Matange
Broken Axis Redux

Often when the data includes some extreme difference in measures or some outliers, the plot of the data points can get skewed due to the need to accommodate the extreme outliers.  The bulk of the observations get squeezed into a smaller region of the plot.  While this may be useful

Rick Wicklin
Mathematical art: Weaving matrices

An artist friend of mine recently created a beautiful abstract image and described the process on her blog. She says that "after painting my initial square, I cut it into strips and split them down the middle, then wove them together.... I had no idea when I started piecing these

Learn SAS
Jim Simon
Reading Hierarchical Data - Part 3

This post is the third and final in a series that illustrates three different solutions to "flattening" hierarchical data.  Don't forget to catch up with Part 1 and Part 2. Solution 2, from my previous post, created one observation per header record, with detail data in a wide format, like

Data Management
Jim Harris
Data quality to "DI" for

There is a time and a place for everything, but the time and place for data quality (DQ) in data integration (DI) efforts always seem to be something no one is quite sure about. I have previously blogged about the dangers of waiting until the middle of DI to consider, or become forced

Learn SAS
Jim Simon
Reading hierarchical data - Part 2

This post is the second in a series that illustrates three different solutions to "flattening" hierarchical data. Solution 1, from my previous post, created one observation per header record, summarizing the detail data with a COUNT variable, like this: Summary Approach: One observation per header record   Obs Family Count
