Have you ever run a statistical test to determine whether data are normally distributed? If so, you have probably used Kolmogorov's D statistic. Kolmogorov's D statistic (also called the Kolmogorov-Smirnov statistic) enables you to test whether the empirical distribution of data is different from a reference distribution. The reference distribution
Search Results: sgplot (964)
You might have seen in the news that US exports of natural gas to Europe are up 300%. And we recently crossed the threshold where we export more natural gas than we import. This seems like a momentous occasion, and worthy of a graph! But first, let me make sure
I always recommend looking at data in several different ways to get a more complete mental picture. And when the data is changing over time, one great way to view it is using an animation. Follow along for some tips & tricks to animate your own data over time. I'll
Did you know that SAS provides built-in support for working with probability distributions that are finite mixtures of normal distributions? This article shows examples of using the "NormalMix" distribution in SAS and describes a trick that enables you to easily work with distributions that have many components. As with all
The CUSUM test has many incarnations. Different areas of statistics use different assumptions and test different hypotheses. This article presents a brief overview of CUSUM tests and gives an example of using the CUSUM test in PROC AUTOREG for autoregressive models in SAS. A CUSUM test uses the cumulative
This article demonstrates the ODS Excel destination’s flexibility and how you can modify its default behavior by using the SHEET_INTERVAL= option.
While we're on the topic of mortgage rates, let's explore another technique for plotting and comparing the rate data over several years. Last time, we plotted each year's data in a separate graph, and paneled them across the page. This time, let's overlay multiple years together in the same graph.
By using data provided by a Game of Thrones fan, we use SAS to look at screen time for scene locations and characters in this crazy popular show.
I think every course in exploratory data analysis should begin by studying Anscombe's quartet. Anscombe's quartet is a set of four data sets (N=11) that have nearly identical descriptive statistics but different graphical properties. They are a great reminder of why you should graph your data. You can read about
SAS Global Forum 2019 (SGF) is rapidly approaching. Which of the hundreds of presentations are you planning to attend? No matter what types of analyses you perform with SAS software, you'll most likely want to present your findings in a really nice, informative graph! Therefore, I highly recommend
I recently saw an interesting graph that showed the number of motor vehicle crash deaths has been going down. The graph showed deaths per mile. That's a good statistic, but I wondered whether there were other ways to look at the data. An Interesting Graph: Here's the graph, from an
During the year 2020, many countries and areas will be conducting their decennial census and making projections to estimate what their population will be in the future. Therefore, I decided to dust off one of my old SAS/GRAPH samples based on the 2010 census, and rewrite it using more modern
Flooding has been in the news the past few days, and that makes me want to analyze some data! I hang out at Jordan Lake (here in central North Carolina) a lot, so I decided to download the data for that lake, and do a graphical analysis. If you're interested
An analyst was using SAS to analyze some data from an experiment. He noticed that the response variable is always positive (such as volume, size, or weight), but his statistical model predicts some negative responses. He posted the data and asked if it is possible to modify the graph so
Here in the US, there's a lot of talk about the flu each year. First, people discuss whether or not to get the flu shot. Then there are discussions about whether or not you or your friends have the flu (or something else). Then the discussions about what strain of
Statisticians often emphasize the dangers of extrapolating from a univariate regression model. A common exercise in introductory statistics is to ask students to compute a model of population growth and predict the population far in the future. The students learn that extrapolating from a model can result in a nonsensical
US farmers grow a lot of food ... but did you know some of them also grow fuel for our vehicles? Follow along and you'll learn how much fuel they grow, and also learn some tips about plotting this type of data! These days most gasoline in the US has
It's time to celebrate Pi Day! Every year on March 14th (written 3/14 in the US), math-loving folks celebrate "all things pi-related" because 3.14 is the three-digit approximation to the mathematical constant, π. Although children learn that pi is approximately 3.14159..., the actual definition of π is the ratio of
According to the most recent data, the child poverty rate in China is 33.1% - the rate in Denmark is 2.9%. Where do other countries fall in between these two extremes? Let's build a graph and find out! (or, if you're not interested in the code - jump to the
A previous article shows how to use a scatter plot to visualize the average SAT scores for all high schools in North Carolina. The schools are grouped by school districts and ranked according to the median value of the schools in the district. For the school districts that have many
Standardized tests like the SAT and ACT can cause stress for both high school students and their parents, but according to a Wall Street Journal article, the SAT and ACT "provide an invaluable measure of how students are likely to perform in college and beyond." Naturally, students wonder how their
Box plots are a great way to compare the distributions of several subpopulations of your data. For example, box plots are often used in clinical studies to visualize the response of patients in various cohorts. This article describes three techniques to visualize responses when the cohorts have a nested or
I have written several blog posts about longevity, and here is another one related to that topic. Cardiovascular disease (CVD) is one of the more common causes of death, and I was wondering how those numbers have changed over time. Are fewer people dying from CVD, or are more people
"Maybe if we think and wish and hope and pray / It might come true. / Oh, wouldn't it be nice?" (The Beach Boys) Months ago, I wrote about how to use the EFFECT statement in SAS to perform regression with restricted cubic splines. This is the modern way to use splines
Beginning with SAS® 9.4, you can embed graphics output within HTML output using the ODS HTML5 destination. This technique works with SAS/GRAPH® procedures (such as GPLOT and GCHART), SG procedures (such as SGPLOT and SGRENDER), and when you create graphics output with ODS Graphics enabled. Most (if not all) existing
With the US census coming in 2020, I've decided to sharpen my skills at graphing census data. And today I'm working on creating a population pyramid chart to analyze the age and gender distribution. Follow along if you'd like to see how to create such a chart ... or jump
I previously discussed how you can use validation data to choose between a set of competing regression models. In that article, I manually evaluated seven models for a continuous response on the training data and manually chose the model that gave the best predictions for the validation data. Fortunately, SAS
Machine learning differs from classical statistics in the way it assesses and compares competing models. In classical statistics, you use all the data to fit each model. You choose between models by using a statistic (such as AIC, AICC, SBC, ...) that measures both the goodness of fit and the
Many areas of the US are experiencing record low unemployment. This is great at the national level, and also great at a personal level (for example, I now have fewer unemployed friends asking to borrow money!) But just how low is the US unemployment rate, and how does it compare
As we're approaching the anniversary of Hans Rosling's passing, I fondly remember his spectacular graphical presentations comparing the wealth and health of nations around the world. He certainly raised the bar for data visualization, and his animated charts inspired me to work even harder to create similar visualizations! What better way