A previous article discusses the geometry of weighted averages and shows how choosing different weights can lead to different rankings of the subjects. As an example, I showed how college programs might rank applicants by using a weighted average of factors such as test scores. "The best" applicant is determined
Search Results: sgplot (964)
Many cities have Open Data pages. But once you download the data, what can you do with it? This is my second blog post where I download several datasets from Cary, NC's open data page, and give you a few ideas to get you started on your own data
What happens if you need to edit graph output files from SAS in a different application (for example, Microsoft Word)? It is not recommended that you edit your SAS graph output outside of SAS, but, if you must do so, you need to create your graphics output as EMF (Enhanced Metafile Format) graph output.
In a previous article, I discussed a beautiful painting called "Phantom’s Shadow, 2018" by the Nigerian-born artist, Odili Donald Odita. I noted that if you overlay a 4 x 4 grid on the painting, then each cell contains a four-bladed pinwheel shape. The cells display rotations and reflections of the pinwheel. The
Art evokes an emotional response in the viewer, but sometimes art also evokes a cerebral response. When I see patterns and symmetries in art, I think about a related mathematical object or process. Recently, a Twitter user tweeted about a painting called "Phantom’s Shadow, 2018" by the Nigerian-born artist, Odili
Some claim that deaths in the US have been increasing, and some claim they have been decreasing. Which do you think is correct? Let's take a look at the data ... The Data Here in the US, the Centers for Disease Control and Prevention is a good/official source of data
How long do dogs live? ... That's a good/tough question. Some live longer than others, but what are the determining factors? Let's throw some data at this problem, and see if we can fetch some answers! But before we get started, how about a random picture to get you into
After my recent articles on simulating data by using copulas, many readers commented about the power of copulas. Yes, they are powerful, and the geometry of copulas is beautiful. However, it is important to be aware of the limitations of copulas. This article creates a bizarre example of bivariate data,
Who's to say that 'north' should always be at the top of a map? Perhaps in certain situations, you might want 'south' (or some other direction) to be at the top. Perhaps you're one of our crazy Australian customers who looks at the world a little differently. Well, whatever the
This article shows how to estimate and visualize a two-dimensional cumulative distribution function (CDF) in SAS. SAS has built-in support for this computation. Although the bivariate CDF is not used as much as the univariate CDF, the bivariate version is still a useful tool in understanding the probable values of
This article uses simulation to demonstrate the fact that any continuous distribution can be transformed into the uniform distribution on (0,1). The function that performs this transformation is a familiar one: it is the cumulative distribution function (CDF). A continuous CDF is defined as an integral, so the transformation is
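The idea in the excerpt above can be tried with a minimal sketch. It is written in Python rather than SAS and is not the code from the article. The exponential distribution, the sample size, the seed, and the helper name `pit_exponential` are all assumptions chosen for illustration. The sketch draws exponential variates, applies the exponential CDF F(x) = 1 - exp(-rate*x) to each draw, and compares the sample mean and variance of the result with the values 1/2 and 1/12 expected for the uniform distribution on (0,1).

```python
import math
import random


def pit_exponential(n, rate=1.0, seed=12345):
    """Draw n exponential variates and push each one through its own CDF.

    The function name, the default rate, and the seed are illustrative choices.
    """
    rng = random.Random(seed)  # local generator, so the run is reproducible
    xs = [rng.expovariate(rate) for _ in range(n)]
    # Exponential CDF: F(x) = 1 - exp(-rate * x)
    return [1.0 - math.exp(-rate * x) for x in xs]


u = pit_exponential(10_000)
mean_u = sum(u) / len(u)
var_u = sum((v - mean_u) ** 2 for v in u) / (len(u) - 1)

# For U(0,1), the mean is 1/2 and the variance is 1/12.
print(f"sample mean     = {mean_u:.4f}  (uniform: 0.5)")
print(f"sample variance = {var_u:.4f}  (uniform: {1 / 12:.4f})")
```

A histogram of `u` should look roughly flat. The same check works for any continuous distribution whose CDF you can evaluate.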
Having earned the Eagle Scout rank in Boy Scouts, I am of course very conservation-minded, and against polluting. I'm also an avid boat paddler and fisherman, and therefore I'm especially concerned about pollution in our rivers, lakes, and oceans. I even volunteered for a week to help survey coral reefs
I've got this buddy, Carter Johnson - he's a little bit crazy, but a lot of fun to follow... He holds/held several different long-distance paddling world records, and was one of the coaches for the group that paddled kayaks from Cuba to the US (see my blog post). A few
You've probably seen a population pyramid, such as this one I showed in a previous blog post. But let's scrutinize population pyramids a bit deeper, with an eye on special features that can make them even more useful! I was inspired to give population trees a second look by this
It is well known that classical estimates of location and scale (for example, the mean and standard deviation) are influenced by outliers. In the 1960s, '70s, and '80s, researchers such as Tukey, Huber, Hampel, and Rousseeuw advocated analyzing data by using robust statistical estimates such as the median and the
When data contain outliers, medians estimate the center of the data better than means do. In general, robust estimates of location and scale are preferred over classical moment-based estimates when the data contain outliers or are from a heavy-tailed distribution. Thus, instead of using the mean and standard deviation of
A previous article discusses the definition of the Hoeffding D statistic and how to compute it in SAS. The letter D stands for "dependence." Unlike the Pearson correlation, which measures linear relationships, the Hoeffding D statistic tests whether two random variables are independent. Dependent variables have a Hoeffding D statistic
Cindy Wang's curiosity about the Mandelbrot set led her to draw one using SAS Visual Analytics.
There are many statistics that measure whether two continuous random variables are independent or whether they are related to each other in some way. The most well-known statistic is Pearson's correlation, which is a parametric measure of the linear relationship between two variables. A related measure is Spearman's rank correlation,
Ranking is a fundamental concept in statistics. Ranks of univariate data are used by statisticians to estimate statistics such as percentiles (quantiles) and empirical distributions. A more advanced use is to compute various rank-based measures of correlation or association between pairs of variables. For example, ranks are used to compute
Have you ever brought home a piece of furniture-in-a-box, and felt undue stress while trying to make sense of the directions to assemble it? ... Apparently you're not alone! A recent analysis studied ~50,000 tweets about IKEA furniture, and determined whether the people posting the tweets were frustrated. They then
Most introductory statistics courses introduce the bar chart as a way to visualize the frequency (counts) for a categorical variable. A vertical bar chart places the categories along the horizontal (X) axis and shows the counts (or percentages) on the vertical (Y) axis. The vertical bar chart is a precursor
A previous article discusses how to interpret regression diagnostic plots that are produced by SAS regression procedures such as PROC REG. In that article, two of the plots indicate influential observations and outliers. Intuitively, an observation is influential if its presence changes the parameter estimates for the regression by "more
This article shows how to use PROC SGPLOT in SAS to create the scatter plot shown to the right. The scatter plot has the following features: The colors of markers are determined by the value of a third variable. The outline of each marker is the same color (such as
Linear programming (LP) and mixed integer linear programming (MILP) solvers are powerful tools. Many real-world business problems, including facility location, production planning, job scheduling, and vehicle routing, naturally lead to linear optimization models. Sometimes a model that is not quite linear can be transformed to an equivalent linear model to reduce
Here is an interesting math question: How many reduced fractions in the interval (0, 1) have a denominator less than 100? The question is difficult because of the word "reduced." If we only care about the total number of fractions in (0,1) whose denominator is less than 100, we
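One way to explore the question in the excerpt above is a short Python sketch. It is not the approach taken in the article, and the function names are hypothetical. It counts the reduced fractions in two ways: by brute force, testing gcd(p, q) = 1 for every numerator p below each denominator q, and by summing Euler's totient function, since the number of reduced fractions in (0,1) with denominator q is the number of integers in 1..q-1 that are coprime to q.

```python
from math import gcd


def count_reduced_fractions(max_denom):
    """Brute force: count p/q in (0,1) in lowest terms with 2 <= q <= max_denom."""
    return sum(
        1
        for q in range(2, max_denom + 1)
        for p in range(1, q)
        if gcd(p, q) == 1
    )


def totient(n):
    """Euler's totient of n, computed by trial-division factorization."""
    result = n
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # one prime factor larger than sqrt(n) remains
        result -= result // m
    return result


# "Denominator less than 100" means q = 2, ..., 99.
brute = count_reduced_fractions(99)
via_totient = sum(totient(q) for q in range(2, 100))
print(brute, via_totient)
```

The two counts should agree. The brute-force version is easier to trust, while the totient sum scales better to large denominators.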
This is another in my series of blog posts where I take a deep dive into converting customized R graphs into SAS graphs. Today we'll be working on bar charts ... And to give you a hint about what data I'll be using this time, here's a picture from a SAS
A SAS customer wanted to compute the cumulative distribution function (CDF) of the generalized gamma distribution. For any continuous distribution, the CDF is the integral of the probability density function (PDF), which usually has an explicit formula. Accordingly, he wanted to compute the CDF by using the QUAD function in
This is my Pi Day post for 2021. Every year on March 14th (written 3/14 in the US), geeky mathematicians and their friends celebrate "all things pi-related" because 3.14 is the three-decimal approximation to pi. Most years I write about lower-case pi (π), which is the ratio of a circle's
I recently learned about a new feature in PROC QUANTREG that was added in SAS/STAT 15.1 (part of SAS 9.4M6). Recall that PROC QUANTREG enables you to perform quantile regression in SAS. (If you are not familiar with quantile regression, see an earlier article that describes quantile regression and provides