All Posts

Advanced Analytics
Rick Wicklin
What is Mahalanobis distance?

I previously described how to use Mahalanobis distance to find outliers in multivariate data. This article takes a closer look at Mahalanobis distance. A subsequent article will describe how you can compute Mahalanobis distance. Distance in standard units: In statistics, we sometimes measure "nearness" or "farness" in terms of the

Data Visualization
Sanjay Matange
Beer, diapers and heat map

The parable of beer and diapers is often related when teaching data mining techniques. Whether fact or fiction, a heat map is useful for viewing the claimed associations. A co-worker recently enquired about possible ways to display associations or dependency between variables. One option is to show the dependency as a node

Advanced Analytics
Waynette Tubbs
Friday's Innovation Inspiration - Big data

Big data is one of the hottest topics in business. When you hear those words - BIG Data - you almost surely think of: HUGE financial services firms scoring terabytes of historical and current risk data; GLOBAL telecommunications companies mining petabytes of structured and unstructured data; INTERNATIONAL retailers repricing hundreds of thousands of products across

Analytics
Melissa Savage
Analytics helping transportation officials get the job done in tight financial times

The American Association of State Highway and Transportation Officials recently released a top 10 list of transportation issues that will be “talked, written or tweeted and legislated about” in 2012. As expected, funding constraints and Congressional action on reauthorization appear on the list, but the group also notes that natural disaster

Advanced Analytics
Rick Wicklin
Use the Cholesky transformation to correlate and uncorrelate variables

A variance-covariance matrix expresses linear relationships between variables. Given the covariances between variables, did you know that you can write down an invertible linear transformation that "uncorrelates" the variables? Conversely, you can transform a set of uncorrelated variables into variables with given covariances. The transformation that works this magic is

Data Visualization
Sanjay Matange
Comparative density plots

Recently a user posted a question on the SAS/GRAPH and ODS Graphics Communities page on how to plot the normal density curves for two classification levels in the same graph. We have often seen examples of a distribution plot of one variable using a histogram with normal and kernel density curves. Here is a simple example: Code Snippet:
