Data Analysis Archives

Analytics | Learn SAS

Rick Wicklin | November 18, 2024

The correlation between two sets of variables

In a correlation analysis, it is common to consider the correlations between all pairs of numerical variables. That is, if there are k numerical variables, most people examine the complete k x k matrix of correlations. This matrix is symmetric and has 1s on the diagonal, so more than half of the

Advanced Analytics | Learn SAS

Rick Wicklin | November 11, 2024

Introducing PROC SIMSYSTEM in SAS Viya

When the SAS Global Forum 2020 conference was cancelled because of the COVID-19 pandemic, I felt sorry for the customers and colleagues who had spent months preparing their presentations. One presentation I especially wanted to attend was by Bucky Ransdell and Randy Tobias: "Introducing PROC SIMSYSTEM for Systematic Nonnormal Simulation".

Analytics | Programming Tips

Rick Wicklin | October 21, 2024

The correlogram: Visualize correlations by fitting angles

A common way to visualize the sample correlations between many numeric variables is to display a heat map that shows the Pearson correlation for each pair of variables, as shown in the image to the right. The correlation is a number in the range [-1, 1], where -1 indicates perfect

Learn SAS | Programming Tips

Rick Wicklin | September 30, 2024

Programming the formulas for an ANOVA in SAS

In practice, there is no need to remember textbook formulas for the ANOVA test because all modern statistical software will perform the test for you. In SAS, the ANOVA procedure is designed to handle balanced designs (the same number of observations in each group) whereas the GLM procedure can handle

Learn SAS | Programming Tips

Rick Wicklin | September 9, 2024

The location of ticks in statistical graphics

Modern software for statistical graphics automatically handles many details and graph defaults, such as the range of the axes and the placement of tick marks. In the days of yore, these details required tedious manual calculations. Think about what is required to place ticks on a scatter plot. On the

Learn SAS | Programming Tips

Rick Wicklin | September 4, 2024

Is a value in a vector? Use the ELEMENT function

In SAS, DATA step programmers use the IN operator to determine whether a value is contained in a set of target values. Did you know that the SAS IML language provides similar functionality? The ELEMENT function in the SAS IML language is similar to the IN operator

Analytics | Learn SAS | Programming Tips

Rick Wicklin | July 15, 2024

Isotonic regression: An application of quadratic optimization

Isotonic regression (also called monotonic regression) is a type of regression model that assumes that the response variable is a monotonic function of the explanatory variable(s). The model can be nondecreasing or nonincreasing. Certain physical and biological processes can be analyzed by using an isotonic regression model. For example, a

Learn SAS | Programming Tips

Rick Wicklin | June 24, 2024

Teaching an AI assistant to read and write SAS IML vectors

One of the most exciting features of SAS Viya Workbench is that the code editor includes a generative AI component called SAS Viya Copilot. This feature was announced and demonstrated at SAS Innovate 2024. With the Copilot, you can specify a text prompt that generates SAS code. For example, you

Analytics | Data Visualization

Rick Wicklin | June 19, 2024

Scale a density curve to match a histogram

This article discusses how to scale a probability density curve so that it fits appropriately on a histogram, as shown in the graph to the right. By definition, a probability density curve is scaled so that the area under the curve equals 1. However, a histogram might show counts or

Analytics | Data Visualization | Learn SAS

Rick Wicklin | June 17, 2024

A bootstrap confidence interval for an R-square statistic

A previous article discusses a formula for a confidence interval for R-square in a linear regression model (Olkin and Finn, 1995, "Correlations redux", Psychological Bulletin). The formula is useful for large data sets but should be used with caution for small samples. At the end of the previous article, I
