I've previously written about how to generate a sequence of evenly spaced points in an interval. Evenly spaced data is useful for scoring a regression model on an interval.
In the previous articles the endpoints of the interval were hard-coded. However, it is common to want to evaluate a function on the interval [min(x), max(x)], where x is an observed variable in a data set. That is easily done in the DATA step by first running a PROC SQL call that puts the minimum and maximum values into macro variables. The macro variables can then be used to generate the evenly spaced values. For example, the following statements generate 101 points within the range of the Weight variable in the Sashelp.Cars data set:
/* Put min and max into macro variables */
proc sql noprint;
   select min(Weight), max(Weight) into :min_x, :max_x
   from Sashelp.Cars;
quit;

/* Create data set of evenly spaced points */
data ScoreX;
   do Weight = &min_x to &max_x by (&max_x-&min_x) / 100;   /* min(X) to max(X) */
      output;
   end;
run;
See the article "Techniques for scoring a regression model in SAS" for various SAS procedures and statements that can score a regression model on the points in the ScoreX data set.
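One caveat with a fractional BY increment is that the DATA step adds the increment repeatedly, so floating-point rounding can, for some data, shift the last value slightly or drop it. A minimal sketch of an alternative, assuming the same Sashelp.Cars example, loops over an integer index and computes each point directly. The names ScoreX2, n, dx, and i are illustrative and do not appear in the original program.

```sas
/* Put min and max into macro variables (same step as above) */
proc sql noprint;
   select min(Weight), max(Weight) into :min_x, :max_x
   from Sashelp.Cars;
quit;

/* Sketch: integer loop index, so the number of points does not depend on rounding */
data ScoreX2;
   n  = 100;                          /* number of subintervals => n+1 points */
   dx = (&max_x - &min_x) / n;        /* spacing between adjacent points */
   do i = 0 to n;
      Weight = &min_x + i*dx;         /* i-th point, computed directly from i */
      output;
   end;
   keep Weight;
run;
```

Because the loop counts integers, the data set always contains n+1 = 101 observations, and each value is within rounding error of its intended position.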
You can also use this technique to compute four macro variables (the minimum and maximum of each of two variables) and use them to generate a uniformly spaced two-dimensional grid of values.
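As a rough sketch of the grid idea, the following statements assume that the second variable is Horsepower in Sashelp.Cars and that a 21 x 21 grid is wanted. The macro variable names min_y and max_y, the data set name ScoreGrid, and the grid size are assumptions made for illustration.

```sas
/* Put the min and max of both variables into four macro variables */
proc sql noprint;
   select min(Weight), max(Weight), min(Horsepower), max(Horsepower)
      into :min_x, :max_x, :min_y, :max_y
   from Sashelp.Cars;
quit;

/* Sketch: nested loops create every (Weight, Horsepower) pair on the grid */
data ScoreGrid;
   nx = 20;  ny = 20;                 /* subintervals in each direction */
   do i = 0 to nx;
      Weight = &min_x + i*(&max_x - &min_x)/nx;
      do j = 0 to ny;
         Horsepower = &min_y + j*(&max_y - &min_y)/ny;
         output;                      /* one observation per grid point */
      end;
   end;
   keep Weight Horsepower;
run;
```

The resulting data set has (nx+1)*(ny+1) = 441 observations and can be scored in the same way as a one-dimensional set of points.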
If you are doing the computation in the SAS/IML language, no macro variables are required because you can easily compute the minimum and maximum values as part of the program:
proc iml;
use Sashelp.Cars;
   read all var "Weight" into x;
close;
w = do(min(x), max(x), (max(x)-min(x))/100);   /* min(X) to max(X) */
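If the points are needed in a data set rather than only in a SAS/IML vector, a sketch along the following lines could write them out. It builds the points from an integer index instead of the DO function, which is a substitution on my part, and the data set name ScoreX_IML is hypothetical.

```sas
proc iml;
use Sashelp.Cars;
   read all var "Weight" into x;
close;

n = 100;                                          /* number of subintervals */
Weight = min(x) + (0:n) * (max(x)-min(x)) / n;    /* row vector with n+1 elements */
Weight = T(Weight);                               /* transpose to a column vector */

create ScoreX_IML var {"Weight"};                 /* write the vector to a data set */
append;
close ScoreX_IML;
quit;
```

The vector must be named Weight so that the variable in the output data set matches the name of the explanatory variable in the model being scored.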
6 Comments
hi rick
How about evenly spaced permutations (for whatever metric you like), for fast "exact" calculations?
If you have 18 or fewer points, you can use the ALLPERM function to generate all permutations and do exact tests. Otherwise, you'll have to use the RANPERM function to generate lots of random permutations and approximate the sampling distribution. Both of these functions are discussed in the article "Generate permutations in SAS." In case you wonder why 18 is the cutoff value, see "On the number of permutations supported in SAS software."
I was thinking in terms of a systematic sample for larger n, which is what your example is for one dimension. Think of a large counterbalanced (e.g., Williams design) Latin square for, say, n=100 or 1000. Ideally, I would like a solution where every permutation is within some bounded distance of the sampled points. In my retirement, I'm pursuing an interest in exact distributions.
Sounds like a question for the SAS Statistical Procedures Support Community. Maybe PROC OPTEX?