~Contributed by Philip Busby, Applications Developer at SAS (@Philihp)~
My mind was blown just now at Paul Gorrell's talk on numeric values within SAS. The nice thing that hooks new programmers to SAS is how easy it is to do so many things, but what I find really makes a master SAS programmer is knowing what it is exactly that happens behind the curtain. Paul's talk offered a glimpse into that when it comes to numeric variables.
One way to reduce the size of a dataset is to reduce the size of the numeric variables, but in picking the correct size, one must consider not only the range of the observations for that variable, but also the precision. You may have seen this table before - it's a little deceptive.
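|Length in Bytes|Largest Integer Represented Exactly|
|3|8,192|
|4|2,097,152|
|5|536,870,912|
|6|137,438,953,472|
|7|35,184,372,088,832|
|8|9,007,199,254,740,992|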
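As a rough illustration of where limits like these come from, here is a small Python sketch (not SAS code). It assumes the IEEE 754 double layout that SAS uses on Windows and UNIX hosts: 1 sign bit, 11 exponent bits, and whatever is left of the stored bytes for the mantissa. The function name `largest_exact_integer` is made up for this example.

```python
# Assumption: IEEE 754 doubles, where the first 12 bits of the stored bytes
# hold the sign and exponent and the remaining bits hold the mantissa.
SIGN_AND_EXPONENT_BITS = 12


def largest_exact_integer(length_in_bytes: int) -> int:
    """Largest N such that every integer from 0 to N is stored exactly."""
    stored_mantissa_bits = 8 * length_in_bytes - SIGN_AND_EXPONENT_BITS
    # The implicit leading 1 bit adds one extra bit of precision.
    return 2 ** (stored_mantissa_bits + 1)


for length in range(3, 9):
    print(f"{length} bytes: {largest_exact_integer(length):,}")
```

Under that assumption a length of 3 gives 8,192 and a length of 8 gives 9,007,199,254,740,992. Mainframe hosts use a different floating-point format, so their limits differ.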
It's possible to store values higher than those in the table, but keep in mind that precision is lost above these limits. The loss may not be immediately obvious, because during DATA step processing numerics are expanded to a full 8 bytes; the shorter length applies only when the value is written out to the data set.
    data ds;
      length n1 n2 3;
      n1 = 32768;
      n2 = 32769;
      put n1= n2=;
    run;

    n1=32768 n2=32769

but then reading back in those values...
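    data _null_;   /* no output data set is needed just to print */
      set ds;
      put n1= n2=;
    run;

    n1=32768 n2=32768

which wouldn't be expected unless one understands what's going on behind the scenes. [On IEEE 754 hosts such as Windows and UNIX, a 3-byte numeric keeps only 12 mantissa bits, so near 32768 it can hold only multiples of 8; 32769 is therefore stored as 32768, the nearest representable value.]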
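The stored value can be reproduced outside SAS with a small Python emulation. This is only a sketch under two assumptions: the host uses IEEE 754 doubles, and shortening a numeric amounts to keeping its high-order bytes and dropping the rest. The helper name `store_with_length` is hypothetical.

```python
import struct


def store_with_length(value: float, length_in_bytes: int) -> float:
    """Emulate storing a double in fewer bytes by zeroing the low-order bytes."""
    raw = struct.pack(">d", value)        # big-endian: sign and exponent come first
    kept = raw[:length_in_bytes]          # the bytes that would be written out
    padded = kept + b"\x00" * (8 - length_in_bytes)
    return struct.unpack(">d", padded)[0]


for n in (8192, 8193, 32768, 32769, 32776):
    print(n, "->", int(store_with_length(float(n), 3)))
```

With a length of 3, 8193 and 32769 come back as 8192 and 32768, while 32776 survives because it is a multiple of 8. Whether the dropped bits are truncated or rounded, 32769 lands on 32768 either way.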
Read Gorrell's paper, Numeric Length in SAS®: A Case Study in Decision Making, for more about this subject.
These code examples are provided as is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.