Jack Shostak is the Associate Director of Statistics at the Duke Clinical Research Institute. A SAS user since 1985, Jack has two SAS books under his belt with a third on the way. This week's SAS tip is from Shostak's SAS Programming in the Pharmaceutical Industry.
The following excerpt is from SAS Press author Jack Shostak and his book "SAS Programming in the Pharmaceutical Industry." Copyright © 2005, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. (Please note that results may vary depending on your version of SAS software.)
Defining Variables Once
One of the primary reasons for creating analysis data sets is to have variable derivations in a single place. If a variable is defined in a single analysis data set, then the following are true:
- The variable is defined consistently.
- Any reviewer of the data can easily find the derivation.
- Any programmer can easily maintain the derivation.
- The derivation can be readily verified.
Some statistical programmers or statisticians may advocate placing new variable derivations within individual analysis or summary programs. I believe this is a practice that should be avoided. Imagine if you wanted to change the derivation of a single variable but you had to search 100 programs to find it and subsequently change it 100 times. Also, if the variable is defined 100 times, then the odds that it is defined the same way across 100 programs are low. Finally, if the derivation is stored once in a permanent analysis data set, then it can be verified easily. The same cannot be said of a variable that is derived in a summary program that disappears from memory when the SAS program terminates.
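As a rough illustration of the idea above (not code from the book), the sketch below derives an age category once in an analysis data set and then has a summary step read the stored variable rather than re-deriving it. The library name ADATA, the data sets DEMOG and ADSL, and the variables SUBJID, AGE, TRT, and AGECAT are all hypothetical. So that the sketch runs anywhere, the ADATA library is pointed at the WORK folder; in practice it would presumably point at a permanent study directory.

```sas
/* Hypothetical library: points at the WORK folder only so this sketch runs */
/* anywhere. A real study would use a permanent analysis-data directory.    */
libname adata "%sysfunc(pathname(work))";

/* Hypothetical source data standing in for raw demographics. */
data work.demog;
    input subjid $ age trt $;
    datalines;
101 54 A
102 67 B
103 . A
104 71 A
;
run;

/* Derivation program: the only place AGECAT is defined. */
data adata.adsl;
    set work.demog;
    length agecat $5;
    if missing(age) then agecat = ' ';
    else if age < 65 then agecat = '<65';
    else agecat = '>=65';
    label agecat = 'Age Category';
run;

/* Summary program: uses the stored AGECAT and does not re-derive it. */
proc freq data=adata.adsl;
    tables trt * agecat / missing;
run;
```

If the age cutoff ever changed, only the DATA step that creates ADATA.ADSL would need editing, and every summary program reading AGECAT would pick up the new definition.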
Read a free chapter from the book and user reviews here. And sign up to receive notification when Jack Shostak's forthcoming book Implementing CDISC Using SAS: An End-to-End Guide becomes available. You can also view his previously featured SAS tip - Using PROC FREQ to export descriptive statistics.
1 Comment
Thanks Jack for this tip! I totally agree.
Sunil