According to the Daily Writing Tips blog, describing a thing as "somewhat unique" is bad form. Unique means "one of a kind", so either it is or it is not. The famous example (which the style police will use to chide you) is that something can't be "somewhat unique" any more than a woman can be "somewhat pregnant".
But when programming and dealing with computers, we often deal with the concept of "unique enough". And to fill that niche, we have Globally Unique Identifiers, or GUIDs. (Most people pronounce this "GOO-ids".) These are also called UUIDs, for "universally unique identifiers". (But the Universe is pretty big, isn't it? Therefore using this term makes a pretty bold claim.)
A GUID is a 128-bit value, allowing for 2^128 (or about 3.4028237e+38) possibilities. Computer processes generate GUIDs behind the scenes all of the time for lots of purposes, and among geeks we joke that they will become a scarce resource, like helium. But rest assured: it will take us a long time to go through all of the possible GUID values.
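To put "a long time" in perspective, here is a rough back-of-the-envelope sketch in SAS. The rate of one billion GUIDs per second is an assumption chosen purely for illustration, and the variable names are made up:

```sas
data _null_;
  total = 2**128;                          /* number of distinct 128-bit values */
  per_second = 1e9;                        /* ASSUMED generation rate, for illustration only */
  seconds_per_year = 60 * 60 * 24 * 365;   /* ignoring leap years */
  years = total / (per_second * seconds_per_year);
  put total= e12. years= e12.;             /* write both figures to the log */
run;
```

Even at that assumed pace, the supply would last on the order of 10^22 years.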
When we see GUID values in the wild, they are usually expressed as a series of 32 hexadecimal characters, often containing hyphens like a crazy sort of phone number (as if you might attempt to commit it to memory):
efb40385-6b5c-4e7f-9f19-1daeb7e97ed9
I created the above GUID fresh for this blog (it's never been seen before!) using a tool on my PC called uuidgen, which is part of the Windows SDK.
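The layout is easy to check in code. The following is a minimal sketch, assuming the 8-4-4-4-12 grouping seen in the example above; the variable names are made up for illustration:

```sas
data _null_;
  guid = 'efb40385-6b5c-4e7f-9f19-1daeb7e97ed9';
  /* Remove the hyphens and count what is left: 32 hex digits x 4 bits = 128 bits */
  hex_only = compress(guid, '-');
  n_hex = length(hex_only);
  /* Case-insensitive pattern for the 8-4-4-4-12 hyphenated layout */
  is_guid = prxmatch('/^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/i', strip(guid));
  put n_hex= is_guid=;
run;
```

PRXMATCH returns the position of the match, so any value greater than zero means the string fits the pattern.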
In the SAS world, I can use the UUIDGEN function to create these as needed from within my SAS program:
data wastedGuids;
  do x=1 to 10;
    guid = uuidgen();
    output;
  end;
run;
The UUIDGEN function relies on code libraries that are system-dependent, as there are different algorithms for creating GUID values. When I run the above program on a Linux version of SAS, I get a different pattern of results:
I suppose -- now that I've promoted this function on the blog -- we'll have SAS programmers cranking GUIDs out day-and-night. At this rate, will there be any GUIDs left for our grandchildren? Well, I can't worry about that -- live for today, I say.
Bonus joke: How do you catch a unique rabbit? (Reply in the comments, please.)
14 Comments
You’ve got me hooked! I’ll take a couple of 100M of these GUIDS. Forget about incrementing keys when upserting tables. You’ve just added a new requirement for DIS R&D.
Your joke though is a true challenge. I’ve looked it up on Internet but my English is not good enough to get the answer (at least I hope it’s my English).
Thanks for another of your great blogs.
I had to search for the punchline to the joke, and will leave it for others to find.
I've been wasting UUIDs for years - we have a macro that sends SAS log output to a file using Proc Printto. As well as a macro variable to describe the process being logged (_out), the log file names contain the first eight characters returned from a call to uuidgen() so that re-running code doesn't replace log files generated from earlier runs:
Eight hex characters give me unique enough filenames and let me 'neek up on any pesky errors.
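A minimal sketch of that log-naming idea might look like the following. The process name, the `runid` macro variable, and the `/tmp/logs` directory are all hypothetical, and the directory is assumed to exist already:

```sas
%let _out = nightly_load;                          /* hypothetical process name */
%let runid = %substr(%sysfunc(uuidgen()), 1, 8);   /* first eight hex characters */

/* Route the log to a file whose name includes the run id */
proc printto log="/tmp/logs/&_out._&runid..log" new;
run;

/* ... the code whose log should be captured goes here ... */

/* Restore the default log destination */
proc printto;
run;
```

Eight hex characters cover 32 bits, or roughly 4.3 billion possible values, which is plenty for telling a handful of reruns apart.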
You'd catch a unique rabbit the same way as any rabbit: they are all unique; unless they've developed cloning in the burrows.
unique up on it.
Hurrah! Correct!
Chris, thank you so much. This works great for generating surrogate keys for my tables.
Thanks Chris, I used the info in your post to create a somewhat unique tablename:
How do you catch a tame rabbit?
The tame way.
You don't catch a unique rabbit.
You catch an exceptional one.
More of a C#, C++ joke.
Well, you can Try. Ha!
I've tried your method and guess what... it works! Thank you very much Chris!
When I use this, the length automatically is set to 200 for the guid field. Just curious as to why?
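200 is the default length that SAS gives a character variable when it is first assigned from a function result like this one and no length has been declared. If I added a LENGTH statement, we could constrain that: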
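As a minimal sketch of that change, assuming the hyphenated form shown in the post (32 hex digits plus 4 hyphens, so 36 characters), the earlier program could declare the length up front:

```sas
data wastedGuids;
  length guid $ 36;   /* 32 hex digits plus 4 hyphens */
  do x = 1 to 10;
    guid = uuidgen();
    output;
  end;
run;
```

The LENGTH statement has to come before the first assignment to guid. Otherwise the variable's length is already fixed by the time SAS reaches it.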
Thanks!